Adaptive Query Compilation for Stream Processing (MA4662/5-2)

PI: Volker Markl (TU Berlin), Steffen Zeuch (DFKI Berlin)
Project Collaborator: Philipp M. Grulich
Project website: https://nebula.stream

Over the last decades, the requirements of data processing workloads have changed significantly. Nowadays, real-time analytics requires the execution of long-running queries over unbounded, continuously changing, high-velocity data streams. Common stream processing engines (SPEs) such as Flink and Storm scale out execution to achieve high throughput and low latency. However, recent research revealed that these SPEs cannot fully utilize available hardware resources.
First, they do not take the particular hardware resources into account for optimization. Second, they do not take changing data characteristics into account, which hinders a variety of adaptive optimizations. Third, they rely heavily on user-defined functions, which introduce a high processing overhead due to data serialization and transfer.
In this project, we address these challenges to enable efficient processing of complex stream processing pipelines on modern hardware. To this end, we propose a novel adaptive query compiler for stream processing that optimizes code with regard to the hardware resources and changing data characteristics. Furthermore, we study how to embed complex user-defined functions into compiled pipelines efficiently, in order to support a wide range of advanced analytical data processing workloads.