“Multi-Level Elasticity for Wide-Area Data Streaming Systems: A Reinforcement Learning Approach” (2018)
Summary: In this article, the authors describe their approach to adaptively scaling data stream queries in and out. They apply a reinforcement learning strategy that continuously improves during query execution, leading to close-to-optimal scaling decisions for long-running queries.
“Partitioning functions for stateful data parallelism in stream processing” (2014)
Summary: This paper compares different partitioning functions for stream processing with respect to properties such as balance, compactness, and migration and computation costs.
“StreamCloud: A Large Scale Data Streaming System” (2010)
Summary: This paper describes the streaming system StreamCloud, which provides various parallelization strategies for data stream queries. It allows the processing of high volumes of data on shared-nothing clusters by scaling out efficiently.