Transactional Stream Processing on Non-Volatile Memory (SA 782/28)

PI: Kai-Uwe Sattler (TU Ilmenau)
Project collaborators: Philipp Götze (TU Ilmenau), Constantin Pohl (TU Ilmenau)
Project website: Transactional Stream Processing on Non-Volatile Memory

Transactional stream processing combines the paradigm of data stream processing
with traditional transaction processing in databases to address novel
application domains such as cyber-physical systems, Internet of Things, and
healthcare analytics. In these applications, continuous streams of data have to
be processed and often multiple sources (both streams and stored data) are
combined, e.g. for enriching stream data or correlating streams with historical
data. Furthermore, stream data represent a source for sequences of inserts or
updates on persistent data. Thus, the goal is to combine data streams with
transactional data while providing guarantees such as ACID, exactly-once and
ordered execution. In this context, the big challenges are the (time-restricted)
storage of data stream elements and efficient access to the then-persistent
data, while still providing real-time processing of streaming data.

With this proposal we plan to address these challenges of transactional stream
processing by exploiting opportunities of modern hardware technology – in
particular non-volatile memory (NVRAM) which promises byte-addressable and
persistent storage with latencies close to DRAM. These features should allow
for high transaction rates and easy durability guarantees. However, NVRAM comes
with higher write latency (compared both to DRAM and to its own read latency)
and requires atomic write mechanisms to ensure consistency in the presence of a
CPU cache hierarchy.
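
The need for atomic write mechanisms can be illustrated with a minimal sketch.
It is not the project's design: a file-backed memory map stands in for NVRAM,
and mmap.flush() stands in for the cache-line flush and fence instructions that
real persistent memory would need. The class PersistentLog and its record layout
are hypothetical. The idea shown is write ordering: the payload is made durable
first, and only then is a small header updated to publish it, so a crash between
the two steps leaves no half-written record visible.

```python
import mmap
import os
import struct
import tempfile

HEADER = struct.Struct("<Q")   # number of committed bytes (8-byte header)
LENGTH = struct.Struct("<I")   # length prefix in front of every record


class PersistentLog:
    """Append-only log; a record is visible only after the header is updated."""

    def __init__(self, path, capacity=4096):
        is_new = not os.path.exists(path)
        self._file = open(path, "w+b" if is_new else "r+b")
        if is_new:
            self._file.truncate(capacity)  # zero-filled: committed length is 0
        self._map = mmap.mmap(self._file.fileno(), 0)

    def committed(self):
        return HEADER.unpack_from(self._map, 0)[0]

    def _write_payload(self, payload):
        # Step 1: write the record behind the committed region and flush it.
        start = HEADER.size + self.committed()
        record = LENGTH.pack(len(payload)) + payload
        end = start + len(record)
        if end > len(self._map):
            raise ValueError("log is full")
        self._map[start:end] = record
        self._map.flush()              # assumption: stand-in for flush + fence
        return end - HEADER.size

    def _publish(self, new_committed):
        # Step 2: publish the record by updating the small header, then flush.
        self._map[0:HEADER.size] = HEADER.pack(new_committed)
        self._map.flush()

    def append(self, payload):
        self._publish(self._write_payload(payload))

    def records(self):
        # Recovery view: read only what lies inside the committed region.
        out, pos = [], HEADER.size
        end = HEADER.size + self.committed()
        while pos < end:
            (n,) = LENGTH.unpack_from(self._map, pos)
            pos += LENGTH.size
            out.append(bytes(self._map[pos:pos + n]))
            pos += n
        return out

    def close(self):
        self._map.close()
        self._file.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "stream.log")
    log = PersistentLog(path)
    log.append(b"reading-1")
    log._write_payload(b"torn")        # simulated crash before publishing
    log.close()
    recovered = PersistentLog(path)
    print(recovered.records())         # the unpublished record is not listed
    recovered.close()
```

On real hardware the header update would additionally have to fit into a single
atomic store, which is why the sketch keeps it at 8 bytes.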

Our proposal has three objectives. First, we plan to define an operational model
of transactional stream processing that handles streams and tables as well as
queries and updates on them in a unified way. This includes modeling data
streams as updates on tables and updates as sources of data streams while
providing transactional guarantees (including isolation). The second objective
is the design and evaluation of novel data and index structures for persistent
streams and tables optimized for NVRAM. For this purpose, we take into account
the specific properties of NVRAM (asymmetry of reads/writes, atomicity of
writes) and the access patterns of transactional stream processing (e.g. mainly
append-optimized, time-based ordering of elements, but also random update
patterns). Furthermore, as a third objective we plan to investigate and evaluate
architectural patterns for NVRAM-based stream processing. Since NVRAM offers a
technology that combines the best parts of DRAM and disks, i.e. speed and
instant persistency, it is not obvious how NVRAM will fit into the current
system architecture. Therefore, we will analyze patterns such as NVRAM alone as
a replacement for DRAM and/or disk, NVRAM as anti-cache for DRAM, disk as
anti-cache for NVRAM, or simple DRAM+NVRAM+disk setups in the context of
transactional stream processing.
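
The first objective, treating streams as updates on tables and table updates as
sources of streams, can be sketched in a few lines. This is an illustration
under simplifying assumptions, not the proposed operational model: the Table
class, its apply and subscribe methods, and the (key, old, new) change format
are hypothetical, everything is kept in memory, and only atomicity of a batch is
shown (no isolation between concurrent transactions).

```python
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple

Change = Tuple[Any, Optional[Any], Any]   # (key, old value, new value)


class Table:
    """Keyed table that consumes a stream of upserts and emits a change stream."""

    def __init__(self) -> None:
        self._rows: Dict[Any, Any] = {}
        self._subscribers: List[Callable[[Change], None]] = []

    def subscribe(self, callback: Callable[[Change], None]) -> None:
        # Table updates become a source of stream elements for the subscriber.
        self._subscribers.append(callback)

    def apply(self, batch: Iterable[Tuple[Any, Any]]) -> None:
        # Stream elements are applied as one all-or-nothing batch of upserts.
        staged = dict(self._rows)          # work on a private copy
        changes: List[Change] = []
        for key, value in batch:           # a malformed element raises here
            changes.append((key, staged.get(key), value))
            staged[key] = value
        self._rows = staged                # commit point
        for change in changes:             # emit changes only after the commit
            for callback in self._subscribers:
                callback(change)

    def get(self, key: Any) -> Optional[Any]:
        return self._rows.get(key)


if __name__ == "__main__":
    table = Table()
    table.subscribe(print)
    table.apply([("sensor-1", 20.5), ("sensor-2", 18.0)])
    table.apply([("sensor-1", 21.0)])
```

Copying the whole table per batch keeps the sketch short; an NVRAM-resident
structure would instead need the kind of write-optimized, atomically updatable
layout that the second objective targets.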