Distributed, fault-tolerant in-place consensus sequence on innovative hardware as building block for data management (RE1389/10)

PI: Alexander Reinefeld, Florian Schintke
Project Collaborator: Florian Schintke, Thorsten Schütt, Jan Skrzypczak
Project website: Distributed, fault-tolerant in-place consensus sequence on innovative hardware as building block for data management

Quorum consensus algorithms like the Paxos algorithm are widely used as basic building blocks for fault-tolerance in distributed systems. Unfortunately, distributed quorum consensus causes much overhead to negotiate and safely store the consensus. We therefore plan to optimize Paxos-based fault-tolerance for sequences of consensus in three ways:

  1. exploit multicast and reduce operations of modern interconnects to reduce the latency and number of messages,
  2. use remote direct memory access (RDMA) in combination with NVRAM to manage a distributed shared state,
  3. modify Paxos to support a sequence of consensus decisions in-place and avoid separate memory resources for each Paxos instance.
  4. We then build efficient custom datatypes on top of consensus sequences, that support partial updates, multiple-reader-single-writer locks, or compare-and-swap semantics.

The resulting distributed fault-tolerant consensus will provide low latency and high-throughput decisions. It will allow to apply recoverable distributed consensus in new scenarios where it was avoided before due to its high latency. The optimized consensus can be used as a building block in current and future distributed data management and database systems – including those developed in SPP 2037 – that often rely on a sequence of decisions to process locks, transactions, to make atomic changes like compare and swap, to support replicated state machines, or to elect the next master etc.