“FPTree: A Hybrid SCM-DRAM Persistent and Concurrent B-Tree for Storage Class Memory” (2016)
Summary: The paper proposes a hybrid NVM-DRAM B+-tree, whereby the leaves are stored unsorted in NVM and the inner nodes sorted in DRAM. These are reconstructed in case of a restart. For failure-atomicity it makes use of indirection and invalidation. For more complex operations (split/merge) micro-logs are used. To reduce the search overhead fingerprints (hashing) are added. For concurrency the FPTree uses a combination of HTM (DRAM) and fine-grained locks (NVM).
“BzTree: A High-Performance Latch-free Range Index for Non-Volatile Memory” (2018)
Summary: This paper presents a latch-free B-tree for NVM. They use a persistent multi-word CaS operation (PMwCAS) which buffers changes and then atomically applies these via a descriptor providing both failure-atomicity and thread-safety. On the one hand, this simplifies implementation and, on the other, it enables high performance. The recovery includes only advancing or undoing incomplete PMwCAS operations.
“WiscKey: Separating Keys from Values in SSD-Conscious Storage” (2016)
Summary: LSM-trees are widely used as the index of choice for high volume, write intensive applications and are often featured as the backbone for key-value stores. One of the key advantages of LSM-trees is that updates are first batched in main memory before a batch is written to secondary storage in a sequential pattern. To increase query performance, those batches are also merged in the background through multiple index levels of varying sizes. While this behavior works well for HDDs, it overlooks some potential of SSD technology. Background merges of multiple levels means data written multiple times, contributing to SSD wear leaving. Furthermore, SSDs have multiple channels and planes which can work in parallel. To leverage these features, the authors propose to perform the LSM mechanism on keys only while storing values in separate log structure. Since only keys are merged in the background, this reduces write amplification, while parallelism is exploited for range queries through utilizing multiple threads retrieving values from the value log in parallel.