Copy Ahead Segment Ring: an Ephemeral Memtable Design for Distributed LSM Tree
Abstract
We present a Memtable design for distributed LSM trees that reduces contention among
concurrent range scans, writes, and prune operations. The design centralizes the Memtable
on disaggregated memory, which eliminates inter-node replication and its associated costs.
It can also be adapted to use fixed-size page allocation, which helps minimize memory
fragmentation within disaggregated memory. Our approach builds on established techniques
such as Copy-on-Write B-Tree, Hashed Wheel Timer, Separation of Key and Value, and Single
Thread per Partition with a Lock-Free Ring Buffer, and combines them to trade extra memory
for better throughput under contention. Overall, this work is a step towards exploring the
potential of large disaggregated memory to improve the performance of distributed LSM tree
databases.