Nimble Cluster Index: Efficient Indexed Lookups on Columnar Data
Introduction
Analytical data lakes excel at full-table scans but struggle with point lookups. Key-value stores handle point lookups efficiently but cannot serve analytical queries. What if a single file format could serve both workloads?
Nimble's Cluster Index bridges this gap. It is a lightweight, hierarchical index structure embedded directly inside Nimble columnar files. It enables O(log n) point lookups and range scans on sorted data — without a separate index file, without an external service, and without sacrificing Nimble's columnar scan performance.
We have integrated the cluster index with Presto for analytical index joins and are actively integrating with ZippyDB for prefix key scans — both powered by the same underlying index structure, served through Velox.





