Nimble Cluster Index: Efficient Indexed Lookups on Columnar Data

Xiaoxuan Meng
Software Engineer @ Meta
Jialiang Tan
Software Engineer @ Meta
Zac Wen
Software Engineer @ Meta
Zhenyuan Zhao
Software Engineer @ Meta
Pedro Pedreira
Software Engineer @ Meta
Masha Basmanova
Software Engineer @ Meta

Introduction

Analytical data lakes excel at full-table scans but struggle with point lookups. Key-value stores handle point lookups efficiently but cannot serve analytical queries. What if a single file format could serve both workloads?

Nimble's Cluster Index bridges this gap. It is a lightweight, hierarchical index structure embedded directly inside Nimble columnar files. It enables O(log n) point lookups and range scans on sorted data — without a separate index file, without an external service, and without sacrificing Nimble's columnar scan performance.
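To make the idea concrete, here is a minimal sketch of a hierarchical lookup over sorted keys. It is an illustration only and does not reflect Nimble's actual on-disk layout or API: the `TwoLevelIndex` class, its `chunk_size` parameter, and the method names are all hypothetical, and keys are assumed to be unique and already sorted. A small top-level array holds the first key of each chunk, so a lookup is one binary search over chunks followed by one binary search inside the chosen chunk.

```python
from bisect import bisect_left, bisect_right


class TwoLevelIndex:
    """Toy hierarchical index over sorted, unique keys (hypothetical, not Nimble's format)."""

    def __init__(self, sorted_keys, chunk_size=4):
        self.chunk_size = chunk_size
        # Split the sorted keys into fixed-size chunks (stand-ins for regions of a file).
        self.chunks = [sorted_keys[i:i + chunk_size]
                       for i in range(0, len(sorted_keys), chunk_size)]
        # Top level: the first key of every chunk.
        self.first_keys = [chunk[0] for chunk in self.chunks]

    def lookup(self, key):
        """Return the row position of `key`, or None if it is absent."""
        # Last chunk whose first key is <= key.
        c = bisect_right(self.first_keys, key) - 1
        if c < 0:
            return None
        chunk = self.chunks[c]
        i = bisect_left(chunk, key)
        if i < len(chunk) and chunk[i] == key:
            return c * self.chunk_size + i
        return None

    def range_scan(self, lo, hi):
        """Return all keys k with lo <= k <= hi, in sorted order."""
        # Start at the last chunk whose first key is < lo (it may still contain lo).
        c = max(bisect_left(self.first_keys, lo) - 1, 0)
        out = []
        for chunk in self.chunks[c:]:
            if chunk[0] > hi:
                break  # Every later chunk starts beyond the range.
            out.extend(chunk[bisect_left(chunk, lo):bisect_right(chunk, hi)])
        return out


idx = TwoLevelIndex(list(range(0, 100, 5)), chunk_size=4)
print(idx.lookup(35))          # 7
print(idx.lookup(36))          # None
print(idx.range_scan(12, 31))  # [15, 20, 25, 30]
```

Both binary searches are logarithmic in the size of what they search, so a point lookup stays O(log n) overall, and a range scan touches only the chunks that overlap the requested range.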

We have integrated the cluster index with Presto for analytical index joins and are actively integrating it with ZippyDB for prefix key scans. Both integrations are powered by the same underlying index structure and served through Velox.