Context-Aware Segments: Solving the “Scatter-Read” Problem
Session Abstract
Traditional OpenSearch segments are context-blind, scattering related data across every segment in a shard. We introduce Context-Aware Segments (CAS), an architecture that brings “sharding” logic to the segment level. By enforcing document locality during indexing, we slashed query latency and shrank the data footprint through superior pruning and compression.
Session Description
#### The Friction: The “Everything, Everywhere” Problem
In distributed search engines like OpenSearch, the Shard is the unit of scale, but the Segment is the unit of storage. Traditionally, documents are written to segments based purely on arrival time. For multi-tenant SaaS platforms or high-velocity observability clusters, this means data for a specific tenant or time-range is scattered across every single segment within a shard. A simple filter query becomes an expensive fan-out operation, thrashing the file system cache and wasting CPU cycles checking documents that will never match.
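The fan-out is easy to see in a toy model (plain Python, with invented tenant names): with round-robin ingest and arrival-time flushing, every segment ends up containing every tenant, so a single-tenant filter must visit all of them.

```python
def flush_by_arrival(docs, segment_size=4):
    """Context-blind indexing: cut segments purely by arrival order."""
    return [docs[i:i + segment_size] for i in range(0, len(docs), segment_size)]

# Round-robin ingest from 4 tenants, as a multi-tenant cluster would see it.
docs = [f"tenant-{i % 4}" for i in range(16)]
segments = flush_by_arrival(docs)

# How many segments must a filter on a single tenant touch?
touched = sum(1 for seg in segments if "tenant-0" in seg)
print(touched, "of", len(segments))  # every segment contains tenant-0
```

Every flush interleaves all active tenants, so the answer is always “all of them”: the filter's selectivity never translates into I/O savings.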
#### The Solution: Context-Aware Segments (CAS)
This session dissects the design and implementation of CAS (OpenSearch RFC #18576) and its foundation in Lucene (Issue #13387). This architectural shift introduces a logical “context” dimension to segment creation. Instead of a temporal log, we treat segments as optimized containers for specific data subsets.
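As a rough illustration of that shift (not the actual Lucene/OpenSearch implementation; the `ContextAwareWriter` name and its shape are invented for this sketch), an indexer can route each document to a per-context buffer so that a flush emits a segment holding exactly one context:

```python
from collections import defaultdict

class ContextAwareWriter:
    """Toy indexer: routes documents into per-context buffers so that each
    flush produces a segment containing a single context (here, a tenant)."""

    def __init__(self, flush_size=3):
        self.buffers = defaultdict(list)   # context key -> pending docs
        self.segments = []                 # flushed "segments"
        self.flush_size = flush_size

    def add(self, doc):
        ctx = doc["tenant"]                # the logical "context" dimension
        self.buffers[ctx].append(doc)
        if len(self.buffers[ctx]) >= self.flush_size:
            self.flush(ctx)

    def flush(self, ctx):
        if self.buffers[ctx]:
            self.segments.append({"context": ctx, "docs": self.buffers.pop(ctx)})

writer = ContextAwareWriter()
for i in range(9):                         # interleaved arrival across 3 tenants
    writer.add({"tenant": f"tenant-{i % 3}", "id": i})
# Despite interleaved ingest, every flushed segment holds exactly one tenant.
```

The key property is that segment membership is decided by the context key, not by the clock, which is what makes the pruning and compression wins below possible.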
In this technical deep-dive, we will cover:
- **Granular Segment Pruning:** How the query coordinator leverages new segment-level metadata to perform “pre-search” filtering, effectively skipping files on disk before the engine even opens them.
- **Vector Segment Pruning:** How we use segment-level metadata to skip entire HNSW graphs during a k-NN search. If a segment doesn’t contain “Tenant A,” we don’t even load its vector blobs into memory.
- **Supercharged Compression:** We demonstrate how grouping similar data by context significantly increases compression ratios. When the storage engine sees repetitive data patterns in a single segment, the bit-packing and dictionary compression become far more efficient, slashing the data footprint.
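A minimal sketch of metadata-driven pruning (a hypothetical Python model, not the RFC's actual API): the coordinator consults per-segment context metadata and never opens segments that cannot match. For k-NN, “opening” stands in for loading the HNSW graph and vector blobs.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    name: str
    contexts: frozenset      # segment-level metadata written at flush time
    opened: bool = False     # did we pay the cost of opening files / graphs?

    def search(self, tenant):
        self.opened = True   # stands in for mmapping postings or an HNSW graph
        return [self.name] if tenant in self.contexts else []

def pruned_search(segments, tenant):
    """Pre-search filtering: consult metadata only; skip non-matching segments."""
    return [hit for s in segments if tenant in s.contexts for hit in s.search(tenant)]

# Context-aware layout: one tenant per segment.
segs = [Segment(f"seg-{i}", frozenset({f"tenant-{i}"})) for i in range(10)]
hits = pruned_search(segs, "tenant-3")
opened = sum(s.opened for s in segs)   # only 1 of 10 segments was opened
```

Without the metadata check, all ten segments would be opened and nine of them searched for nothing; with it, the non-matching files are never touched at all.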
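The compression effect reproduces with a generic LZ codec (zlib here as a stand-in; Lucene's stored-field and doc-value codecs differ in detail). With enough tenants, interleaved arrival pushes the repeat distance for any one tenant's records past zlib's 32 KB LZ77 window, so the codec finds no long matches; grouped by tenant, the repeats are adjacent and compress away.

```python
import random
import zlib

random.seed(7)
tenants = [f"tenant-{i:03d}" for i in range(500)]
# One distinctive ~100-byte record per tenant, repeated 20 times
# (a stand-in for highly repetitive per-tenant log lines).
record = {t: t + ":" + "".join(random.choices("0123456789abcdef", k=90))
          for t in tenants}

interleaved = "\n".join(record[t] for _ in range(20) for t in tenants)  # arrival order
grouped = "\n".join(record[t] for t in tenants for _ in range(20))      # context order

size_i = len(zlib.compress(interleaved.encode(), 6))
size_g = len(zlib.compress(grouped.encode(), 6))
# Grouped layout keeps repeats inside zlib's 32 KB window; interleaved does not,
# so the grouped bytes compress to a small fraction of the interleaved size.
print(size_g, "<", size_i)
```

The same locality argument applies to bit-packing and dictionary coding in the storage engine: a segment whose values come from one context has a smaller value dictionary and narrower numeric ranges.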