[core][python] Add global index search modes by JingsongLi · Pull Request #8255 · apache/paimon

JingsongLi · 2026-06-16T13:37:05Z

Summary

Add global-index.search-mode as the freshness/performance switch for global-index queries. The default mode, fast, reads only from the index. The full mode compares the snapshot nextRowId with index row-id coverage and scans raw data only when an uncovered range exists. The detail mode scans data file metadata to find the exact unindexed ranges caused by updates or rewrites.
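To make the three values concrete, here is a minimal Python sketch of how such an option could be parsed. Only the option key, its three values, and the fast default come from this PR. The names `SearchMode` and `parse_search_mode`, and the case-insensitive matching, are assumptions made for illustration and are not taken from the Paimon codebase.

```python
from enum import Enum

# Option key introduced by this PR.
OPTION_KEY = "global-index.search-mode"


class SearchMode(Enum):
    """Hypothetical enum; the values mirror the option values in this PR."""

    FAST = "fast"      # index-only reads (the default)
    FULL = "full"      # check nextRowId against index coverage first
    DETAIL = "detail"  # scan data file metadata for exact unindexed ranges


def parse_search_mode(options: dict) -> SearchMode:
    """Read the search mode from a plain options dict, defaulting to fast.

    Case-insensitive matching is an assumption of this sketch.
    """
    raw = options.get(OPTION_KEY, SearchMode.FAST.value)
    try:
        return SearchMode(raw.strip().lower())
    except ValueError:
        allowed = ", ".join(mode.value for mode in SearchMode)
        raise ValueError(
            f"Unknown {OPTION_KEY} value {raw!r}; expected one of: {allowed}"
        )
```

A caller would pass the table options as a dict, for example `parse_search_mode({"global-index.search-mode": "full"})`. An unset option falls back to fast, which matches the default described above.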

Changes

Replace the unreleased global-index.fast-search option with global-index.search-mode = fast | full | detail in Java, Python, and generated docs.
Make scalar global-index scans and vector search honor the three search modes.
In full mode, use snapshot nextRowId and global-index row-id coverage to avoid planning all data files unless an uncovered range exists.
In detail mode, scan data file metadata to detect exact unindexed row ranges, including invalidation caused by file rewrites or updates.
Carry snapshot nextRowId through Java and Python vector scan plans so vector raw-data search can use the lightweight full path.
Update Java and Python tests for default fast mode, full-mode freshness, filtered unindexed scans, and detail-mode exact range detection.
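The full-mode check described in the list above comes down to interval arithmetic, which can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: row ids are assumed to start at 0, ranges are assumed to be half-open `(start, end)` pairs, and `uncovered_row_ranges` is a hypothetical name. How detail mode detects invalidation from rewrites or updates is not described here and is not modeled.

```python
from typing import Iterable, List, Tuple

# Half-open row-id range [start, end); this representation is an assumption.
RowRange = Tuple[int, int]


def uncovered_row_ranges(
    next_row_id: int, indexed_ranges: Iterable[RowRange]
) -> List[RowRange]:
    """Return the parts of [0, next_row_id) that no indexed range covers."""
    gaps: List[RowRange] = []
    cursor = 0  # first row id not yet known to be covered
    for start, end in sorted(indexed_ranges):
        if start >= end or end <= cursor:
            continue  # empty range, or already behind the cursor
        if start >= next_row_id:
            break  # everything from here on lies past the snapshot
        if start > cursor:
            gaps.append((cursor, start))  # hole before this indexed range
        cursor = end
        if cursor >= next_row_id:
            break
    if cursor < next_row_id:
        gaps.append((cursor, next_row_id))  # rows written after indexing
    return gaps


def needs_raw_scan(next_row_id: int, indexed_ranges: Iterable[RowRange]) -> bool:
    """True when at least one row id in the snapshot is not indexed."""
    return bool(uncovered_row_ranges(next_row_id, indexed_ranges))
```

When the index covers every row id below nextRowId, the gap list is empty and, under these assumptions, no data files need to be planned. Otherwise only the returned ranges are candidates for a raw-data scan.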

Testing

git diff --check
rg -n "fast-search|global-index\\.fast-search|GLOBAL_INDEX_FAST_SEARCH|globalIndexFastSearch|global_index_fast_search|slowSearch|slow search|FastSearch" paimon-api paimon-core paimon-python docs/docs docs/generated
python -m compileall paimon-python/pypaimon/common/options/core_options.py paimon-python/pypaimon/read/scanner/file_scanner.py paimon-python/pypaimon/table/source/vector_search_read.py paimon-python/pypaimon/table/source/vector_search_scan.py
python -m pytest paimon-python/pypaimon/tests/global_index_test.py::PlanSnapshotFetchRegressionTest::test_search_mode_detail_filters_unindexed_rows_exactly paimon-python/pypaimon/tests/vector_search_filter_test.py::VectorSearchManySplitsTest::test_search_mode_controls_unindexed_range_scan
mvn -pl paimon-api,paimon-core -DskipTests spotless:check
mvn -pl paimon-core -am -Pfast-build -DskipTests -DfailIfNoTests=false compile test-compile
mvn -pl paimon-core -am -Pfast-build -DfailIfNoTests=false -Dtest=VectorSearchBuilderTest#testVectorSearchFullModeScansUnindexedData+testVectorSearchFastModeSkipsUnindexedDataByDefault+testVectorSearchFullModeScansFilteredUnindexedData,BtreeGlobalIndexTableTest#testBTreeGlobalIndexSearchModeControlsUnindexedData test

Notes

global-index.fast-search is intentionally not kept as a compatibility alias because it has not been released.

JingsongLi changed the title ~~[core] Support raw fallback for vector search~~ [WIP][core] Support raw fallback for vector search Jun 16, 2026

JingsongLi changed the title ~~[WIP][core] Support raw fallback for vector search~~ [core] Support raw fallback for vector search Jun 16, 2026

JingsongLi marked this pull request as draft June 16, 2026 13:46

JingsongLi changed the title ~~[core] Support raw fallback for vector search~~ [core][python] Support raw fallback for vector search Jun 16, 2026

JingsongLi force-pushed the codex/vector-raw-fallback branch from 658ab46 to baefa9d Compare June 18, 2026 05:43

JingsongLi changed the title ~~[core][python] Support raw fallback for vector search~~ [core][python] Add global index fast search option Jun 18, 2026

JingsongLi force-pushed the codex/vector-raw-fallback branch 3 times, most recently from 867c483 to 8b1c6b2 Compare June 19, 2026 03:29

JingsongLi added 7 commits June 19, 2026 15:28

[core][python] Add global index fast search option

53a2a04

fix checkstyle

447e3fc

[vector] Rename native vector global indexer

b2350bc

[vector] Add native prefix to vector index classes

03ca56c

[vector] Fix native vector index formatting

8e0ea12

[lance] Add test dependency for vector fast search

a5b6471

[core][python] Add global index search modes

a49d70c

JingsongLi force-pushed the codex/vector-raw-fallback branch from 23cc0fb to a49d70c Compare June 19, 2026 07:29

JingsongLi changed the title ~~[core][python] Add global index fast search option~~ [core][python] Add global index search modes Jun 19, 2026

JingsongLi added 3 commits June 19, 2026 15:54

[core][python] Refactor unindexed global index scan

8182023

[spark] Fix vector search mode compile

33e1e97

[python] Fix global index search lint

be05b44

JingsongLi mentioned this pull request Jun 20, 2026

[core] Add global index search mode #8296

Merged

4 tasks

JingsongLi wants to merge 10 commits into apache:master from JingsongLi:codex/vector-raw-fallback
