Skip to content

Use the doc-values skip index to skip per-doc value lookups in LongRangeFacetCutter#16268

Draft
slow-J wants to merge 4 commits into
apache:mainfrom
slow-J:lucene-16249-skipper-range-facets
Draft

Use the doc-values skip index to skip per-doc value lookups in LongRangeFacetCutter#16268
slow-J wants to merge 4 commits into
apache:mainfrom
slow-J:lucene-16249-skipper-range-facets

Conversation

@slow-J

@slow-J slow-J commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Resolves #16249

Implementation heavily inspired by HistogramCollector.java.

Range faceting (in the sandbox module -LongRangeFacetCutter) currently reads the doc-values value for every matching document and binary-searches it into an elementary interval. When the faceted field is single-valued, we can use a doc-values skip index. For a dense skip block whose min and max values fall into the same elementary interval, every document in that block maps to that interval, allowing us to skip the per-doc value lookup and binary search.

Limitation - applies to single-valued, long fields only.

Benchmark (luceneutil)

I used my branch of https://github.com/slow-J/luceneutil/tree/github-16249-range-facet-bench which cherry picked 2 of @epotyom 's commits (mainly mikemccand/luceneutil#582 which adds range-facet support)

Setup:
runlocal.py, wikimediumall (33.3M docs), index-sorted by lastMod_skipper with
addDVSkippers=true. baseline = main, candidate = this change, both DURING_COLLECTION, so
the only difference is this optimization. 30 JVM iterations.

Command: python3 -u src/python/localrun.py -s rangeFacetsWikimediumAll -b lucene_baseline -c lucene_candidate -iterations 30 -warmups 20 2>&1 | tee "$BASE/run-timing7.txt"

Edit: new benchmark results after the changes for Egors first 2 comments.
Edit2: new benchmark results after unwrapping removed

QPS

Task QPS baseline StdDev QPS modified StdDev Pct diff p-value
BrowseLastModOvlpRangeFacets 1.26 (7.7%) 2.72 (10.6%) 115.5% (90% - 145%) 0.000
BrowseLastModRangeFacets 2.21 (6.0%) 3.31 (8.8%) 50.0% (33% - 68%) 0.000
MedTermLastModOvlpRangeFacets 3.82 (13.5%) 5.48 (5.7%) 43.5% (21% - 72%) 0.000
MedTermLastModRangeFacets 4.15 (13.6%) 5.26 (7.9%) 26.5% (4% - 55%) 0.000
BrowseIDOvlpRangeFacets 1.21 (6.6%) 1.10 (6.7%) -9.6% (-21% - 4%) 0.000
BrowseIDRangeFacets 2.33 (8.6%) 2.57 (5.1%) 10.1% (-3% - 26%) 0.000
MedTermIDOvlpRangeFacets 3.79 (13.5%) 4.61 (11.1%) 21.6% (-2% - 53%) 0.000
MedTermIDRangeFacets 5.98 (4.6%) 5.92 (2.7%) -0.9% (-7% - 6%) 0.340

Latency (ms) — aggregated across all iterations

Task P50 B P50 C Diff P90 B P90 C Diff P99 B P99 C Diff P999 B P999 C Diff P100 B P100 C Diff
BrowseLastModOvlpRangeFacets 844.184 386.006 -54.3% 1437.289 581.094 -59.6% 7523.983 828.460 -89.0% 9510.480 868.764 -90.9% 9555.393 888.500 -90.7%
BrowseLastModRangeFacets 474.762 319.836 -32.6% 854.574 546.789 -36.0% 4412.829 781.421 -82.3% 7775.105 862.760 -88.9% 7910.258 893.449 -88.7%
MedTermLastModOvlpRangeFacets 286.226 187.654 -34.4% 552.668 436.448 -21.0% 771.820 599.881 -22.3% 1327.279 705.213 -46.9% 1445.804 707.766 -51.0%
MedTermLastModRangeFacets 260.932 200.115 -23.3% 652.004 510.872 -21.6% 847.848 635.331 -25.1% 2966.134 743.950 -74.9% 3060.317 745.647 -75.6%
BrowseIDOvlpRangeFacets 860.895 976.209 +13.4% 1419.693 1279.444 -9.9% 8271.185 1476.704 -82.1% 9919.502 1531.237 -84.6% 9928.280 1536.195 -84.5%
BrowseIDRangeFacets 461.967 404.593 -12.4% 799.144 625.845 -21.7% 5972.427 860.420 -85.6% 8963.973 930.259 -89.6% 9483.903 942.619 -90.1%
MedTermIDOvlpRangeFacets 294.831 235.198 -20.2% 676.861 539.088 -20.4% 897.009 671.736 -25.1% 1835.175 742.857 -59.5% 2055.182 744.089 -63.8%
MedTermIDRangeFacets 169.089 170.565 +0.9% 495.786 401.676 -19.0% 697.206 591.299 -15.2% 1026.169 690.797 -32.7% 1647.263 695.272 -57.8%

@slow-J slow-J force-pushed the lucene-16249-skipper-range-facets branch from 03d7d2a to 066c419 Compare June 17, 2026 16:03
@github-actions github-actions Bot added this to the 10.5.0 milestone Jun 17, 2026
@slow-J

slow-J commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

I reran benchmarks, this time correctly using localrun, and updated the results in #16268 (comment)

@slow-J slow-J force-pushed the lucene-16249-skipper-range-facets branch from 2e7144b to 0c72d5f Compare June 19, 2026 14:45

@epotyom epotyom left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice change! One suggestion below

for (int level = 0; level < skipper.numLevels(); ++level) {
int totalDocsAtLevel = skipper.maxDocID(level) - skipper.minDocID(level) + 1;
if (skipper.docCount(level) != totalDocsAtLevel) {
// Some docs at this level have no value, so we can't resolve the whole block at once.

@epotyom epotyom Jun 19, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think skipper can stil improve performance for this case, as we can still cache ordinal, it is just that in this case we have to always call longValues.advanceExact(doc). If it returns true - we return cached ordinal (and avoid reading long value as well as binary search elementary interval), otherwise return false

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review Egor! Good point!
I'll try to address these 2 comments and run new benchmarks.

@slow-J slow-J Jun 23, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took me some time to publish the changes as I managed to initially introduce a regression into range facets without skipper, I done a small refactor while fixing that.

I am getting much better performance after implementing your suggestions, I will update the benchmark results in the top level comment.
Edit: updated benchmark results.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm looking at the new benchmark results, there is an improvement in the ID tasks, which do not have a doc-values skip index. This is due to a change in the latest commit.

id is a single-valued field, but in main, fromLongField never unwraps to a single-valued source, always picking the multi-valued leaf cutter even when the field is single-valued.

We now route single-valued segments to the single-valued cutter instead. The new create(String field, …) keeps the field name, which lets createLeafCutter inspect each segment during search and pick the right cutter.

I have kept this in this PR but its slightly increasing the scope.

@epotyom let me know what you think about this.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, nice catch!

This is due to a change in the latest commit.

Does the latest commit also include the interval-tracking rewind change? If so, could you please run benchmarks for the unwrapping change only?

Unwrapping adds a little bit of complexity, but if it improves performance, I think we should keep it.

@slow-J slow-J Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the latest commit also include the interval-tracking rewind change?

Yes, it has all three changes: non-dense fast path, rewind reuse, and unwrapping.

Unwrapping adds a little bit of complexity, but if it improves performance, I think we should keep it.

I'll setup and run a benchmark now just to see the perf diff due to the unwrapping.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ran: python3 -u src/python/localrun.py -s rangeFacetsWikimediumAll -b lucene_baseline -c lucene_candidate -iterations 30 -warmups 20 2>&1 | tee "$BASE/run9-onlyunwrapping-timing.txt"

Heres the result.

Latency (ms) — aggregated across all iterations

Task P50 B P50 C Diff P90 B P90 C Diff P99 B P99 C Diff P999 B P999 C Diff P100 B P100 C Diff
BrowseIDOvlpRangeFacets 946.928 820.170 -13.4% 1279.722 1178.839 -7.9% 1509.599 3489.425 +131.1% 1579.256 10588.392 +570.5% 1647.232 10662.196 +547.3%
BrowseIDRangeFacets 410.635 296.512 -27.8% 630.174 670.131 +6.3% 869.641 866.369 -0.4% 931.130 8681.360 +832.3% 1008.106 9240.540 +816.6%
BrowseLastModOvlpRangeFacets 381.116 365.262 -4.2% 577.681 639.843 +10.8% 866.191 2046.696 +136.3% 1028.218 10214.331 +893.4% 1029.600 10234.017 +894.0%
BrowseLastModRangeFacets 324.172 317.631 -2.0% 526.130 634.818 +20.7% 808.836 884.852 +9.4% 851.734 9413.995 +1005.3% 887.375 9912.493 +1017.1%
MedTermIDOvlpRangeFacets 213.936 181.524 -15.1% 433.905 437.942 +0.9% 603.100 702.917 +16.6% 679.574 838.447 +23.4% 761.983 848.642 +11.4%
MedTermIDRangeFacets 211.334 171.214 -19.0% 519.706 555.351 +6.9% 735.139 713.797 -2.9% 834.724 836.369 +0.2% 840.486 843.231 +0.3%
MedTermLastModOvlpRangeFacets 176.353 173.086 -1.9% 486.851 537.847 +10.5% 712.630 719.706 +1.0% 830.068 832.742 +0.3% 842.260 840.165 -0.2%
MedTermLastModRangeFacets 188.926 182.942 -3.2% 487.783 553.020 +13.4% 731.373 718.764 -1.7% 830.363 832.185 +0.2% 841.503 834.076 -0.9%

QPS

Task QPS baseline StdDev QPS modified StdDev Pct diff p-value
BrowseLastModRangeFacets 3.26 (5.8%) 3.34 (8.1%) 2.6% (-10% - 17%) 0.155
BrowseLastModOvlpRangeFacets 2.75 (5.1%) 2.85 (6.2%) 3.6% (-7% - 15%) 0.015
MedTermLastModRangeFacets 5.58 (7.9%) 5.80 (8.6%) 3.8% (-11% - 22%) 0.072
MedTermLastModOvlpRangeFacets 5.89 (6.9%) 6.13 (8.2%) 4.1% (-10% - 20%) 0.035
MedTermIDOvlpRangeFacets 4.94 (6.3%) 5.68 (6.6%) 15.0% (1% - 29%) 0.000
MedTermIDRangeFacets 5.19 (9.9%) 6.17 (7.1%) 18.7% (1% - 39%) 0.000
BrowseIDOvlpRangeFacets 1.12 (7.4%) 1.33 (11.9%) 19.1% (0% - 41%) 0.000
BrowseIDRangeFacets 2.52 (4.5%) 3.60 (12.0%) 42.7% (25% - 61%) 0.000

Mainly impacts the ID tasks which do not have a skipper.
But it does seem to cause a worrying latency regression at high percentile latency (p90 and onward).

Hmm, @epotyom what do you think? Since it is not related to the skipper change, I am partial towards removing the unwrapping and retesting performance.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me, let's separate these two changes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the unwrapping in latest commit, will edit the main benchmark in PR description with new numbers.

@slow-J slow-J force-pushed the lucene-16249-skipper-range-facets branch from 1065433 to 7db2833 Compare June 23, 2026 10:39
@github-actions github-actions Bot modified the milestones: 10.5.0, 10.6.0 Jun 23, 2026
@slow-J slow-J requested a review from epotyom June 23, 2026 11:29
@slow-J slow-J force-pushed the lucene-16249-skipper-range-facets branch from 7db2833 to 88fe293 Compare June 29, 2026 11:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Can we use DocValuesSkipper for range facets?

2 participants