Oximeter: jitter collection intervals. #10711

Open
jmcarp wants to merge 1 commit into main from jmcarp/oximeter-collector-jitter

@jmcarp jmcarp commented Jul 1, 2026

Contributor

As of this writing, the oximeter collector's database batcher can queue at most 100_000 samples. In realistic conditions, it can receive about 80_000 samples at once: each mgs producer sends about 40_000 per collection on a full rack, and because the two mgs producers are collected with the same interval and offset, all of their samples arrive at roughly the same time. As a result, even when overall metrics volume isn't particularly high, we can drop samples when too many arrive at once.
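
To make the arithmetic concrete, here is a minimal sketch of the headroom calculation using the figures quoted above. The function name `headroom_after_burst` is hypothetical and is not part of oximeter; it only illustrates how little room a synchronized burst leaves in the queue.

```rust
/// Hypothetical helper (not an oximeter API): how many queue slots remain
/// if `producers` producers each deliver `samples_per_producer` samples
/// at the same instant into a queue holding `queue_capacity` samples.
/// Saturates at zero, which is the point where samples start being dropped.
fn headroom_after_burst(
    queue_capacity: usize,
    producers: usize,
    samples_per_producer: usize,
) -> usize {
    let burst = producers.saturating_mul(samples_per_producer);
    queue_capacity.saturating_sub(burst)
}
```

With the numbers from the description, `headroom_after_burst(100_000, 2, 40_000)` is `20_000`, so a single synchronized mgs collection leaves only a fifth of the queue for every other producer that reports at the same moment.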

To mitigate thundering herds of samples, this patch adds a random (jittered) offset to each collection task's timer, drawn from the range 0..interval. This doesn't save us from having to size the collector queue properly, but it spreads load out over time on average.
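
As an illustration of the idea, the sketch below picks a start offset in `0..interval` using only the standard library. It is not the code in this patch: the names `jittered_offset`, `random_u64`, and `first_collection_delay` are hypothetical, and using `RandomState` as a source of randomness is an assumption made to keep the example free of external crates.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
use std::time::Duration;

/// Hypothetical helper: map a raw random value onto an offset in the
/// half-open range 0..interval. A zero interval yields a zero offset.
fn jittered_offset(interval: Duration, random: u64) -> Duration {
    let nanos = interval.as_nanos();
    if nanos == 0 {
        return Duration::ZERO;
    }
    // The remainder is strictly less than `nanos` and no larger than
    // `random`, so it always fits back into a u64.
    let offset = (random as u128) % nanos;
    Duration::from_nanos(offset as u64)
}

/// Assumption for this sketch: draw a random u64 from the randomly keyed
/// hasher in std, rather than from a dedicated RNG crate.
fn random_u64() -> u64 {
    RandomState::new().build_hasher().finish()
}

/// Hypothetical helper: how long a collection task waits before its first
/// tick. Later ticks would then follow every `interval` from that point.
fn first_collection_delay(interval: Duration) -> Duration {
    jittered_offset(interval, random_u64())
}
```

Because each task draws its own offset, two producers that share an interval are unlikely to tick at the same moment, though nothing prevents two offsets from landing close together. That is why the queue still has to be sized for the worst case.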

Part of #10552.

@jmcarp jmcarp requested a review from bnaecker July 1, 2026 17:54
@jmcarp jmcarp force-pushed the jmcarp/oximeter-collector-jitter branch from 424b1eb to 25ef2f7 Compare July 1, 2026 23:28
@jmcarp jmcarp force-pushed the jmcarp/oximeter-collector-jitter branch from 25ef2f7 to 2f05d86 Compare July 1, 2026 23:52