
Clear worker slot metrics on stop #2412

Open
kritibehl wants to merge 2 commits into temporalio:main from kritibehl:fix-worker-slots-available-metric-on-stop
@kritibehl

Contributor

What was changed

This PR clears worker slot gauges when a worker stops.

Specifically, it adds a metric cleanup method to the tracking slot supplier and calls it from baseWorker.Stop() once the worker has shut down and drained. The cleanup is idempotent, and a guard prevents slot releases that arrive after shutdown from republishing stale slot availability.

A regression test was added to verify that worker slot gauges are reset to zero on stop and remain zero after a late slot release.
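As a rough illustration of the guard described above, here is a minimal, self-contained sketch. It is not the code in this PR: `gauge`, `slotTracker`, `StopMetrics`, and the other names are hypothetical stand-ins for the SDK's metrics gauges and tracking slot supplier, simplified to show only the idempotent reset and the "no republish after stop" behavior.

```go
package main

import (
	"fmt"
	"sync"
)

// gauge is a hypothetical stand-in for a metrics gauge.
// It only remembers the last value that was published.
type gauge struct {
	mu    sync.Mutex
	value float64
}

func (g *gauge) Update(v float64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.value = v
}

func (g *gauge) Value() float64 {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.value
}

// slotTracker is a hypothetical, simplified model of a tracking slot supplier.
type slotTracker struct {
	mu        sync.Mutex
	capacity  int
	used      int
	stopped   bool
	available *gauge
	usedGauge *gauge
}

func newSlotTracker(capacity int) *slotTracker {
	t := &slotTracker{capacity: capacity, available: &gauge{}, usedGauge: &gauge{}}
	t.mu.Lock()
	defer t.mu.Unlock()
	t.publishLocked()
	return t
}

// publishLocked writes current slot counts to the gauges.
// It is a no-op once metrics have been stopped. Callers must hold t.mu.
func (t *slotTracker) publishLocked() {
	if t.stopped {
		return
	}
	t.available.Update(float64(t.capacity - t.used))
	t.usedGauge.Update(float64(t.used))
}

// Reserve takes a slot if one is free and the tracker has not been stopped.
func (t *slotTracker) Reserve() bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.stopped || t.used >= t.capacity {
		return false
	}
	t.used++
	t.publishLocked()
	return true
}

// Release returns a slot. After StopMetrics it still updates internal
// bookkeeping but does not touch the gauges.
func (t *slotTracker) Release() {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.used > 0 {
		t.used--
	}
	t.publishLocked()
}

// StopMetrics zeroes the gauges exactly once; repeated calls are harmless.
func (t *slotTracker) StopMetrics() {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.stopped {
		return
	}
	t.stopped = true
	t.available.Update(0)
	t.usedGauge.Update(0)
}

func main() {
	t := newSlotTracker(2)
	t.Reserve()
	fmt.Println("available before stop:", t.available.Value())

	t.StopMetrics()
	t.StopMetrics() // second call is a no-op
	t.Release()     // late release must not republish availability
	fmt.Println("available after stop and late release:", t.available.Value())
}
```

In this sketch the stop flag and the gauge writes share one mutex, so a release racing with stop either publishes before the reset or is suppressed after it. Whether the actual change uses a mutex, an atomic flag, or another mechanism is not stated in this description.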

Why?

Stopped workers should not continue reporting available task capacity through worker_task_slots_available.

Without cleanup, slot metrics can remain at their last published value after worker shutdown, which may leave stale capacity signals in metrics reporting.

Checklist

  1. Closes #873 (Ensure slots available metric is updated on worker stop)

  2. How was this tested:

  • go test ./internal -run 'TestScalableTaskPollerSuite/TestTrackingSlotSupplierStopsSlotMetrics' -count=1
  • go test ./internal -run 'TestScalableTaskPollerSuite/TestTrackingSlotSupplier' -count=1
  • go test ./internal -race -run 'TestScalableTaskPollerSuite/TestTrackingSlotSupplierStopsSlotMetrics' -count=1
  • go test ./internal -count=1
  3. Any docs updates needed?

No docs updates needed.

@kritibehl kritibehl requested a review from a team as a code owner June 20, 2026 22:32

Development

Successfully merging this pull request may close these issues.

Ensure slots available metric is updated on worker stop
