test(e2e): show CI timings and keep services on fork timeout #173

Draft
aprimakina wants to merge 1 commit into main from ci-e2e-timing-keep-on-fork-timeout

@aprimakina aprimakina commented Jun 30, 2026

Contributor

Summary

Two improvements to the E2E fork integration test and CI, aimed at diagnosing intermittent fork-provisioning slowness:

  • CI test timings always visible — add -v to the CI go test command (.github/workflows/test.yml) so per-test and per-subtest durations are printed (--- PASS: TestServiceForkIntegration/ForkService_Now (35.2s)). Makes it easy to see how long each E2E test ran and spot a slow/hanging operation.
  • Keep services on a fork timeout — in TestServiceForkIntegration, when a fork command fails specifically because its --wait-timeout was exceeded (CLI exit code 2), the source service and the forked service are no longer deleted, so the failure can be investigated. Remaining subtests are skipped, and the deferred cleanups log the kept service IDs. The existing 1h sweepStaleIntegrationServices reclaims them on a later run, so nothing leaks permanently.

Non-timeout failures and the normal success path are unchanged — services are still cleaned up as before.
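The keep-on-timeout behavior described above can be pictured with a minimal sketch. This is not the PR's code: `forkRun`, its fields, and the service IDs are hypothetical stand-ins, and real deletion and logging calls are replaced by slices so the flow is easy to follow in isolation.

```go
package main

import "fmt"

// forkRun is a hypothetical holder for state shared between the fork
// subtests and the deferred cleanups. It is not the type used in the PR.
type forkRun struct {
	keepServices bool     // set when a fork hit its wait-timeout
	deleted      []string // stands in for real delete calls
	kept         []string // stands in for "kept service" log lines
}

// cleanupService deletes a service unless the run was flagged to keep
// services for investigation, in which case it only records the ID.
func (r *forkRun) cleanupService(id string) {
	if id == "" {
		return // nothing was created, so there is nothing to clean up
	}
	if r.keepServices {
		r.kept = append(r.kept, id)
		return
	}
	r.deleted = append(r.deleted, id)
}

func main() {
	// Normal path: both services are deleted.
	ok := &forkRun{}
	ok.cleanupService("source-svc")
	ok.cleanupService("fork-svc")
	fmt.Println("deleted:", ok.deleted, "kept:", ok.kept)

	// Timeout path: both services are kept and their IDs recorded.
	timedOut := &forkRun{keepServices: true}
	timedOut.cleanupService("source-svc")
	timedOut.cleanupService("fork-svc")
	fmt.Println("deleted:", timedOut.deleted, "kept:", timedOut.kept)
}
```

In an actual test the same flag would presumably also gate the remaining subtests (for example via `t.Skip`), with the stale-service sweeper acting as the backstop for anything that was kept.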

Implementation notes

  • handleForkErr centralizes the three forks' error handling; on a wait-timeout it captures the fork's service ID (the fork command prints the service JSON even on timeout), flags the run, and reports both IDs.
  • Timeout detection uses errors.As against common.ExitCodeError / common.ExitTimeout.
  • The two deferred cleanups were deduplicated into a single cleanupService helper.

Add `-v` to the CI `go test` command so per-test and per-subtest
durations are always visible in CI logs, making it easy to see how long
each E2E test ran and spot slow/hanging operations.

In TestServiceForkIntegration, when a fork command fails specifically
because its wait-timeout was exceeded, keep the source and forked
services instead of deleting them so the failure can be investigated.
Remaining subtests are skipped and the deferred cleanups log the kept
service IDs; the existing 1h stale-service sweeper reclaims them later.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@aprimakina aprimakina marked this pull request as draft June 30, 2026 10:06