DON'T REVIEW YET: Introduce RuntimeConfig wrapper for vMCP ConfigMap surface#5238

Draft
ChrisJBurns wants to merge 1 commit into main from cburns/runtime-config-wrapper

@ChrisJBurns

Collaborator

Summary

VirtualMCPServerSpec.Config is typed as pkg/vmcp/config.Config in v1beta1, so controller-gen walks every field reachable from Config into the public CRD schema. That blocks adding operator-resolved sidecar fields (per-backend secret-identifier maps, resolved CA bundle paths, future BackendHeaderForward for MCPServerEntry references, etc.) without churning the CRD and triggering the v1beta1 stability gate.

This PR introduces pkg/vmcp/config.RuntimeConfig: a wrapper that embeds Config inline and is the designated home for operator-resolved fields. Today the wrapper adds nothing — marshalled YAML is byte-identical, parsed YAML is identical, and task operator-manifests produces zero CRD diff. Future PRs add sidecar fields onto RuntimeConfig without touching the public Config or v1beta1.
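
The byte-identity claim can be illustrated with a minimal sketch. It assumes a two-field stand-in for Config (the field set and the tag names are illustrative, not the real type) and uses encoding/json, where an anonymous embedded struct has its fields promoted to the top level. The PR's YAML path relies on yaml.v3's `,inline` for the same effect, which is not shown here.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Config stands in for pkg/vmcp/config.Config. The fields and tags are
// illustrative assumptions, not the real definition.
type Config struct {
	Name  string `json:"name"`
	Group string `json:"group"`
}

// RuntimeConfig embeds Config anonymously, so encoding/json promotes the
// embedded fields to the top level. It declares no sidecar fields yet.
type RuntimeConfig struct {
	Config
}

// marshalsIdentically reports whether the wrapper and the bare Config
// serialize to the same bytes.
func marshalsIdentically(c Config) (bool, error) {
	plain, err := json.Marshal(c)
	if err != nil {
		return false, err
	}
	wrapped, err := json.Marshal(RuntimeConfig{Config: c})
	if err != nil {
		return false, err
	}
	return bytes.Equal(plain, wrapped), nil
}

func main() {
	same, err := marshalsIdentically(Config{Name: "demo", Group: "default"})
	if err != nil {
		panic(err)
	}
	fmt.Println("identical:", same)
}
```

Once the wrapper gains a field of its own the two outputs diverge, which is the point at which the operator-only keys start to appear in the ConfigMap without touching Config.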

This is a foundational refactor for #4996 / #5013 (forward MCPServerEntry.headerForward to vMCP outbound requests). The current #5013 implementation adds 96 lines of CRD YAML and 86 lines of crd-api.md because HeaderForward was placed on StaticBackendConfig (a Config-reachable type). Once this PR lands, #5013 can be reworked to put the field on RuntimeConfig instead — net CRD diff drops to zero.

Wiring
  • Loader: YAMLLoader.Load now returns *RuntimeConfig. Callers that only need user-facing Config fields read them through the embed (rc.Name, rc.Group, etc.); callers that consume sidecars read them off the wrapper directly. Strict KnownFields(true) validation is preserved.
  • Operator write: cmd/thv-operator/controllers/virtualmcpserver_vmcpconfig.go wraps *Config in RuntimeConfig{} before YAML marshal. Single write path, single read path.
  • CLI boundary: loadAndValidateConfig in pkg/vmcp/cli/serve.go and validate.go unwraps to *Config to keep the existing serve pipeline tight. A comment marks where to thread the wrapper through when a sidecar consumer arrives.
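
The wiring above can be sketched roughly as follows. This is an assumption-laden illustration, not the PR's code: `load` and `validate` are hypothetical helpers, Config is a two-field stand-in, and encoding/json's `DisallowUnknownFields` plays the role that yaml.v3's `KnownFields(true)` plays in the real loader.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// Config is an illustrative stand-in for the user-facing config type.
type Config struct {
	Name  string `json:"name"`
	Group string `json:"group"`
}

// RuntimeConfig wraps Config; sidecar fields would be declared here.
type RuntimeConfig struct {
	Config
}

// load (hypothetical) decodes strictly into the wrapper, so unknown keys
// are rejected just as they would be for the bare Config.
func load(raw string) (*RuntimeConfig, error) {
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields()
	var rc RuntimeConfig
	if err := dec.Decode(&rc); err != nil {
		return nil, err
	}
	return &rc, nil
}

// validate (hypothetical) accepts only the user-facing type, mirroring a
// Validate(*Config) style signature.
func validate(c *Config) error {
	if c.Name == "" {
		return errors.New("name is required")
	}
	return nil
}

func main() {
	rc, err := load(`{"name":"demo","group":"default"}`)
	if err != nil {
		panic(err)
	}
	// Field-access callers read through the embed unchanged.
	fmt.Println(rc.Name, rc.Group)
	// Callers that need *Config unwrap explicitly.
	if err := validate(&rc.Config); err != nil {
		panic(err)
	}
}
```
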
Tests pinning the seam
  • pkg/vmcp/config/runtime_config_test.go:
    • TestRuntimeConfig_MarshalsIdenticallyToConfig — byte-identity vs Config today.
    • TestRuntimeConfig_Load_RoundTrip — Load through the operator's write shape.
    • TestRuntimeConfig_DisjointTopLevelTags — reflect-based check that catches a future field on RuntimeConfig sharing a JSON or YAML key with any Config field. encoding/json (anonymous-field promotion) and yaml.v3 (,inline) handle key collisions differently, so a collision would silently produce divergent serialization. This is forward-looking — today the wrapper has no extras and the test is trivially green.
  • cmd/thv-operator/pkg/spectoconfig/runtime_config_drift_test.go:
    • Asserts RuntimeConfig is a strict superset of Config.
    • Any extras must appear in runtimeOnlyLeafJustifications with a written rationale (today empty).
    • Catches stale entries and contradicting classifications.
    • Lives operator-side because the drift harness in cmd/thv-operator/internal/testutil/ is operator-internal.
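
A reflect-based disjoint-tag check of the kind described above might look like the sketch below. The types `cleanWrapper` and `collidingWrapper`, their fields, and the `collisions` helper are all hypothetical; the sketch only inspects top-level fields with explicit tag names and does not reproduce the actual test in runtime_config_test.go.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Config is an illustrative stand-in for the embedded user-facing type.
type Config struct {
	Name  string `json:"name" yaml:"name"`
	Group string `json:"group" yaml:"group"`
}

// cleanWrapper adds a hypothetical sidecar field with a fresh key.
type cleanWrapper struct {
	Config
	SecretIDs map[string]string `json:"secretIds" yaml:"secretIds"`
}

// collidingWrapper deliberately reuses the key "name" to show a failure.
type collidingWrapper struct {
	Config
	Extra string `json:"name" yaml:"name"`
}

// tagKey returns the explicit key for a field under tagName ("json" or
// "yaml"), or "" when the field is skipped or has no explicit name.
func tagKey(f reflect.StructField, tagName string) string {
	name, _, _ := strings.Cut(f.Tag.Get(tagName), ",")
	if name == "-" {
		return ""
	}
	return name
}

// collisions lists keys that the wrapper's own fields share with the
// top-level fields of the embedded struct.
func collisions(wrapper, embedded any, tagName string) []string {
	et := reflect.TypeOf(embedded)
	taken := map[string]bool{}
	for i := 0; i < et.NumField(); i++ {
		if k := tagKey(et.Field(i), tagName); k != "" {
			taken[k] = true
		}
	}
	wt := reflect.TypeOf(wrapper)
	var out []string
	for i := 0; i < wt.NumField(); i++ {
		f := wt.Field(i)
		if f.Anonymous {
			continue // the embed itself is not an extra field
		}
		if k := tagKey(f, tagName); taken[k] {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	for _, tag := range []string{"json", "yaml"} {
		fmt.Println(tag, "clean:", collisions(cleanWrapper{}, Config{}, tag))
		fmt.Println(tag, "colliding:", collisions(collidingWrapper{}, Config{}, tag))
	}
}
```

Checking both tag names matters because a field could collide under one encoding and not the other.
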
Acceptance gate

git diff main -- deploy/charts/operator-crds/ docs/operator/crd-api.md is empty after running both:

task operator-manifests
task operator-generate

The wrapper is invisible to controller-gen because RuntimeConfig is not field-referenced from any v1beta1 type. The doc on RuntimeConfig calls out the only way to break that invariant (retyping VirtualMCPServerSpec.Config from config.Config to config.RuntimeConfig — don't).

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change
  • Refactoring
  • Documentation
  • Other

Test plan

  • task build passes (verified via go build ./...)
  • task test passes for all touched packages (pkg/vmcp/..., cmd/thv-operator/pkg/spectoconfig/...)
  • New unit tests:
    • TestRuntimeConfig_MarshalsIdenticallyToConfig
    • TestRuntimeConfig_Load_RoundTrip
    • TestRuntimeConfig_DisjointTopLevelTags
    • TestRuntimeConfigSeam (operator-side drift test)
  • task operator-manifests and task operator-generate produce zero diff under deploy/charts/operator-crds/ and docs/operator/crd-api.md

Special notes for reviewers

  • Pure refactor, no behaviour change. The wrapper adds zero serialized keys today; YAML output and parse semantics are identical to Config directly. The point is to establish the seam now so the next operator-resolved field doesn't have to fight v1beta1 stability.
  • Load() signature change. YAMLLoader.Load() now returns *RuntimeConfig instead of *Config. Most existing callers are field-access only (works transparently through embed). Two callers needed &cfg.Config for Validator.Validate(*Config). The CLI's loadAndValidateConfig unwraps to *Config at its return boundary to keep the serve pipeline tight; a comment documents the migration path when a sidecar consumer is added.
  • Drift test placement. The test lives in cmd/thv-operator/pkg/spectoconfig/runtime_config_drift_test.go because the drift harness lives in cmd/thv-operator/internal/testutil/ and is operator-internal. The types it tests live in pkg/vmcp/config/. This layering is acceptable; moving the harness up to a shared location is out of scope for this PR.

Implementation plan

Approved plan (AI-assisted)

Three agents reviewed the design before commit:

  • toolhive-expert verified caller safety: 3 production + 8 test callers of Load() covered, single ConfigMap write path, no checksum fixtures pinned to current YAML.
  • kubernetes-go-expert verified the four key claims (CRD invisibility, deepcopy-skip, ,inline semantics, strict-decode behaviour). One refinement landed in the type doc — the only way to leak RuntimeConfig fields into the CRD is retyping VirtualMCPServerSpec.Config, now explicitly called out.
  • go-architect caught two real issues that were addressed before commit: (1) collapsed Load/LoadRuntime into one Load() returning *RuntimeConfig (interface pollution; lossy view would have become a latent bug); (2) added the disjoint-tag reflect test to catch the top forward-looking hazard.

🤖 Generated with Claude Code

@github-actions github-actions Bot added size/M Medium PR: 300-599 lines changed and removed size/M Medium PR: 300-599 lines changed labels May 9, 2026
@codecov

codecov Bot commented May 9, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 67.96%. Comparing base (6733a54) to head (471ae22).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5238      +/-   ##
==========================================
- Coverage   67.97%   67.96%   -0.01%     
==========================================
  Files         612      612              
  Lines       62723    62724       +1     
==========================================
  Hits        42633    42633              
+ Misses      16908    16900       -8     
- Partials     3182     3191       +9     

@github-actions github-actions Bot added size/M Medium PR: 300-599 lines changed and removed size/M Medium PR: 300-599 lines changed labels May 9, 2026
@ChrisJBurns ChrisJBurns marked this pull request as draft May 9, 2026 22:14
@ChrisJBurns ChrisJBurns changed the title Introduce RuntimeConfig wrapper for vMCP ConfigMap surface DON'T REVIEW YET: Introduce RuntimeConfig wrapper for vMCP ConfigMap surface May 10, 2026
@github-actions github-actions Bot added size/M Medium PR: 300-599 lines changed and removed size/M Medium PR: 300-599 lines changed labels May 10, 2026
jhrozek
jhrozek previously approved these changes May 11, 2026

@jhrozek jhrozek left a comment

Contributor

Two small nits on comments — both easy fixes before merge. The overall design is solid and the test coverage for the seam invariants is thorough.

Comment thread pkg/vmcp/config/yaml_loader.go Outdated
Comment on lines +36 to +41
// Note: when the operator writes the vMCP ConfigMap it wraps Config in
// runtime.RuntimeConfig (see pkg/vmcp/config/runtime). Today the wrapper
// adds no extra keys, so parsing into Config directly succeeds. When a
// future operator-resolved sidecar field lands on RuntimeConfig, callers
// that need it should use runtime.Load instead, which parses into the
// wrapper.

Contributor

Nit: two stale references from before the RuntimeConfig → Config rename — runtime.RuntimeConfig on line 37 should be runtime.Config, and runtime.Load on line 40 doesn't exist anywhere in the codebase.

Suggested change
- // Note: when the operator writes the vMCP ConfigMap it wraps Config in
- // runtime.RuntimeConfig (see pkg/vmcp/config/runtime). Today the wrapper
- // adds no extra keys, so parsing into Config directly succeeds. When a
- // future operator-resolved sidecar field lands on RuntimeConfig, callers
- // that need it should use runtime.Load instead, which parses into the
- // wrapper.
+ // Note: when the operator writes the vMCP ConfigMap it wraps Config in
+ // runtime.Config (see pkg/vmcp/config/runtime). Today the wrapper adds no
+ // extra keys, so parsing into Config directly succeeds. When a future
+ // operator-resolved sidecar field lands on runtime.Config, a new
+ // runtime.Load function will need to be created that parses into runtime.Config
+ // instead of vmcpconfig.Config.

// `,inline`) handle key collisions differently — yaml.v3 errors or has
// a different precedence than encoding/json's outer-wins rule. The
// disjoint-tag test in runtime_config_test.go pins this.
type Config struct {

Contributor

Nit / question: when the first operator-only field lands here, YAMLLoader.Load() uses KnownFields(true) and decodes into vmcpconfig.Config, so it will reject the new YAML key and the vMCP binary will fail to start. The fix at that point is to create a runtime.Load() function and update pkg/vmcp/cli/serve.go and pkg/vmcp/cli/validate.go to use it. Is there a plan to track this? A TODO comment on the struct would make the coupling visible to whoever adds the first field here.
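
The failure mode raised in this comment can be reproduced in miniature. The sketch below is an assumption, not the project's loader: it uses encoding/json's `DisallowUnknownFields` as a stand-in for yaml.v3's `KnownFields(true)`, and `ResolvedCABundlePath` is a hypothetical operator-only field invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Config is an illustrative stand-in for the user-facing config type.
type Config struct {
	Name string `json:"name"`
}

// RuntimeConfig carries one hypothetical operator-only field.
type RuntimeConfig struct {
	Config
	ResolvedCABundlePath string `json:"resolvedCABundlePath,omitempty"`
}

// strictDecode fails on any key the target type does not declare.
func strictDecode(raw string, into any) error {
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields()
	return dec.Decode(into)
}

func main() {
	// What the operator would write once the first sidecar field exists.
	raw := `{"name":"demo","resolvedCABundlePath":"/etc/ca.pem"}`

	// A strict decode into the bare Config rejects the new key.
	var plain Config
	fmt.Println("into Config:", strictDecode(raw, &plain))

	// A strict decode into the wrapper accepts it.
	var wrapped RuntimeConfig
	fmt.Println("into RuntimeConfig:", strictDecode(raw, &wrapped))
}
```

This is why the loader's decode target has to change in the same PR that adds the first operator-only field.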

VirtualMCPServerSpec.Config was typed as pkg/vmcp/config.Config, so
controller-gen walked the entire internal config tree into the public
CRD schema. Any change to the internal on-disk/runtime config model
(field, tag, validation) therefore leaked into the v1beta1 CRD.

Introduce an operator-owned mirror, cmd/thv-operator/pkg/vmcpcrd, a
field-for-field duplicate of pkg/vmcp/config (incl. Duration and the
composite-tool validation). Retype VirtualMCPServerSpec.Config and the
VirtualMCPCompositeToolDefinition embed onto the mirror, and convert
mirror -> config.Config in the operator's converter via a JSON transcode
(crdToRuntime), keeping the existing Kubernetes-resolution overrides.

The no-leak guarantee is now structural: nothing reachable from the CRD
types references pkg/vmcp/config, so internal config changes cannot
reach the CRD schema. Generated CRD manifests are byte-identical.
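
A JSON transcode between a mirror type and the internal type can be sketched as below. `MirrorConfig`, `InternalConfig` and `transcode` are hypothetical names with illustrative fields; the real crdToRuntime converter also applies Kubernetes-resolution overrides, which this sketch omits.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MirrorConfig stands in for the operator-owned CRD mirror type.
type MirrorConfig struct {
	Name  string `json:"name"`
	Group string `json:"group"`
}

// InternalConfig stands in for the internal runtime config type. It shares
// JSON tags with the mirror but is a distinct Go type.
type InternalConfig struct {
	Name  string `json:"name"`
	Group string `json:"group"`
}

// transcode converts mirror -> internal by marshalling one type and
// unmarshalling the bytes into the other. Only the JSON tags have to agree.
func transcode(in MirrorConfig) (InternalConfig, error) {
	var out InternalConfig
	raw, err := json.Marshal(in)
	if err != nil {
		return out, err
	}
	err = json.Unmarshal(raw, &out)
	return out, err
}

func main() {
	out, err := transcode(MirrorConfig{Name: "demo", Group: "default"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", out)
}
```

Because the conversion depends on matching tags rather than matching Go types, a parity test between the two types is what keeps a silent field drop from going unnoticed.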

Tests:
- structural parity (config <-> mirror JSON leaf-set equality)
- round-trip transcode fuzz (randfill)
- categorical no-leak boundary (no CRD field type in pkg/vmcp/config)
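
The structural parity idea (JSON leaf-set equality) could be approximated as in the sketch below. All type names are hypothetical, and the walk is simplified: it follows only directly nested structs with explicit json tags and ignores pointers, slices, maps and untagged embeds, which a real harness would need to handle.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
	"strings"
)

type internalOp struct {
	Timeout string `json:"timeout"`
}

// Internal stands in for the internal config type.
type Internal struct {
	Name        string     `json:"name"`
	Operational internalOp `json:"operational"`
}

type mirrorOp struct {
	Timeout string `json:"timeout"`
}

// Mirror stands in for the CRD mirror; it matches Internal leaf for leaf.
type Mirror struct {
	Name        string   `json:"name"`
	Operational mirrorOp `json:"operational"`
}

// Drifted is a mirror that has fallen behind: operational.timeout is missing.
type Drifted struct {
	Name string `json:"name"`
}

// leaves records the dotted JSON path of every non-struct field under t.
func leaves(t reflect.Type, prefix string, out map[string]bool) {
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		key, _, _ := strings.Cut(f.Tag.Get("json"), ",")
		if key == "" || key == "-" {
			continue
		}
		if f.Type.Kind() == reflect.Struct {
			leaves(f.Type, prefix+key+".", out)
			continue
		}
		out[prefix+key] = true
	}
}

// leafSet returns the sorted JSON leaf paths of v's type.
func leafSet(v any) []string {
	set := map[string]bool{}
	leaves(reflect.TypeOf(v), "", set)
	paths := make([]string, 0, len(set))
	for p := range set {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	return paths
}

// sameLeaves reports whether two types expose the same JSON leaf paths.
func sameLeaves(a, b any) bool {
	return reflect.DeepEqual(leafSet(a), leafSet(b))
}

func main() {
	fmt.Println("mirror in sync:", sameLeaves(Internal{}, Mirror{}))
	fmt.Println("drifted in sync:", sameLeaves(Internal{}, Drifted{}))
}
```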

Scope note: external shared types embedded in config (telemetry, audit,
ratelimit, auth strategy) are not yet mirrored; that and the crd-ref-docs
rendering of the mirror are tracked follow-ups.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ChrisJBurns ChrisJBurns force-pushed the cburns/runtime-config-wrapper branch from 471ae22 to 17ef19c Compare June 22, 2026 23:58
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/M Medium PR: 300-599 lines changed labels Jun 23, 2026

@github-actions github-actions Bot left a comment

Contributor

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

Labels

size/XL Extra large PR: 1000+ lines changed

2 participants