fix(gpt-oss): update _patch_bridge_expert_cache_to_cpu to match Megatron-Bridge API #2113

Open
aoshen02 wants to merge 1 commit into THUDM:main from aoshen02:fix/gpt-oss-bridge-patch-signature


Summary

radixark/Megatron-Bridge@bridge added a fourth parameter hf_state_dict to ModelBridge.maybe_modify_converted_hf_weight. The monkey-patch in _patch_bridge_expert_cache_to_cpu still uses the old 3-arg signature, causing a crash during GPT-OSS colocated weight sync:

TypeError: _patch_bridge_expert_cache_to_cpu.<locals>._patched() takes 3 positional arguments but 4 were given

The fix makes _patched accept hf_state_dict (defaulting to None in the patch) and forward it to the original method.

Root cause

Megatron-Bridge updated ModelBridge.maybe_modify_converted_hf_weight to:

def maybe_modify_converted_hf_weight(
    self,
    task: WeightConversionTask,
    converted_weights_dict: Dict[str, torch.Tensor],
    hf_state_dict: Mapping[str, torch.Tensor],   # ← new param
) -> Dict[str, torch.Tensor]:

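The _patched wrapper accepted only three positional arguments (self, task, converted_weights_dict), so the bridge's new four-argument call raised the TypeError shown in the summary.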
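For reviewers who want to see the failure in isolation, here is a minimal reproduction sketch. The `ModelBridge` class below is a hypothetical stand-in, not the real Megatron-Bridge class; only the parameter list mirrors the signature quoted above.

```python
class ModelBridge:
    """Hypothetical stand-in for the real Megatron-Bridge class (assumption)."""

    def maybe_modify_converted_hf_weight(self, task, converted_weights_dict, hf_state_dict):
        # The real method may transform the weights; this stub returns them unchanged.
        return converted_weights_dict


_orig = ModelBridge.maybe_modify_converted_hf_weight


def _patched(self, task, converted_weights_dict):
    # Old-style wrapper: it has no slot for hf_state_dict.
    return _orig(self, task, converted_weights_dict)


ModelBridge.maybe_modify_converted_hf_weight = _patched

error_message = None
try:
    # A caller written against the updated API passes hf_state_dict as well.
    ModelBridge().maybe_modify_converted_hf_weight("task", {}, {})
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The call fails before `_orig` is ever reached, because the wrapper itself rejects the extra positional argument.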

Fix

# before
def _patched(self, task, converted_weights_dict):
    cpu_dict = {k: v.cpu() for k, v in converted_weights_dict.items()}
    result = _orig(self, task, cpu_dict)

# after
def _patched(self, task, converted_weights_dict, hf_state_dict=None):
    cpu_dict = {k: v.cpu() for k, v in converted_weights_dict.items()}
    result = _orig(self, task, cpu_dict, hf_state_dict)
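One thing to keep in mind: with the `hf_state_dict=None` default, the wrapper always forwards a fourth positional argument, so it would fail against an older Megatron-Bridge whose method still takes three. If both versions need to be supported, a possible alternative (not what this PR does) is to forward any extra arguments untouched. The sketch below uses hypothetical stand-ins (`FakeTensor`, `ModelBridge`, `patch_expert_cache_to_cpu`), and it simply returns the result, since the excerpt above does not show what the real patch does with `result`.

```python
class FakeTensor:
    """Stand-in for torch.Tensor (assumption); only .cpu() is modeled."""

    def __init__(self, device):
        self.device = device

    def cpu(self):
        return FakeTensor("cpu")


class ModelBridge:
    """Hypothetical stand-in for the Megatron-Bridge class with the new signature."""

    def maybe_modify_converted_hf_weight(self, task, converted_weights_dict, hf_state_dict):
        self.seen_hf_state_dict = hf_state_dict  # recorded only for the demo
        return converted_weights_dict


def patch_expert_cache_to_cpu(bridge_cls):
    """Wrap the method so weights are moved to CPU before the original runs."""
    _orig = bridge_cls.maybe_modify_converted_hf_weight

    def _patched(self, task, converted_weights_dict, *args, **kwargs):
        cpu_dict = {k: v.cpu() for k, v in converted_weights_dict.items()}
        # Forward whatever extra arguments the caller supplied (none for the
        # old 3-arg API, hf_state_dict for the new one).
        return _orig(self, task, cpu_dict, *args, **kwargs)

    bridge_cls.maybe_modify_converted_hf_weight = _patched


patch_expert_cache_to_cpu(ModelBridge)
bridge = ModelBridge()
hf_state = {"w": FakeTensor("cpu")}
result = bridge.maybe_modify_converted_hf_weight(
    "task", {"w": FakeTensor("cuda:0")}, hf_state
)
print(result["w"].device)
```

The trade-off is that `*args, **kwargs` hides the expected signature from readers, whereas the explicit `hf_state_dict=None` in this PR documents it.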

🤖 Generated with Claude Code

…ron-Bridge API

radixark/Megatron-Bridge@bridge added a fourth parameter `hf_state_dict` to
`ModelBridge.maybe_modify_converted_hf_weight`. The monkey-patch in
`_patch_bridge_expert_cache_to_cpu` still used the old 3-arg signature, causing
a TypeError when running GPT-OSS weight sync:

  TypeError: _patch_bridge_expert_cache_to_cpu.<locals>._patched()
             takes 3 positional arguments but 4 were given

Fix: accept and forward the optional `hf_state_dict` argument.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>