scripts: complete slime-exact port of all scripts + gpt-oss 20B support #260
aoshen02 wants to merge 1 commit into
Code Review
This pull request adds and updates several training and rollout shell scripts for various models, including Qwen, Kimi-K2, DeepSeek-R1, and GLM, to support low-precision training (INT4 and FP8) and integrate vLLM. The review feedback highlights several critical issues, including a missing trailing backslash in run-kimi-k2-Instruct.sh that breaks the Ray job submission, incorrect relative source paths for model configurations across multiple scripts, leftover paths and package names from the 'slime' repository, a typo in the Python buffering environment variable, and a leading blank line before the shebang in run-mimo-7B-rl-eagle.sh.
    --actor-num-nodes 32 \
    --actor-num-gpus-per-node 8 \
    --colocate \
    --update-weight-buffer-size $(( 4 * 512 * 1024 * 1024))
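The review summary mentions a missing trailing backslash that breaks the Ray job submission. A minimal sketch of a lint for that class of bug follows; it is an illustration and not part of the PR. The function name `find_broken_continuations` and the demo file contents are hypothetical, and the check is only a heuristic: it reports an option line that sits inside a backslash-continued command, has no trailing backslash of its own, and is directly followed by another argument line.

```shell
#!/bin/bash
# Heuristic lint (hypothetical helper): print the line number of every
# "--option" line that follows a backslash-continued line, lacks its own
# trailing backslash, and is directly followed by another argument line.
find_broken_continuations() {
    awk '
        {
            # An argument line starts with "--" or "${" after optional indentation.
            is_arg = ($0 ~ /^[ \t]*(--|[$][{])/)
            # The previous line was held as suspicious and more arguments follow.
            if (held_nr && is_arg) print held_nr
            held_nr = 0
            ends_bs = ($0 ~ /\\$/)
            # Inside a continuation chain, but this option line ends the command.
            if (prev_bs && !ends_bs && $0 ~ /^[ \t]*--/) held_nr = NR
            prev_bs = ends_bs
        }
    ' "$1"
}

# Made-up demo input: line 4 is missing its trailing backslash.
demo="$(mktemp)"
cat > "${demo}" <<'EOF'
ray job submit \
   --actor-num-nodes 32 \
   --colocate \
   --update-weight-buffer-size 1024
   ${MODEL_ARGS[@]} \
   --rollout-batch-size 8
EOF
find_broken_continuations "${demo}"
rm -f "${demo}"
```

Argument lists written as bash arrays need no backslashes, so the sketch only looks at lines that follow a backslash-continued line.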
    # --global-batch-size 256

    --over-sampling-batch-size 256
    --dynamic-sampling-filter-path slime.rollout.filter_hub.dynamic_sampling_filters.check_reward_nonzero_std
The package has been renamed/translated from slime to vime (as seen in the codebase structure, e.g., vime/rollout/vllm_rollout.py). Using slime.rollout... will result in a ModuleNotFoundError. Please update this path to use vime instead of slime.
Suggested change:
    - --dynamic-sampling-filter-path slime.rollout.filter_hub.dynamic_sampling_filters.check_reward_nonzero_std
    + --dynamic-sampling-filter-path vime.rollout.filter_hub.dynamic_sampling_filters.check_reward_nonzero_std

    ray job submit --address="http://127.0.0.1:8265" \
    --runtime-env-json="${RUNTIME_ENV_JSON}" \
    -- python3 /personal/slime/slime/train.py \
The script is executing /personal/slime/slime/train.py which is a leftover path from the slime repository. It should be updated to train.py to run the vime training script in the current workspace, consistent with the other run scripts.
Suggested change:
    - -- python3 /personal/slime/slime/train.py \
    + -- python3 train.py \
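Both leftover issues above (the `slime.` module prefix and the `/personal/slime/slime/` path) can be caught with a simple search before a script is run. The following is a rough sketch; the helper name `find_slime_leftovers` and the demo file are hypothetical, and the pattern deliberately matches only `slime.` and `slime/`, so other spellings would need their own check.

```shell
#!/bin/bash
# Print every line that still references slime as a module prefix ("slime.")
# or as a path component ("slime/"). Like grep, it returns non-zero when
# nothing is found.
find_slime_leftovers() {
    grep -nHE 'slime[./]' "$@"
}

# Made-up demo file containing the two leftovers quoted in the review.
demo_dir="$(mktemp -d)"
cat > "${demo_dir}/run-demo.sh" <<'EOF'
--dynamic-sampling-filter-path slime.rollout.filter_hub.dynamic_sampling_filters.check_reward_nonzero_std
-- python3 /personal/slime/slime/train.py \
-- python3 train.py \
EOF
find_slime_leftovers "${demo_dir}"/*.sh
rm -rf "${demo_dir}"
```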
    echo "HAS_NVLINK: $HAS_NVLINK (detected $NVLINK_COUNT NVLink references)"

    SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)"
    source "${SCRIPT_DIR}/../scripts/models/qwen3-30B-A3B.sh"
The source path ../scripts/models/qwen3-30B-A3B.sh is incorrect. Since this script is located in scripts/low_precision/, .. resolves to scripts/, making the path scripts/scripts/models/... which does not exist. It should be ../models/qwen3-30B-A3B.sh.
Suggested change:
    - source "${SCRIPT_DIR}/../scripts/models/qwen3-30B-A3B.sh"
    + source "${SCRIPT_DIR}/../models/qwen3-30B-A3B.sh"
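The path arithmetic in this comment can be reproduced in a throwaway directory tree. The sketch below is an assumption-laden illustration: `source_model_config` is a hypothetical wrapper (the scripts in the PR call `source` directly), and the config file contents are made up. It shows that, from `scripts/low_precision/`, `../models/...` resolves while `../scripts/models/...` does not.

```shell
#!/bin/bash
# Hypothetical wrapper: source a model config given relative to a script
# directory, failing loudly when the resolved file does not exist.
source_model_config() {
    local script_dir="$1" rel_path="$2"
    local target="${script_dir}/${rel_path}"
    if [[ ! -f "${target}" ]]; then
        echo "model config not found: ${target}" >&2
        return 1
    fi
    source "${target}"
}

# Throwaway tree mimicking scripts/models and scripts/low_precision.
root="$(mktemp -d)"
mkdir -p "${root}/scripts/models" "${root}/scripts/low_precision"
echo 'MODEL_NAME="demo"' > "${root}/scripts/models/qwen3-4B.sh"
script_dir="${root}/scripts/low_precision"

# Correct: ".." from scripts/low_precision is scripts/.
source_model_config "${script_dir}" "../models/qwen3-4B.sh" && echo "ok: ${MODEL_NAME}"
# Incorrect: resolves to scripts/scripts/models/, which does not exist.
source_model_config "${script_dir}" "../scripts/models/qwen3-4B.sh" || echo "rejected bad path"
rm -rf "${root}"
```

Checking for the file before sourcing turns a confusing downstream failure (unset model arguments) into an immediate, readable error.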
    echo "HAS_NVLINK: $HAS_NVLINK (detected $NVLINK_COUNT NVLink references)"

    SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)"
    source "${SCRIPT_DIR}/../scripts/models/qwen3-4B.sh"
The source path ../scripts/models/qwen3-4B.sh is incorrect. Since this script is located in scripts/low_precision/, .. resolves to scripts/, making the path scripts/scripts/models/... which does not exist. It should be ../models/qwen3-4B.sh.
Suggested change:
    - source "${SCRIPT_DIR}/../scripts/models/qwen3-4B.sh"
    + source "${SCRIPT_DIR}/../models/qwen3-4B.sh"

      SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)"
    - source "${SCRIPT_DIR}/models/qwen2.5-0.5B.sh"
    + source "${SCRIPT_DIR}/scripts/models/qwen2.5-0.5B.sh"
The source path was incorrectly changed to ${SCRIPT_DIR}/scripts/models/.... Since this script is located in scripts/, ${SCRIPT_DIR} is already scripts/, making the path scripts/scripts/models/... which does not exist. It should be reverted to ${SCRIPT_DIR}/models/qwen2.5-0.5B.sh.
Suggested change:
    - source "${SCRIPT_DIR}/scripts/models/qwen2.5-0.5B.sh"
    + source "${SCRIPT_DIR}/models/qwen2.5-0.5B.sh"
      set -ex

    - export PYTHONUNBUFFERED=1
    + export PYTHONBUFFERED=16
    (blank line)
    #!/bin/bash
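The review summary flags a leading blank line before the shebang in run-mimo-7B-rl-eagle.sh. When a script is executed directly, the kernel honors a shebang only if `#!` are the first two bytes of the file. A minimal sketch of a check follows; the helper name `has_leading_shebang` and the demo files are hypothetical.

```shell
#!/bin/bash
# Succeeds only when the first two bytes of the file are "#!".
has_leading_shebang() {
    [ "$(head -c 2 -- "$1")" = '#!' ]
}

good="$(mktemp)"
bad="$(mktemp)"
printf '#!/bin/bash\necho hi\n' > "${good}"
# Same script, but with a blank line in front of the shebang.
printf '\n#!/bin/bash\necho hi\n' > "${bad}"

has_leading_shebang "${good}" && echo "good: shebang is first"
has_leading_shebang "${bad}" || echo "bad: something precedes the shebang"
rm -f "${good}" "${bad}"
```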
      # 229B MoE, 256 experts -> requires many GPUs
      # Typical config: TP=2, PP=2, EP=4, training side 16 GPUs (2 nodes x 8 GPUs)
    - # Inference side: vLLM on separate GPUs, EP=16+
    + # Inference side: SGLang on separate GPUs, EP=16+
The comment was updated to refer to SGLang instead of vLLM. Since this PR is migrating the codebase from SGLang to vLLM, this comment is backwards and misleading. It should refer to vLLM.
Suggested change:
    - # Inference side: SGLang on separate GPUs, EP=16+
    + # Inference side: vLLM on separate GPUs, EP=16+
d5c572e to e5d6f3a
e5d6f3a to 2864b34
Translate all slime scripts to vime following SGLANG_TO_VLLM_TRANSLATION.md:
- sglang→vllm prefix swap for CLI flags and variables
- _slime→_vime for checkpoint paths
- EP: --sglang-ep-size N → --vllm-enable-expert-parallel (boolean)
- Speculative: multi-param → --vllm-speculative-config JSON (§5.2)
- Delete genuinely sglang-coupled params (DP-attention, DeepEP, NSA, etc.)
- flashinfer → FLASHINFER case fix (§2.4)

23 new scripts + 6 existing updated to match slime@cutoff.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
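Three of the translation rules in the commit message are mechanical enough to sketch with `sed`. This is an illustration only, not the procedure actually used for the PR: `translate_flags` is a hypothetical helper, `--sglang-example-flag` and the `/ckpt/demo_slime` path are made up, and the rules in SGLANG_TO_VLLM_TRANSLATION.md (such as the speculative-config merge) are not reproduced here.

```shell
#!/bin/bash
# Hypothetical helper covering three rules from the commit message:
#   1. --sglang-ep-size N  ->  --vllm-enable-expert-parallel (boolean flag)
#   2. --sglang-*          ->  --vllm-* (prefix swap)
#   3. _slime              ->  _vime (checkpoint paths)
# Rule 1 must run before rule 2, otherwise the EP flag is renamed first.
translate_flags() {
    sed -E \
        -e 's/--sglang-ep-size[ =][0-9]+/--vllm-enable-expert-parallel/g' \
        -e 's/--sglang-/--vllm-/g' \
        -e 's/_slime/_vime/g'
}

# Made-up input lines; only --sglang-ep-size appears in the source.
printf '%s\n' \
    '--sglang-ep-size 8 \' \
    '--sglang-example-flag 1 \' \
    '--save /ckpt/demo_slime' | translate_flags
```

A blind prefix swap is only a starting point, since the commit also deletes sglang-coupled parameters that have no vLLM counterpart.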
Summary
This PR consolidates four work streams:

1. slime-exact translation of run scripts (original scope)
- _slime→_vime checkpoint paths, EP boolean conversion, speculative config merge to JSON
- Follows SGLANG_TO_VLLM_TRANSLATION.md

2. GPT-OSS 20B support
Three fixes were required to run GPT-OSS 20B RLHF on the vLLM backend:
- hf_weight_iterator_bridge.py: match the Megatron-Bridge 0.5.0 API. maybe_modify_converted_hf_weight gained a 4th hf_state_dict parameter; the monkey-patch accepted only 3, causing a TypeError during weight sync. Same fix submitted upstream: fix(gpt-oss): update _patch_bridge_expert_cache_to_cpu to match Megatron-Bridge API THUDM/slime#2113.
- --hf-checkpoint fused BF16: vLLM _load_weights_other expects gate_up_proj [E, hidden, 2×ffn] (fused). The old per-expert split format causes a KeyError on bias loading. tools/convert_gpt_oss_to_fused.py converts without re-running the slow MXFP4 dequantization.
- --qkv-format bshd: GPT-OSS learnable softmax + qkv_format=thd disables all TE attention backends. bshd avoids this; replaced --use-dynamic-batch-size with --seq-length 10240.

3. Restore deleted examples and scripts (from PR #220)
- examples/coding_agent_rl/, examples/geo3k_vlm/, examples/multi_agent/, examples/train_infer_mismatch_helper/
- scripts/run-glm4.7-30B-A3B.sh, run-glm4.7-355B-A32B.sh, run-minimax-m2.sh, run-qwen3-30B-A3B.sh

4. Precise pkill pattern (all scripts)
Replace pkill -9 vllm with pkill -9 -f '[v]llm serve|VLL[M]::', which targets only vllm serve and Ray VLLM:: actors and avoids accidental kills in colocated mode.

Test plan
- run-gpt-oss-20B.sh: validate rollout starts and weight sync completes (step 1)
- bash -n syntax check

🤖 Generated with Claude Code
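The pkill pattern from item 4 can be sanity-checked without killing anything by running it through `grep -E` against sample command lines. This is a rough approximation of `pkill -f` matching (an unanchored extended regex over the full command line); the helper `would_match` and all sample command lines are made up for illustration.

```shell
#!/bin/bash
PKILL_PATTERN='[v]llm serve|VLL[M]::'

# Return success when a command line would be selected by the pattern.
would_match() {
    printf '%s\n' "$1" | grep -Eq -- "${PKILL_PATTERN}"
}

# Illustrative command lines; none are taken from a real process table.
for cmdline in \
    'vllm serve /models/demo --port 8000' \
    'VLLM::EngineCore' \
    'python3 train.py --vllm-enable-expert-parallel' \
    "bash -c pkill -9 -f '${PKILL_PATTERN}'"; do
    if would_match "${cmdline}"; then
        echo "match:    ${cmdline}"
    else
        echo "no match: ${cmdline}"
    fi
done
```

The bracket expressions `[v]` and `[M]` match the same characters as plain `v` and `M`, but the pattern's own text never contains the literal strings `vllm serve` or `VLLM::`, so a shell whose command line embeds the pattern is not selected. A trainer process that merely carries `--vllm-...` flags is likewise left alone.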