Pull requests: THUDM/slime
fix(rm_hub): guard deepscaler reward against a missing response
#2115
opened Jun 21, 2026 by
vjsai
fix(ppo): stop corrupting the logged rollout/kl metric
#2114
opened Jun 21, 2026 by
EazyReal
Contributor
fix(gpt-oss): update _patch_bridge_expert_cache_to_cpu to match Megatron-Bridge API
#2113
opened Jun 21, 2026 by
aoshen02
Contributor
Fix(rollout): Fail closed on unknown SGLang model names
#2112
opened Jun 21, 2026 by
Baiyu-Su
Contributor
fix: support eval-only mode (--num-rollout 0)
#2109
opened Jun 20, 2026 by
EazyReal
Contributor
feat(examples/strands_sglang): update to strands-sglang 0.4.2
#2106
opened Jun 20, 2026 by
Lawhy
Contributor
feat(tracking): add MLflow tracking support alongside W&B
#2099
opened Jun 17, 2026 by
rrranlyu
fix(scripts): correct model config source path in FP8 low_precision scripts
#2094
opened Jun 17, 2026 by
aoshen02
Contributor
2 tasks done
fix(fully-async): respect partial_rollout=False when requeuing ABORTED groups
#2092
opened Jun 16, 2026 by
Kagura-0001
Add --loss-aggregation for the four ScaleRL pg_loss aggregation modes
#2090
opened Jun 16, 2026 by
EazyReal
Contributor
fix(opd): score teacher logprobs at rollout temperature, not 0
#2085
opened Jun 15, 2026 by
EazyReal
Contributor
feat(rl): composable current-policy importance-sampling correction (TIS hook)
#2084
opened Jun 15, 2026 by
EazyReal
Contributor
feat(rl): add REINFORCE advantage estimator
#2083
opened Jun 15, 2026 by
EazyReal
Contributor
fix(rollout): isolate per-trajectory exceptions in generate_and_rm_group
#2078
opened Jun 15, 2026 by
aoshen02
Contributor
fix(script): correct GLM-4.7 expert_model_parallel_size for single-node 8 GPU
#2077
opened Jun 15, 2026 by
aoshen02
Contributor
1 task
perf(ppo): gather response/loss-mask rows before log-prob+entropy CE (supersedes #2011)
#2076
opened Jun 14, 2026 by
Mantissagithub
Support Qwen3.5-VL (dense + MoE) via Megatron-Bridge
#2075
opened Jun 14, 2026 by
demouo
Contributor
feat(rollouts) external rollouts endpoint with publish-only weight sync
#2071
opened Jun 12, 2026 by
jvmncs
4 tasks done
fix(agent): reuse a pooled SGLang client across turns and retry once on pre-connect connector errors
#2069
opened Jun 12, 2026 by
EazyReal
Contributor
fix(sglang): authenticate engine control-plane and router calls
#2068
opened Jun 12, 2026 by
EazyReal
Contributor
[megatron] don't re-assert no_sync_func every step with overlap_grad_reduce
#2066
opened Jun 12, 2026 by
HaozheZhang6
fix(dp_schedule): drop trailing rollouts when the aligned micro-batch target exceeds the sample count
#2065
opened Jun 12, 2026 by
EazyReal
Contributor