perf(ppo): gather response/loss-mask rows before log-prob+entropy CE (supersedes #2011) #2076

Open
Mantissagithub wants to merge 3 commits into THUDM:main from Mantissagithub:perf/logprob-response-only-gather

Commits

Commits on Jun 14, 2026

Commits on Jun 15, 2026

Commits on Jun 18, 2026