ScaleML-RLHF/Qwen2.5-Math-1.5B-raft-plusplus-numina_math_em-sample1n8-sample4-iter1-step_9 2B • Updated Mar 22
ScaleML-RLHF/Qwen2.5-Math-1.5B-raft-plusplus-numina_math_em-sample1n8-sample4-iter1-step_5 2B • Updated Mar 22
ScaleML-RLHF/verl-math-new-Qwen2.5-1.5B-Instruct-raft-vanilla-numina_math_flat_em_stage1n64-sample64-iter1 Updated Mar 21
ScaleML-RLHF/verl-math-new-Qwen2.5-0.5B-Instruct-raft-vanilla-numina_math_flat_em_stage1n64-sample64-iter1 Updated Mar 21