yizhilll committed on
Commit 18fbbcc · verified · 1 Parent(s): b5a5def

Create README.md

Files changed (1)
  1. README.md +19 -0
README.md ADDED
@@ -0,0 +1,19 @@
+ ---
+ datasets:
+ - m-a-p/TreePO_data
+ base_model:
+ - Qwen/Qwen2.5-7B
+ ---
+
+
+ We release the resources for the paper [TreePO](https://arxiv.org/abs/2508.17445):
+ - Checkpoint with average weighted subgroup advantages + more diverse initial divergence ([the final one](https://huggingface.co/m-a-p/TreePO-Qwen2.5-7B)). ← You are here.
+ - Checkpoint with average weighted subgroup advantages + [fixed divergence](https://huggingface.co/m-a-p/TreePO-Qwen2.5-7B_fixed-div).
+ - The [training dataset](https://huggingface.co/datasets/m-a-p/TreePO_data), consisting of DeepScaleR and SimpleRL math reasoning data.
+
+
+ More links:
+ - [Hugging Face Paper](https://huggingface.co/papers/2508.17445)
+ - [Project Page](https://m-a-p.ai/TreePO)
+ - [X/Twitter Thread](https://x.com/yizhilll/status/1960616873180954854)
+ - [GitHub Repo](https://github.com/multimodal-art-projection/TreePO)
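
The README links a checkpoint and a dataset but gives no loading instructions. The following is a minimal sketch of how they might be loaded, assuming the checkpoint follows the standard causal-LM layout of its `Qwen/Qwen2.5-7B` base and that the dataset loads with its default configuration; neither assumption is confirmed by the README. The helper names `load_checkpoint` and `load_training_data` are hypothetical and exist only for illustration.

```python
# Repository IDs copied from the links in the README.
MODEL_ID = "m-a-p/TreePO-Qwen2.5-7B"
DATASET_ID = "m-a-p/TreePO_data"


def load_checkpoint(model_id: str = MODEL_ID):
    """Load the tokenizer and model (assumes a standard causal-LM layout)."""
    # Imported lazily so the constants are usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # torch_dtype="auto" keeps the dtype stored in the checkpoint config.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    return tokenizer, model


def load_training_data(dataset_id: str = DATASET_ID):
    """Load the dataset; config and split names are not stated in the README."""
    from datasets import load_dataset

    return load_dataset(dataset_id)
```

Calling `load_checkpoint()` downloads the full 7B checkpoint, so it needs network access and enough memory. If the dataset repository defines several configurations, `load_dataset` will ask for one by name; check the dataset page for the available names.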