CodeGoat24 committed on
Commit 8ac1768 · verified · 1 Parent(s): 6d3c106

Update README.md

Files changed (1)
  1. README.md +7 -2
README.md CHANGED
@@ -14,7 +14,7 @@ This model is trained using [Pref-GRPO](https://codegoat24.github.io/UnifiedRewa


  For further details, please refer to the following resources:
- - 📰 Paper:
+ - 📰 Paper: https://arxiv.org/pdf/2508.20751
  - 🪐 Project Page: https://codegoat24.github.io/UnifiedReward/Pref-GRPO
  - 🤗 UniGenBench: https://github.com/CodeGoat24/UniGenBench
  - 🤗 Leaderboard: https://huggingface.co/spaces/CodeGoat24/UniGenBench_Leaderboard
@@ -51,5 +51,10 @@ image.save("flux-dev.png")
  ## Citation

  ```
-
+ @article{Pref-GRPO&UniGenBench,
+ title={Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning.},
+ author={Wang, Yibin and Li, Zhimin and Zang, Yuhang and Zhou, Yujie and Bu, Jiazi and Wang, Chunyu and Lu, Qinglin, and Jin, Cheng and Wang, Jiaqi},
+ journal={arXiv preprint arXiv:2508.20751},
+ year={2025}
+ }
  ```