RLHFlow/reinforce_ada_hard_prompt_1-5b
Workflow of Reinforcement Learning from Human Feedback (RLHF). Blog: https://rlhflow.github.io/