verl GRPO trained model at step 50
---
base_model: thejaminator/checkpoints_multiple_datasets_layer_1_decoder-fixed
library_name: peft
tags:
- lora
- peft
pipeline_tag: text-generation
---