MattBou00/llama-3-2-1b-detox_v1f_round4-checkpoint-epoch-20 Reinforcement Learning • 1B • Updated Aug 21 • 12
MattBou00/llama-3-2-1b-detox_v1f_round4-checkpoint-epoch-40 Reinforcement Learning • 1B • Updated Aug 21 • 10
MattBou00/llama-3-2-1b-detox_v1f_round4-checkpoint-epoch-60 Reinforcement Learning • 1B • Updated Aug 21 • 13
MattBou00/llama-3-2-1b-detox_v1f_round4-checkpoint-epoch-80 Reinforcement Learning • 1B • Updated Aug 21 • 12
MattBou00/llama-3-2-1b-detox_v1f_round4-checkpoint-epoch-100 Reinforcement Learning • 1B • Updated Aug 21 • 11
MattBou00/llama-3-2-1b-detox_retry-checkpoint-epoch-20 Reinforcement Learning • 1B • Updated Aug 25 • 6
mradermacher/VeriReason-codeLlama-7b-RTLCoder-Verilog-GRPO-reasoning-tb-GGUF Reinforcement Learning • 7B • Updated Aug 26 • 226
MattBou00/llama-3-2-1b-detox_v1f_testing_sameaseval-checkpoint-epoch-20 Reinforcement Learning • 1B • Updated 12 days ago • 14
MattBou00/llama-3-2-1b-detox_v1f_testing_sameaseval-checkpoint-epoch-40 Reinforcement Learning • 1B • Updated 12 days ago • 16
MattBou00/llama-3-2-1b-detox_v1f_testing_sameaseval-checkpoint-epoch-60 Reinforcement Learning • 1B • Updated 12 days ago • 10
MattBou00/llama-3-2-1b-detox_v1f_testing_sameaseval-checkpoint-epoch-80 Reinforcement Learning • 1B • Updated 12 days ago • 31
MattBou00/llama-3-2-1b-detox_RETRY_scale15-checkpoint-epoch-20 Reinforcement Learning • 1B • Updated 12 days ago • 8
MattBou00/llama-3-2-1b-detox_RETRY_scale15-checkpoint-epoch-40 Reinforcement Learning • 1B • Updated 12 days ago • 7
MattBou00/llama-3-2-1b-detox_RETRY_scale15-checkpoint-epoch-60 Reinforcement Learning • 1B • Updated 12 days ago • 6
MattBou00/llama-3-2-1b-detox_RETRY_scale15-checkpoint-epoch-80 Reinforcement Learning • 1B • Updated 12 days ago • 6
MattBou00/llama-3-2-1b-detox_RETRY_scale15-checkpoint-epoch-100 Reinforcement Learning • 1B • Updated 12 days ago • 6
MattBou00/llama-3-2-1b-detox_RETRY_scale10-checkpoint-epoch-20 Reinforcement Learning • 1B • Updated 12 days ago • 10
MattBou00/llama-3-2-1b-detox_RETRY_scale10-checkpoint-epoch-40 Reinforcement Learning • 1B • Updated 12 days ago • 8
MattBou00/llama-3-2-1b-detox_RETRY_scale10-checkpoint-epoch-60 Reinforcement Learning • 1B • Updated 12 days ago • 6
MattBou00/llama-3-2-1b-detox_RETRY_scale10-checkpoint-epoch-80 Reinforcement Learning • 1B • Updated 12 days ago • 8
MattBou00/llama-3-2-1b-detox_RETRY_scale10-checkpoint-epoch-100 Reinforcement Learning • 1B • Updated 12 days ago • 18