MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND5-checkpoint-epoch-40 Reinforcement Learning • 1B • Updated 7 days ago • 7
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND5-checkpoint-epoch-60 Reinforcement Learning • 1B • Updated 7 days ago • 7
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND5-checkpoint-epoch-80 Reinforcement Learning • 1B • Updated 7 days ago • 6
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND5-checkpoint-epoch-100 Reinforcement Learning • 1B • Updated 7 days ago • 6
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND5 Reinforcement Learning • 1B • Updated 7 days ago • 9
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND3-checkpoint-epoch-20 Reinforcement Learning • 1B • Updated 7 days ago • 7
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND3-checkpoint-epoch-40 Reinforcement Learning • 1B • Updated 7 days ago • 6
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND3-checkpoint-epoch-60 Reinforcement Learning • 1B • Updated 7 days ago • 10
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND3-checkpoint-epoch-80 Reinforcement Learning • 1B • Updated 7 days ago • 9
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND3-checkpoint-epoch-100 Reinforcement Learning • 1B • Updated 7 days ago • 7
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND3 Reinforcement Learning • 1B • Updated 7 days ago • 6
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND1-checkpoint-epoch-20 Reinforcement Learning • 1B • Updated 7 days ago • 4
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND1-checkpoint-epoch-40 Reinforcement Learning • 1B • Updated 7 days ago • 4
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND1-checkpoint-epoch-60 Reinforcement Learning • 1B • Updated 7 days ago • 8
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND1-checkpoint-epoch-80 Reinforcement Learning • 1B • Updated 7 days ago • 4
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND1-checkpoint-epoch-100 Reinforcement Learning • 1B • Updated 7 days ago • 8
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND1 Reinforcement Learning • 1B • Updated 7 days ago • 4
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND2-checkpoint-epoch-20 Reinforcement Learning • 1B • Updated 7 days ago • 7
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND2-checkpoint-epoch-40 Reinforcement Learning • 1B • Updated 7 days ago • 5
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND2-checkpoint-epoch-60 Reinforcement Learning • 1B • Updated 7 days ago • 6
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND2-checkpoint-epoch-80 Reinforcement Learning • 1B • Updated 7 days ago • 4
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND2-checkpoint-epoch-100 Reinforcement Learning • 1B • Updated 7 days ago • 5
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_ROUND2 Reinforcement Learning • 1B • Updated 7 days ago • 3
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_AGAIN_ROUND3-checkpoint-epoch-20 Reinforcement Learning • 1B • Updated 6 days ago • 6
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_AGAIN_ROUND3-checkpoint-epoch-40 Reinforcement Learning • 1B • Updated 6 days ago • 2
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_AGAIN_ROUND3-checkpoint-epoch-60 Reinforcement Learning • 1B • Updated 6 days ago • 7
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_AGAIN_ROUND3-checkpoint-epoch-80 Reinforcement Learning • 1B • Updated 6 days ago • 6
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_AGAIN_ROUND3-checkpoint-epoch-100 Reinforcement Learning • 1B • Updated 6 days ago • 4
MattBou00/llama-3-2-1b-detox_v1f_RRETRT_Again_AGAIN_ROUND3 Reinforcement Learning • 1B • Updated 6 days ago • 7
MattBou00/llama-3-2-1b-detox_v1f_SCALE9_round5-checkpoint-epoch-20 Reinforcement Learning • 1B • Updated 6 days ago • 6