ValueFX9507/Tifa-DeepsexV2-7b-MGRPO-GGUF-Q8 • Reinforcement Learning • 8B • Updated Mar 28 • 2.37k • 188
mradermacher/VeriReason-Qwen2.5-7b-RTLCoder-Verilog-GRPO-reasoning-tb-i1-GGUF • Reinforcement Learning • 8B • Updated Jul 11 • 737 • 4
JonusNattapong/Reinforcement-Learning-for-Gold-Trading-Model • Reinforcement Learning • Updated about 2 hours ago • 16 • 1