QED-Nano: Teaching a Tiny Model to Prove Hard Theorems • Who needs 1T parameters? Olympiad proofs with a 4B model • 55
Article: Compute and Competition in AI: Different FLOPs for Different Folks • 16 days ago • 12
Collection: Smol-Data • Tried and tested mixes for strong pretraining • 14 items • Updated 13 days ago • 2