QED-Nano: Teaching a Tiny Model to Prove Hard Theorems. Who needs 1T parameters? Olympiad proofs with a 4B model.
Bringing paper to life: A modern template for scientific writing. Download a ready-to-use scientific paper template.
The Ultra-Scale Playbook. The ultimate guide to training LLMs on large GPU clusters.
Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks. Evaluate multilingual models using FineTasks.
TxT360: Trillion Extracted Text. Explore and download the TxT360 LLM pre-training dataset.