NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 12 items • Updated about 15 hours ago • 176
MixtureVitae study models and datasets Collection Collection of models and datasets related to MixtureVitae, an open and fully reproducible pretraining dataset built from permissive sources • 16 items • Updated 27 days ago • 1
Article Scaling Pedagogical Pre-training: From Optimal Mixing to 10 Billion Tokens 6 days ago • 3
🤏 Smol-Data Collection Tried and tested mixes for strong pretraining. Inspired by https://huggingface.co/blog/codelion/optimal-dataset-mixing • 14 items • Updated 10 days ago • 12
Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets Paper • 2602.22207 • Published 15 days ago • 41
Article Do Bubbles Form When Tens of Thousands of AIs Simulate Capitalism? 16 days ago • 17
The Million-Label NER: Breaking Scale Barriers with GLiNER bi-encoder Paper • 2602.18487 • Published 29 days ago • 5
Avey B1 experimental Collection Experimental pre-trained checkpoints for Avey-B1 • 3 items • Updated 17 days ago • 3
jina-embeddings-v5-text: Task-Targeted Embedding Distillation Paper • 2602.15547 • Published 23 days ago • 26
Aya Datasets Collection The Aya Collection is a massive multilingual collection for over 100 languages consisting of 513 million instances of prompts and completions. • 5 items • Updated Jul 31, 2025 • 27
LoRA-Squeeze: Simple and Effective Post-Tuning and In-Tuning Compression of LoRA Modules Paper • 2602.10993 • Published 29 days ago • 1
Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning Paper • 2602.11149 • Published 29 days ago • 15
SteuerLLM: Local specialized large language model for German tax law analysis Paper • 2602.11081 • Published 29 days ago • 1
Optimal Turkish Subword Strategies at Scale: Systematic Evaluation of Data, Vocabulary, Morphology Interplay Paper • 2602.06942 • Published Feb 6 • 3
GLiNER-Linker Collection GLiNER-bi-Encoder models for entity linking with the GLiNKER framework • 3 items • Updated Feb 3 • 6
FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale Paper • 2601.22146 • Published Jan 29 • 9