bartowski/allura-forge_Llama-3.3-8B-Instruct-GGUF • Text Generation • 8B • Updated 9 days ago • 7.97k • 22
Cerebras REAP • Collection • Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 19 items • Updated 19 days ago • 79
VTP • Collection • Towards Scalable Pre-training of Visual Tokenizers for Generation • 4 items • Updated 22 days ago • 39
Teacher Logits • Collection • Logits captured from large models to act as the teacher for distillation • 3 items • Updated 23 days ago • 7