Set Block Decoding is a Language Model Inference Accelerator Paper • 2509.04185 • Published 5 days ago • 32
Beyond Transcription: Mechanistic Interpretability in ASR Paper • 2508.15882 • Published 19 days ago • 84
MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds Paper • 2508.14879 • Published 19 days ago • 64
Persona Vectors: Monitoring and Controlling Character Traits in Language Models Paper • 2507.21509 • Published Jul 29 • 29
Red Hat AI validated models - v1.0 Collection v1.0 Collection of third-party generative AI models validated by Red Hat AI for use across the Red Hat AI Product Portfolio. • 39 items • Updated Jul 29 • 17
Quartet: Native FP4 Training Can Be Optimal for Large Language Models Paper • 2505.14669 • Published May 20 • 78
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs Paper • 2504.11536 • Published Apr 15 • 62
Model Optimizer Collection A collection of generative models quantized and optimized with TensorRT Model Optimizer. • 37 items • Updated 4 days ago • 29
AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation Paper • 2503.19693 • Published Mar 25 • 77
Multilingual LLM Evaluation Collection Multilingual Evaluation Benchmarks • 8 items • Updated Jul 31 • 27
Article Open-source DeepResearch – Freeing our search agents By m-ric and 4 others • Feb 4 • 1.29k
Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam Paper • 2502.17055 • Published Feb 24 • 19
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published Feb 25 • 76
SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference Paper • 2502.18137 • Published Feb 25 • 58
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16 • 165
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models Paper • 2502.09604 • Published Feb 13 • 36
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU Paper • 2502.08910 • Published Feb 13 • 149