Models (90)

Active filters: llm-compressor

RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-block

Text Generation • 109B • Updated 6 days ago • 218 • 3

RedHatAI/Llama-4-Maverick-17B-128E-Instruct-FP8-block

Text Generation • 402B • Updated 6 days ago • 31 • 1

RedHatAI/Qwen2.5-0.5B-quantized.w8a16

Text Generation • 0.4B • Updated Nov 26, 2024

RedHatAI/Qwen2.5-1.5B-quantized.w8a16

Text Generation • 0.8B • Updated Nov 26, 2024 • 4

RedHatAI/Qwen2.5-3B-quantized.w8a16

Text Generation • 1B • Updated Nov 26, 2024

RedHatAI/Qwen2.5-7B-quantized.w8a16

Text Generation • 3B • Updated Nov 26, 2024 • 2 • 1

RedHatAI/Qwen2.5-32B-quantized.w8a16

Text Generation • 9B • Updated Nov 26, 2024 • 3

RedHatAI/Qwen2.5-72B-quantized.w8a16

Text Generation • 20B • Updated Nov 26, 2024 • 2

RedHatAI/Qwen2.5-Coder-14B-Instruct-FP8-dynamic

Text Generation • 15B • Updated Sep 23 • 440 • 1

brokenlander/AlphaBuffett-FP8-Dynamic

Text Generation • 24B • Updated Feb 16

divish/M-Prometheus-3B-FP8-Dynamic

Text Generation • 3B • Updated Apr 16 • 5

textgeflecht/Devstral-Small-2505-FP8-llmcompressor

Text Generation • 24B • Updated May 25 • 2

ConfidentialMind/InternVL3-38B-FP8-Dynamic

Image-Text-to-Text • 38B • Updated Jul 7 • 3 • 2

textgeflecht/Qwen2.5-Coder-32B-Instruct-FP8-dynamic

Text Generation • 33B • Updated Jun 12

brandonbeiler/InternVL3-38B-FP8-Dynamic

Image-Text-to-Text • 38B • Updated Jun 23 • 185

brandonbeiler/InternVL3-78B-FP8-Dynamic

Image-Text-to-Text • 78B • Updated Jun 23 • 12

brandonbeiler/InternVL3-8B-FP8-Dynamic

Image-Text-to-Text • 8B • Updated Jun 23 • 6 • 2

RedHatAI/Qwen3-30B-A3B-FP8-block

Text Generation • 31B • Updated 6 days ago • 35

JustJaro/SmolLM-135M-FP8-Static

Image-Text-to-Text • 0.2B • Updated Jul 7 • 3

JustJaro/SmolLM-135M-FP8-Dynamic-Test

Image-Text-to-Text • 0.1B • Updated Jul 7 • 3

JustJaro/GOT-OCR-2.0-hf-FP8-Static

Image-Text-to-Text • 0.6B • Updated Jul 7 • 3

brandonbeiler/Skywork-R1V3-38B-FP8-Dynamic

Image-Text-to-Text • 38B • Updated Jul 18 • 116 • 1

ramblingpolymath/Qwen3-32B-W8A8

Text Generation • 33B • Updated Aug 2 • 8

ramblingpolymath/Qwen3-14B-W8A8

Text Generation • 15B • Updated Aug 3 • 11

ramblingpolymath/Qwen3-8B-W8A8

Text Generation • 8B • Updated Aug 3 • 6

ramblingpolymath/Qwen3-4B-W8A8

Text Generation • 4B • Updated Aug 2 • 7

ramblingpolymath/qwen3-30B-A3B-w8a8

Text Generation • 31B • Updated Aug 2 • 41

ramblingpolymath/Qwen3-0.6B-W8A8

Text Generation • 0.8B • Updated Aug 3 • 37

ramblingpolymath/Qwen3-30B-A3B-Instruct-2507-W8A8

Text Generation • 31B • Updated Aug 2 • 448 • 1

ramblingpolymath/Qwen3-30B-A3B-thinking-2507-W8A8

Text Generation • 31B • Updated Aug 2 • 58 • 3