Models: 83

Active filters: quark

aigdat/Llama-3.2-1B-Instruct-awq-uint4-float16

0.4B • Updated Apr 29 • 6

aigdat/Llama-3.2-3B-Instruct-awq-uint4-float16

0.8B • Updated Apr 24 • 7

aigdat/Phi-3.5-mini-instruct-awq-uint4-float16

0.6B • Updated Apr 29 • 6

aigdat/DeepSeek-R1-Distill-Qwen-1.5B_quantized_int4_bfloat16

0.4B • Updated Apr 29 • 6

aigdat/Qwen3-0.6B_quantized_int4_float16

0.2B • Updated Apr 30 • 10

aigdat/Arch-Function-Chat-3B_quantized_int4_float16

0.7B • Updated May 5 • 6

aigdat/DeepCoder-14B-Preview_quantized_int4_float16

3B • Updated May 5 • 5

aigdat/Qwen2.5-Coder-1.5B-Instruct_quantized_int4_bfloat16

0.4B • Updated May 5 • 11

aigdat/Qwen2.5-Coder-7B-Instruct_quantized_int4_bfloat16

1B • Updated May 6 • 8

aigdat/Qwen2.5-3B-Instruct_quantized_int4_bfloat16

0.7B • Updated May 8 • 6

aigdat/Qwen2.5-Coder-32B-Instruct_quantized_int4_bfloat16

5B • Updated May 9 • 6

aigdat/Llama-xLAM-2-8b-fc-r_quantized_int4_bfloat16

2B • Updated May 9 • 6

fxmarty/qwen_1.5-moe-a2.7b-mxfp4

8B • Updated May 13 • 194

amd/Llama-3.3-70B-Instruct-MXFP4-Preview

38B • Updated Aug 5 • 1.83k

fxmarty/deepseek_r1_3_layers_mxfp4

8B • Updated May 15 • 1.09k • 1

fxmarty/Llama-4-Scout-17B-16E-Instruct-2-layers-mxfp4

5B • Updated May 19 • 39

amd/DeepSeek-R1-MXFP4-Preview

357B • Updated Aug 6 • 1.69k • 2

mohitsha/Llama-2-7b-hf-w_mx_fp4_per_group_sym

4B • Updated May 23 • 7

amd/Llama-3.1-405B-Instruct-MXFP4-Preview

218B • Updated Aug 5 • 16.6k • 1

amd/DeepSeek-R1-MXFP4-ASQ

363B • Updated 15 days ago • 237

haoyang-amd/qwen1.5-0.5B-ptpc

0.5B • Updated Jul 1 • 11

amd/DeepSeek-R1-0528-MXFP4-Preview

363B • Updated 15 days ago • 537

fxmarty/Llama-3.1-70B-Instruct-2-layers-mxfp6

3B • Updated Jul 9 • 21

fxmarty/qwen1.5_moe_a2.7b_chat_w_fp4_a_fp6_e2m3

8B • Updated Jul 11 • 23

fxmarty/qwen1.5_moe_a2.7b_chat_w_fp6_e2m3_a_fp6_e2m3

11B • Updated Jul 11 • 38

fxmarty/qwen1.5_moe_a2.7b_chat_w_fp6_e3m2_a_fp6_e3m2

11B • Updated Jul 11 • 24

amd/Llama-2-70b-chat-hf-WMXFP4-AMXFP4-KVFP8-Scale-UINT8-MLPerf-GPTQ

37B • Updated Aug 5 • 21

sudhab1988/rakuten-7b-awq-g128-int4-asym-fp16-hf

1B • Updated Jul 15 • 9

matmelis/Llama_3.2_1B_w_uint4_gptq

0.4B • Updated Jul 16 • 11

EliovpAI/Qwen3-14B-FP8-KV

Text Generation • 15B • Updated Aug 1 • 20 • 1