AmpereComputing/granite-4.0-tiny-preview-gguf
7B parameters
Ampere's quantization formats (Q4_K_4 / Q8R16) require the Ampere-optimized build of llama.cpp, available here: https://hub.docker.com/r/amperecomputingai/llama.cpp