[ExLlamaV2 is an inference library for running local LLMs on modern consumer GPUs.](https://github.com/turboderp-org/exllamav2)

| Filename | Quant type | File Size | VRAM* |
| -------- | ---------- | --------- | ----- |
| [phi-4_hb8_3bpw](https://huggingface.co/cmh/phi-4_exl2/tree/hb8_3bpw) | 3.00 bits per weight | 6.66 GB | **10.3 GB** |
| [phi-4_hb8_6bpw](https://huggingface.co/cmh/phi-4_exl2/tree/hb8_6bpw) | 6.00 bits per weight | 11.8 GB | **15.1 GB** |
| [phi-4_hb8_7bpw](https://huggingface.co/cmh/phi-4_exl2/tree/hb8_7bpw) | 7.00 bits per weight | 13.5 GB | **16.7 GB** |
| [phi-4_hb8_8bpw](https://huggingface.co/cmh/phi-4_exl2/tree/hb8_8bpw) | 8.00 bits per weight | 15.2 GB | **18.2 GB** |

\*at 16k context
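Note that the VRAM column is larger than the file size: the 16k-token context from the footnote adds a roughly constant overhead on top of the weights (presumably KV cache plus runtime allocations). A quick sketch of that gap, using only the figures from the table above:

```python
# Sanity check on the table: the gap between measured VRAM (at 16k context)
# and quant file size is roughly constant across quants, since the
# context-dependent overhead does not depend on weight precision.
rows = {
    # branch: (file size GB, VRAM GB at 16k context)
    "hb8_3bpw": (6.66, 10.3),
    "hb8_6bpw": (11.8, 15.1),
    "hb8_7bpw": (13.5, 16.7),
    "hb8_8bpw": (15.2, 18.2),
}

overhead = {name: round(vram - size, 2) for name, (size, vram) in rows.items()}
for name, gb in overhead.items():
    print(f"{name}: ~{gb} GB overhead")
```

So budgeting file size plus roughly 3–4 GB is a reasonable first estimate when picking a quant for your GPU.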
# Phi-4 Model Card