cmh committed on
Commit 8dc83c8 · verified · 1 parent: 89f0d99

Update README.md

Files changed (1):
  1. README.md +7 -7
README.md CHANGED
@@ -13,14 +13,14 @@ pipeline_tag: text-generation
 [ExLlamaV2 is an inference library for running local LLMs on modern consumer GPUs.](https://github.com/turboderp-org/exllamav2)
 
 
-| Filename | Quant type | File Size | ~Vram*|
+| | Quant type | File Size | ~Vram*|
 | -------- | ---------- | --------- | -------- |
-| [phi-4_hb8_3bpw](https://huggingface.co/cmh/phi-4_exl2/tree/hb8_3bpw) | 3.00 bits per weight | 6.66 GB | **10,3 GB** |
-| [phi-4_hb8_4bpw](https://huggingface.co/cmh/phi-4_exl2/tree/hb8_4bpw) | 4.00 bits per weight | 8.36 GB | **11,9 GB** |
-| [phi-4_hb8_5bpw](https://huggingface.co/cmh/phi-4_exl2/tree/hb8_5bpw) | 5.00 bits per weight | 10.1 GB | **13,5 GB** |
-| [phi-4_hb8_6bpw](https://huggingface.co/cmh/phi-4_exl2/tree/hb8_6bpw) | 6.00 bits per weight | 11.8 GB | **15,1 GB** |
-| [phi-4_hb8_7bpw](https://huggingface.co/cmh/phi-4_exl2/tree/hb8_7bpw) | 7.00 bits per weight | 13.5 GB | **16,7 GB** |
-| [phi-4_hb8_8bpw](https://huggingface.co/cmh/phi-4_exl2/tree/hb8_8bpw) | 8.00 bits per weight | 15.2 GB | **18,2 GB** |
+| [phi-4 hb8_3bpw](https://huggingface.co/cmh/phi-4_exl2/tree/hb8_3bpw) | 3.00 bits per weight | 6.66 GB | **10,3 GB** |
+| [phi-4 hb8_4bpw](https://huggingface.co/cmh/phi-4_exl2/tree/hb8_4bpw) | 4.00 bits per weight | 8.36 GB | **11,9 GB** |
+| [phi-4 hb8_5bpw](https://huggingface.co/cmh/phi-4_exl2/tree/hb8_5bpw) | 5.00 bits per weight | 10.1 GB | **13,5 GB** |
+| [phi-4 hb8_6bpw](https://huggingface.co/cmh/phi-4_exl2/tree/hb8_6bpw) | 6.00 bits per weight | 11.8 GB | **15,1 GB** |
+| [phi-4 hb8_7bpw](https://huggingface.co/cmh/phi-4_exl2/tree/hb8_7bpw) | 7.00 bits per weight | 13.5 GB | **16,7 GB** |
+| [phi-4 hb8_8bpw](https://huggingface.co/cmh/phi-4_exl2/tree/hb8_8bpw) | 8.00 bits per weight | 15.2 GB | **18,2 GB** |
 
 <sub>*approximate value at **16k context, FP16 cache**.<sup>
26