Update README.md
README.md CHANGED

@@ -125,7 +125,7 @@ Coming soon!
 ## Quantization Reproduction
 
 > [!NOTE]
-> In order to quantize Llama 3.1 8B Instruct using AutoAWQ, you will need to use an instance with at least enough CPU RAM to fit the whole model i.e. ~8GiB, and an NVIDIA GPU with
+> In order to quantize Llama 3.1 8B Instruct using AutoAWQ, you will need to use an instance with at least enough CPU RAM to fit the whole model i.e. ~8GiB, and an NVIDIA GPU with 16GiB of VRAM to quantize it.
 
 In order to quantize Llama 3.1 8B Instruct, first install `torch` and `autoawq` as follows:
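The hunk ends right before the install command the README refers to. A minimal sketch of that step, assuming a standard pip setup (the README's actual command, including any pinned versions or CUDA-specific index URL, is not shown in this diff):

```shell
# Install PyTorch and AutoAWQ into the current environment.
# Versions are unpinned here; the README's own command may differ.
pip install torch autoawq
```

On a machine matching the note above (an NVIDIA GPU with 16GiB of VRAM), pip's default `torch` wheel on Linux already bundles CUDA support, so no extra index flag is usually needed.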