Update README.md
README.md
@@ -282,6 +282,7 @@ Results are shown for this model, as well as two variants:
 - **Dense:** [Llama-3.1-8B-tldr](https://huggingface.co/RedHatAI/Sparse-Llama-3.1-8B-tldr-2of4)
 - **Dense-quantized:** [Llama-3.1-8B-tldr-FP8-dynamic](https://huggingface.co/RedHatAI/Llama-3.1-8B-tldr-FP8-dynamic)
 - **Sparse-quantized:** [Sparse-Llama-3.1-8B-tldr-2of4-FP8-dynamic](https://huggingface.co/RedHatAI/Sparse-Llama-3.1-8B-tldr-2of4-FP8-dynamic)
+
 Although sparsity by itself does not significantly improve performance, when combined with quantization it results in up to 1.6x speedup.
 
 