clowman/Llama-3.2-3B-Instruct-Dynamic-F8

Text Generation

text-generation-inference

compressed-tensors

clowman committed on Apr 2

Commit 3f16a1e · verified · 1 Parent(s): fa30a27

Update README.md

Files changed (1)

README.md +20 -16

@@ -1,19 +1,3 @@
-# Quantization
-Created with [lambda-quant](https://github.com/LambdaLabsML/lambda-quant/tree/f97108fe4a9ee061a7b969b23a9605a6d561863d) on `Python 3.10.12 (main, Nov  6 2024, 20:22:13) [GCC 11.4.0]`
-Base Model: [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)
-Quantized using [llmcompressor==0.4.1](https://github.com/vllm-project/llm-compressor)
-Steps to create:
-1. `git clone https://github.com/LambdaLabsML/lambda-quant`
-2. `git checkout f97108fe4a9ee061a7b969b23a9605a6d561863d`
-3. `python quantize.py -m meta-llama/Llama-3.2-3B-Instruct -q Dynamic-F8`
-## Evaluation
-TODO
-## Benchmarks
-TODO
-# Base Model README.md
 ---
 language:
 - en
@@ -234,6 +218,26 @@ extra_gated_description: >-
   Policy](https://www.facebook.com/privacy/policy/).
 extra_gated_button_content: Submit
 ---
+# Quantization
+Created with [lambda-quant](https://github.com/LambdaLabsML/lambda-quant/tree/f97108fe4a9ee061a7b969b23a9605a6d561863d) on `Python 3.10.12 (main, Nov  6 2024, 20:22:13) [GCC 11.4.0]`
+Base Model: [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)
+Quantized using [llmcompressor==0.4.1](https://github.com/vllm-project/llm-compressor)
+Steps to create:
+1. `git clone https://github.com/LambdaLabsML/lambda-quant`
+2. `git checkout f97108fe4a9ee061a7b969b23a9605a6d561863d`
+3. `python quantize.py -m meta-llama/Llama-3.2-3B-Instruct -q Dynamic-F8`
+## Evaluation
+TODO
+## Benchmarks
+TODO
+# Base Model README.md
 ## Model Information
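
The three "Steps to create" in the README can be sketched as a small script. This is a minimal sketch, not the author's tooling: it assumes steps 2 and 3 are run from inside the cloned `lambda-quant` directory (the README does not say so explicitly) and that `quantize.py` sits at the repository root. The helper names `build_steps` and `reproduce` are hypothetical; the URL, commit hash, model ID, and flags are copied from the README. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Values copied verbatim from the README's "Steps to create".
REPO_URL = "https://github.com/LambdaLabsML/lambda-quant"
COMMIT = "f97108fe4a9ee061a7b969b23a9605a6d561863d"
BASE_MODEL = "meta-llama/Llama-3.2-3B-Instruct"
SCHEME = "Dynamic-F8"


def build_steps(workdir="lambda-quant"):
    """Return (command, cwd) pairs mirroring the README's three steps.

    Assumption: the checkout and quantize steps run inside the clone.
    """
    return [
        (["git", "clone", REPO_URL, workdir], None),
        (["git", "checkout", COMMIT], workdir),
        (["python", "quantize.py", "-m", BASE_MODEL, "-q", SCHEME], workdir),
    ]


def reproduce(dry_run=True):
    """Print the steps (dry run) or execute them, stopping on the first failure."""
    steps = build_steps()
    for cmd, cwd in steps:
        if dry_run:
            print(f"[{cwd or '.'}] {' '.join(cmd)}")
        else:
            subprocess.run(cmd, cwd=cwd, check=True)
    return steps


if __name__ == "__main__":
    reproduce(dry_run=True)
```

Calling `reproduce(dry_run=False)` would actually clone the repository and start quantization, which needs network access, access to the gated base model, and whatever dependencies `lambda-quant` requires.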