Update README.md
README.md CHANGED

@@ -42,7 +42,6 @@ Once your server is started, you can query the model using the OpenAI API:
 ```python
 from openai import OpenAI
 
-# Modify OpenAI's API key and API base to use vLLM's API server.
 openai_api_key = "EMPTY"
 openai_api_base = "http://localhost:8000/v1"
 client = OpenAI(
@@ -215,7 +214,7 @@ The model was evaluated on the test split of trl-lib/tldr using the Neural Magic
 One can reproduce these results by using the following command:
 
 ```bash
-lm_eval --model vllm --model_args "pretrained=RedHatAI/Sparse-Llama-3.1-8B-tldr-2of4,dtype=auto,add_bos_token" --batch-size auto --tasks tldr
+lm_eval --model vllm --model_args "pretrained=RedHatAI/Sparse-Llama-3.1-8B-tldr-2of4,dtype=auto,add_bos_token=True" --batch-size auto --tasks tldr
 ```
 
 <table>
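For context on the snippet being edited above: vLLM exposes an OpenAI-compatible REST API, so the truncated `client = OpenAI(` setup in the diff is completed with the same key and base URL and then used to send a completion request. The sketch below is an illustrative assumption, not part of the README — it uses only the standard library, and the `TL;DR:` prompt suffix and `max_tokens` value are hypothetical choices; only the base URL, the `EMPTY` key, and the model name come from the diff.

```python
# Minimal sketch of querying a running vLLM OpenAI-compatible server.
# Assumptions (not from the README): the /completions endpoint shape,
# the "TL;DR:" prompt suffix, and max_tokens=64.
import json
from urllib import request

OPENAI_API_BASE = "http://localhost:8000/v1"  # matches openai_api_base in the diff


def build_completion_request(prompt: str,
                             model: str = "RedHatAI/Sparse-Llama-3.1-8B-tldr-2of4",
                             max_tokens: int = 64) -> dict:
    """Build the JSON body for an OpenAI-style /completions call."""
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}


def query_server(prompt: str) -> str:
    """POST the request to a live vLLM server and return the generated text."""
    body = json.dumps(build_completion_request(prompt)).encode()
    req = request.Request(
        f"{OPENAI_API_BASE}/completions",
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer EMPTY"},  # vLLM ignores the key value
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]
```

With a server started via `vllm serve RedHatAI/Sparse-Llama-3.1-8B-tldr-2of4`, calling `query_server(post + "\nTL;DR:")` would return the model's summary; the official `openai` client shown in the diff works the same way against the same base URL.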