---
language:
- en
tags:
- text-to-sql
- mistral
- gguf
- sql-generation
- cpu-inference
pipeline_tag: text-generation
license: apache-2.0
---

# Mistral-7B SQL GGUF

A GGUF-quantized version of Mistral-7B fine-tuned for SQL query generation. Optimized for CPU inference, it is tuned to return clean SQL without surrounding chat markup.

## Model Details

- Base Model: Mistral-7B-Instruct-v0.3
- Quantization: Q8_0
- Context Length: 32768 tokens (inherited from the base model)
- Format: GGUF v3
- Size: 7.17 GB
- Parameters: 7.25B
- Architecture: Llama (the architecture name Mistral models report in GGUF metadata)
- Use Case: Text-to-SQL conversion

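The header values above can be checked locally without running any inference. A minimal sketch, assuming a recent llama-cpp-python release in which `Llama` accepts `vocab_only=True` and exposes the GGUF key/value header as a `metadata` dict:

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="tharun66/mistral-sql-gguf",
    filename="mistral_sql_q4.gguf"
)

# vocab_only=True reads the GGUF header and tokenizer but skips the
# ~7 GB of tensor data, so this is quick on any machine
llm = Llama(model_path=model_path, vocab_only=True, verbose=False)

# Keys such as "general.architecture" and "llama.context_length"
# should line up with the Model Details list above
for key, value in llm.metadata.items():
    print(f"{key} = {value}")
```
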
## Usage

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download the quantized weights from the Hub
model_path = hf_hub_download(
    repo_id="tharun66/mistral-sql-gguf",
    filename="mistral_sql_q4.gguf"
)

# Initialize the model; a 512-token context is enough for short
# question-to-query prompts
llm = Llama(
    model_path=model_path,
    n_ctx=512,
    n_threads=4,
    verbose=False
)

def generate_sql(question):
    prompt = f"""### Task: Convert to SQL
### Question: {question}
### SQL:"""

    response = llm(
        prompt,
        max_tokens=128,
        temperature=0.7,
        # Stop sequences keep chat-role tokens and the next "###"
        # header out of the generated SQL
        stop=["system", "user", "assistant", "###"],
        echo=False
    )

    return response['choices'][0]['text'].strip()

# Example
question = "Show all active users"
sql = generate_sql(question)
print(sql)
# Output: SELECT * FROM users WHERE status = 'active'
```
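
The template above sends only the question, so the model has to guess table and column names. Real databases usually get better queries when the schema is part of the prompt. Below is a minimal sketch that reuses the `llm` instance from the snippet above; the `### Schema:` header and the sample DDL are illustrative assumptions, not a prompt format this fine-tune is documented to expect:

```python
# Hypothetical schema; replace with the DDL of your own database
SCHEMA = """CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    name TEXT,
    status TEXT
);"""

def generate_sql_with_schema(question, schema=SCHEMA):
    prompt = f"""### Task: Convert to SQL
### Schema:
{schema}
### Question: {question}
### SQL:"""

    response = llm(
        prompt,
        max_tokens=128,
        temperature=0.2,  # lower than above; SQL benefits from near-greedy sampling
        stop=["###"],
        echo=False
    )
    return response['choices'][0]['text'].strip()

print(generate_sql_with_schema("Show all active users"))
# e.g. SELECT * FROM users WHERE status = 'active'
```

Note that the model above was loaded with `n_ctx=512`, which leaves little room for a large schema; raise `n_ctx` when constructing `Llama` if you pass more than a table or two.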