---
license: apache-2.0
language:
- en
base_model:
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
tags:
- lora
- fused
- text-to-sql
- natural-language-to-sql
- mlx
- apple-silicon
- fine-tuning
- instruction-following
model_creator: Jerome Mohanan
datasets:
- spider
---

# TinyLlama-1.1B-Chat-LoRA-Fused-v1.0 — Natural-Language-to-SQL

**TinyLlama-1.1B-Chat-LoRA-Fused-v1.0** is a 1.1-billion-parameter model derived from *TinyLlama/TinyLlama-1.1B-Chat-v1.0*.
Using parameter-efficient **LoRA** fine-tuning and the Apple-Silicon-native **MLX** framework, the model has been specialised to convert plain-English questions into syntactically correct SQL queries for relational databases.
After training, the LoRA adapters were **merged (“fused”)** into the base weights, so you only need this single checkpoint for inference.
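
In standard LoRA terms, fusing folds each low-rank update back into its base weight matrix: with the rank and scaling listed below (r = 16, α = 32), each adapted projection becomes `W_fused = W_base + (α / r) · B · A`, so no separate adapter files or PEFT wrappers are needed at inference time.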

---

## 🗝️ Key Facts

| Property | Value |
|---|---|
| Base model | TinyLlama 1.1B Chat v1.0 |
| Task | Natural-Language → SQL generation |
| Fine-tuning method | Low-Rank Adaptation (LoRA) @ rank = 16 |
| Training framework | MLX 0.8 + PEFT |
| Hardware | MacBook Pro M4 Pro (20-core GPU) |
| Checkpoint size | 2.1 GB (fp16, fused) |
| License | Apache 2.0 |

---

## ✨ Intended Use

* **Interactive data exploration** inside BI notebooks or chatbots.
* **Customer-support analytics** — empower non-SQL users to ask free-form questions.
* **Education & demos** showing how LoRA + MLX enables rapid on-device fine-tuning.

The model was trained on synthetic NL-SQL pairs for demo purposes. **Do not** deploy it in production for mission-critical SQL generation without additional evaluation on your own schema and a security review.

---

## 💻 Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Include the relevant schema in the prompt; the model does not do schema-linking itself.
prompt = """\
### Database schema
table orders(id, customer_id, total, created_at)
table customers(id, name, country)

### Question
List total sales per country ordered by total descending."""

inputs = tok(prompt, return_tensors="pt")
sql_out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(sql_out[0], skip_special_tokens=True))
```
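
Because the checkpoint is fused and the repo is tagged for MLX, it can also be run natively on Apple Silicon. The snippet below is a minimal sketch using the `mlx-lm` package; it assumes the standard `load`/`generate` helpers and that the published weights are in a format `mlx-lm` can consume, so adjust for your installed version.

```python
# Hypothetical MLX-native inference sketch (requires `pip install mlx-lm` on Apple Silicon).
from mlx_lm import load, generate

# load() fetches the Hugging Face repo and prepares the weights for MLX.
model, tokenizer = load("jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0")

prompt = """\
### Database schema
table orders(id, customer_id, total, created_at)
table customers(id, name, country)

### Question
List total sales per country ordered by total descending."""

# generate() returns the decoded completion as a string.
sql = generate(model, tokenizer, prompt=prompt, max_tokens=128)
print(sql)
```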

---

## 🏋️‍♂️ Training Details

* **Data** – 10 K synthetic NL/SQL pairs auto-generated from the open-domain schema list, then manually spot-checked for correctness.
* **Pre-processing** – schema + question paired using the *Text-to-SQL prompt* pattern (one plausible layout is sketched below); SQL statements lower-cased; no anonymisation.
* **Hyper-parameters**
  * batch size = 32 (gradient accumulation = 4)
  * learning rate = 2e-4 (cosine schedule)
  * epochs = 3
  * LoRA rank = 16, α = 32
  * fp16 mixed precision

Total training time ≈ 5 minutes on the MacBook Pro M4 Pro listed above.
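
The exact training template is not published with this card; the sketch below shows one plausible way to assemble a training example in the *Text-to-SQL prompt* pattern described above, matching the schema/question layout used in the Quick Start. The `build_example` helper and section names are illustrative, not the actual pre-processing script.

```python
# Illustrative sketch only: the real pre-processing script is not included in this repo.
def build_example(schema: str, question: str, sql: str) -> str:
    """Pair a schema and question with the target SQL, lower-casing the SQL
    as described in the pre-processing notes above."""
    return (
        "### Database schema\n"
        f"{schema}\n\n"
        "### Question\n"
        f"{question}\n\n"
        "### SQL\n"
        f"{sql.lower()}"
    )

example = build_example(
    schema="table orders(id, customer_id, total, created_at)\n"
           "table customers(id, name, country)",
    question="List total sales per country ordered by total descending.",
    sql="SELECT c.country, SUM(o.total) AS total_sales "
        "FROM orders o JOIN customers c ON o.customer_id = c.id "
        "GROUP BY c.country ORDER BY total_sales DESC;",
)
print(example)
```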

---

## 🌱 Environmental Impact

LoRA fine-tuning on consumer Apple-Silicon hardware is energy-efficient: only the low-rank adapter weights were updated, and the run described above finished in roughly five minutes on a single laptop.

---

## 🛠️ Limitations & Biases

* Trained on a synthetic, limited dataset → may under-perform on real production schemas.
* Does **not** perform schema-linking; you must include the relevant schema in the prompt.
* Generated SQL is not guaranteed to be safe or correct; always validate queries before execution (one possible check is sketched below).
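
As one example of such validation, the sketch below compiles a generated statement with SQLite's `EXPLAIN` against an empty in-memory copy of the schema, so syntax and column references are checked without touching real data. The table definitions mirror the Quick Start example and the `sql_compiles` helper is illustrative only; this does not replace a proper security review.

```python
# Illustrative pre-flight check: compile the generated SQL against a throwaway,
# in-memory copy of the schema before ever running it on real data.
import sqlite3

SCHEMA = """
CREATE TABLE customers (id INTEGER, name TEXT, country TEXT);
CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL, created_at TEXT);
"""

def sql_compiles(sql: str) -> bool:
    """Return True if SQLite can compile `sql` against the demo schema."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        conn.execute(f"EXPLAIN {sql}")  # EXPLAIN compiles the statement without executing it
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

generated = (
    "SELECT c.country, SUM(o.total) AS total_sales "
    "FROM orders o JOIN customers c ON o.customer_id = c.id "
    "GROUP BY c.country ORDER BY total_sales DESC"
)
print(sql_compiles(generated))  # True if the query parses and references real columns
```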

---

## ✍️ Citation

```bibtex
@misc{mohanan2024tinyllama_sql_lora,
  title  = {TinyLlama-1.1B-Chat-LoRA-Fused-v1.0},
  author = {Jerome Mohanan},
  note   = {Hugging Face repository: https://huggingface.co/jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0},
  year   = {2024}
}
```

---

## 📫 Contact

Questions or feedback? Ping **@jero2rome** on Hugging Face or email <[email protected]>.