---
license: apache-2.0
language:
- en
base_model:
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
tags:
- lora
- fused
- text-to-sql
- natural-language-to-sql
- mlx
- apple-silicon
- fine-tuning
- instruction-following
model_creator: Jerome Mohanan
datasets:
- spider # used conceptually as inspiration; see Training Details
---
# TinyLlama-1.1B-Chat-LoRA-Fused-v1.0 — Natural-Language-to-SQL
**TinyLlama-1.1B-Chat-LoRA-Fused-v1.0** is a 1.1-billion-parameter model derived from *TinyLlama/TinyLlama-1.1B-Chat-v1.0*.
Using parameter-efficient **LoRA** fine-tuning on the Apple-Silicon-native **MLX** framework, the model has been specialised to convert plain-English questions into syntactically correct SQL queries for relational databases.
After training, the LoRA adapters were **merged ("fused")** into the base weights, so this single checkpoint is all you need for inference.
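Conceptually, fusing folds each low-rank update back into its base weight matrix, so inference needs no separate adapter files. A minimal NumPy sketch of that idea (toy dimensions; not the actual MLX fusion code):

```python
import numpy as np

# Toy shapes: one base weight plus its rank-16 LoRA factors.
d_out, d_in, rank, alpha = 64, 64, 16, 32

W = np.random.randn(d_out, d_in).astype(np.float32)  # frozen base weight
A = np.random.randn(rank, d_in).astype(np.float32)   # LoRA down-projection
B = np.random.randn(d_out, rank).astype(np.float32)  # LoRA up-projection

# Fusing merges the scaled low-rank update into the base weight once.
W_fused = W + (alpha / rank) * (B @ A)

# A forward pass then uses W_fused exactly like the original W.
x = np.random.randn(d_in).astype(np.float32)
y = W_fused @ x
```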
---
## 🗝️ Key Facts
| Property | Value |
|---|---|
| Base model | TinyLlama 1.1B Chat v1.0 |
| Task | Natural-Language → SQL generation |
| Fine-tuning method | Low-Rank Adaptation (LoRA) @ rank = 16 |
| Training framework | MLX 0.8 + PEFT |
| Hardware | MacBook Pro M4 Pro (20-core GPU) |
| Checkpoint size | 2.1 GB (fp16, fused) |
| License | Apache 2.0 |
---
## ✨ Intended Use
* **Interactive data exploration** inside BI notebooks or chatbots.
* **Customer-support analytics** — empower non-SQL users to ask free-form questions.
* **Education & demos** showing how LoRA + MLX enables rapid on-device fine-tuning.
The model was trained on synthetic NL-SQL pairs for demo purposes. **Do not** deploy it in production for mission-critical SQL generation without additional evaluation on your own schema and security review.
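One hedged starting point for such a review is a read-only execution guard. The function below is an illustrative SQLite-specific sketch (names invented here), not part of the model or this repository:

```python
import sqlite3

def run_read_only(sql: str, db_path: str):
    """Run a single generated SELECT on a read-only SQLite connection; reject anything else."""
    stripped = sql.strip().rstrip(";")
    if not stripped.lower().startswith("select") or ";" in stripped:
        raise ValueError("Only single SELECT statements are allowed")
    # Read-only URI mode: even a missed check cannot modify the database.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        conn.execute(f"EXPLAIN {stripped}")  # cheap validity check before running
        return conn.execute(stripped).fetchall()
    finally:
        conn.close()
```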
---
## 💻 Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0"

# Load the fused checkpoint; no separate LoRA adapter is required.
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Include the relevant schema in the prompt; the model does no schema-linking itself.
prompt = """\
### Database schema
table orders(id, customer_id, total, created_at)
table customers(id, name, country)
### Question
List total sales per country ordered by total descending."""

inputs = tok(prompt, return_tensors="pt")
sql_out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(sql_out[0], skip_special_tokens=True))
```
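Because the model was trained with MLX, it can also be run natively on Apple Silicon with `mlx-lm`. A rough sketch, assuming the published weights load with `mlx_lm.load` (a conversion step may be needed):

```python
from mlx_lm import load, generate

# Load the fused checkpoint on Apple Silicon.
model, tokenizer = load("jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0")

prompt = (
    "### Database schema\n"
    "table orders(id, customer_id, total, created_at)\n"
    "table customers(id, name, country)\n"
    "### Question\n"
    "List total sales per country ordered by total descending."
)

print(generate(model, tokenizer, prompt=prompt, max_tokens=128))
```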
---
## 🏋️‍♂️ Training Details
* **Data** – 10 K synthetic NL/SQL pairs auto-generated from a list of open-domain schemas, then manually spot-checked for correctness.
* **Pre-processing** – schema and question paired using the *Text-to-SQL prompt* pattern (see the example record after this list); SQL statements lower-cased; no anonymisation.
* **Hyper-parameters**
  * batch size = 32 (gradient accumulation = 4)
  * learning rate = 2e-4 (cosine schedule)
  * epochs = 3
  * LoRA rank = 16, α = 32
  * fp16 mixed precision

Total training time ≈ 5 minutes on Apple Silicon.
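For illustration, a single training record in that prompt pattern might look like the sketch below; the `"text"` field name and JSONL layout follow a common LoRA data format for `mlx_lm` and are an assumption, not the exact files used.

```python
import json

# One hypothetical record: schema + question + lower-cased SQL in a single "text" field.
record = {
    "text": (
        "### Database schema\n"
        "table orders(id, customer_id, total, created_at)\n"
        "### Question\n"
        "How many orders were placed in 2023?\n"
        "### SQL\n"
        "select count(*) from orders where strftime('%Y', created_at) = '2023';"
    )
}

# Written out as JSONL (e.g. train.jsonl / valid.jsonl) for the LoRA trainer.
with open("train.jsonl", "w") as f:
    f.write(json.dumps(record) + "\n")
```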
---
## 🌱 Environmental Impact
With roughly five minutes of LoRA fine-tuning on a single consumer Apple-Silicon laptop, the energy footprint of this model is negligible compared with training a model of this size from scratch.
---
## 🛠️ Limitations & Biases
* Trained on a synthetic, limited dataset → may under-perform on real production schemas.
* Does **not** perform schema-linking; you must include the relevant schema in the prompt (see the prompt-construction sketch after this list).
* SQL is not guaranteed to be safe; always validate queries before execution.
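A minimal sketch of assembling such a prompt from a SQLite database (the helper name and database path are illustrative):

```python
import sqlite3

def build_prompt(db_path: str, question: str) -> str:
    """Read table definitions from sqlite_master and prepend them to the question."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    conn.close()
    schema = "\n".join(sql for _, sql in rows if sql)
    return f"### Database schema\n{schema}\n### Question\n{question}"

# prompt = build_prompt("shop.db", "List total sales per country ordered by total descending.")
```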
---
## ✍️ Citation
```
@misc{mohanan2024tinyllama_sql_lora,
  title  = {TinyLlama-1.1B-Chat-LoRA-Fused-v1.0},
  author = {Jerome Mohanan},
  note   = {Hugging Face repository: https://huggingface.co/jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0},
  year   = {2024}
}
```
---
## 📫 Contact
Questions or feedback? Ping **@jero2rome** on Hugging Face or email <[email protected]>.