---
license: apache-2.0
language:
- en
base_model:
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
tags:
- lora
- fused
- text-to-sql
- natural-language-to-sql
- mlx
- apple-silicon
- fine-tuning
- instruction-following
model_creator: Jerome Mohanan
datasets:
- spider # used conceptually as inspiration; see Training Details
---
# TinyLlama-1.1B-Chat-LoRA-Fused-v1.0 — Natural-Language-to-SQL
**TinyLlama-1.1B-Chat-LoRA-Fused-v1.0** is a 1.1-billion-parameter model derived from *TinyLlama/TinyLlama-1.1B-Chat-v1.0*.
Using parameter-efficient **LoRA** fine-tuning on the Apple-Silicon-native **MLX** framework, the model has been specialised to convert plain-English questions into syntactically correct SQL queries for relational databases.
After training, the LoRA adapters were **merged ("fused")** into the base weights, so this single checkpoint is all you need for inference.
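Conceptually, fusing folds each low-rank update back into its base weight matrix, so inference needs no separate adapter files. A minimal NumPy sketch of that idea (toy dimensions; not the actual MLX fusion code):

```python
import numpy as np

# Toy shapes: one base weight plus its rank-16 LoRA factors.
d_out, d_in, rank, alpha = 64, 64, 16, 32

W = np.random.randn(d_out, d_in).astype(np.float32)  # frozen base weight
A = np.random.randn(rank, d_in).astype(np.float32)   # LoRA down-projection
B = np.random.randn(d_out, rank).astype(np.float32)  # LoRA up-projection

# Fusing merges the scaled low-rank update into the base weight once.
W_fused = W + (alpha / rank) * (B @ A)

# A forward pass then uses W_fused exactly like the original W.
x = np.random.randn(d_in).astype(np.float32)
y = W_fused @ x
```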
---
## 🗝️ Key Facts
| Property | Value |
|---|---|
| Base model | TinyLlama 1.1B Chat v1.0 |
| Task | Natural-Language → SQL generation |
| Fine-tuning method | Low-Rank Adaptation (LoRA) @ rank = 16 |
| Training framework | MLX 0.8 + PEFT |
| Hardware | MacBook Pro M4 Pro (20-core GPU) |
| Checkpoint size | 2.1 GB (fp16, fused) |
| License | Apache 2.0 |
---
## ✨ Intended Use
* **Interactive data exploration** inside BI notebooks or chatbots.
* **Customer-support analytics** — empower non-SQL users to ask free-form questions.
* **Education & demos** showing how LoRA + MLX enables rapid on-device fine-tuning.
The model was trained on synthetic NL-SQL pairs for demo purposes. **Do not** deploy it in production for mission-critical SQL generation without additional evaluation on your own schema and security review.
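One hedged starting point for such a review is a read-only execution guard. The function below is an illustrative SQLite-specific sketch (names invented here), not part of the model or this repository:

```python
import sqlite3

def run_read_only(sql: str, db_path: str):
    """Run a single generated SELECT on a read-only SQLite connection; reject anything else."""
    stripped = sql.strip().rstrip(";")
    if not stripped.lower().startswith("select") or ";" in stripped:
        raise ValueError("Only single SELECT statements are allowed")
    # Read-only URI mode: even a missed check cannot modify the database.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        conn.execute(f"EXPLAIN {stripped}")  # cheap validity check before running
        return conn.execute(stripped).fetchall()
    finally:
        conn.close()
```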
---
## 💻 Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0"

# Load the fused checkpoint; no separate LoRA adapter is required.
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Include the relevant schema in the prompt; the model does no schema-linking itself.
prompt = """\
### Database schema
table orders(id, customer_id, total, created_at)
table customers(id, name, country)
### Question
List total sales per country ordered by total descending."""

inputs = tok(prompt, return_tensors="pt")
sql_out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(sql_out[0], skip_special_tokens=True))
```
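Because the model was trained with MLX, it can also be run natively on Apple Silicon with `mlx-lm`. A rough sketch, assuming the published weights load with `mlx_lm.load` (a conversion step may be needed):

```python
from mlx_lm import load, generate

# Load the fused checkpoint on Apple Silicon.
model, tokenizer = load("jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0")

prompt = (
    "### Database schema\n"
    "table orders(id, customer_id, total, created_at)\n"
    "table customers(id, name, country)\n"
    "### Question\n"
    "List total sales per country ordered by total descending."
)

print(generate(model, tokenizer, prompt=prompt, max_tokens=128))
```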
---
## 🏋️‍♂️ Training Details
* **Data** – 10 K synthetic NL/SQL pairs auto-generated from a list of open-domain schemas, then manually spot-checked for correctness.
* **Pre-processing** – schema and question paired using the *Text-to-SQL prompt* pattern (see the example record after this list); SQL statements lower-cased; no anonymisation.
* **Hyper-parameters**
  * batch size = 32 (gradient accumulation = 4)
  * learning rate = 2e-4 (cosine schedule)
  * epochs = 3
  * LoRA rank = 16, α = 32
  * fp16 mixed precision

Total training time ≈ 5 minutes on Apple Silicon.
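For illustration, a single training record in that prompt pattern might look like the sketch below; the `"text"` field name and JSONL layout follow a common LoRA data format for `mlx_lm` and are an assumption, not the exact files used.

```python
import json

# One hypothetical record: schema + question + lower-cased SQL in a single "text" field.
record = {
    "text": (
        "### Database schema\n"
        "table orders(id, customer_id, total, created_at)\n"
        "### Question\n"
        "How many orders were placed in 2023?\n"
        "### SQL\n"
        "select count(*) from orders where strftime('%Y', created_at) = '2023';"
    )
}

# Written out as JSONL (e.g. train.jsonl / valid.jsonl) for the LoRA trainer.
with open("train.jsonl", "w") as f:
    f.write(json.dumps(record) + "\n")
```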
---
## 🌱 Environmental Impact
With roughly five minutes of LoRA fine-tuning on a single consumer Apple-Silicon laptop, the energy footprint of this model is negligible compared with training a model of this size from scratch.
---
## 🛠️ Limitations & Biases
* Trained on a synthetic, limited dataset → may under-perform on real production schemas.
* Does **not** perform schema-linking; you must include the relevant schema in the prompt (see the prompt-construction sketch after this list).
* SQL is not guaranteed to be safe; always validate queries before execution.
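A minimal sketch of assembling such a prompt from a SQLite database (the helper name and database path are illustrative):

```python
import sqlite3

def build_prompt(db_path: str, question: str) -> str:
    """Read table definitions from sqlite_master and prepend them to the question."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    conn.close()
    schema = "\n".join(sql for _, sql in rows if sql)
    return f"### Database schema\n{schema}\n### Question\n{question}"

# prompt = build_prompt("shop.db", "List total sales per country ordered by total descending.")
```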
---
## ✍️ Citation
```
@misc{mohanan2024tinyllama_sql_lora,
  title  = {TinyLlama-1.1B-Chat-LoRA-Fused-v1.0},
  author = {Jerome Mohanan},
  note   = {Hugging Face repository: https://huggingface.co/jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0},
  year   = {2024}
}
```
---
## 📫 Contact
Questions or feedback? Ping **@jero2rome** on Hugging Face or email <[email protected]>.