---
license: apache-2.0
language:
- en
base_model:
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
tags:
- lora
- fused
- text-to-sql
- natural-language-to-sql
- mlx
- apple-silicon
- fine-tuning
- instruction-following
model_creator: Jerome Mohanan
datasets:
- spider # used conceptually as inspiration; see Training Details
---

# TinyLlama-1.1B-Chat-LoRA-Fused-v1.0 — Natural-Language-to-SQL

**TinyLlama-1.1B-Chat-LoRA-Fused-v1.0** is a 1.1-billion-parameter model derived from *TinyLlama/TinyLlama-1.1B-Chat-v1.0*. Using parameter-efficient **LoRA** fine-tuning and the Apple-Silicon-native **MLX** framework, the model has been specialised to convert plain-English questions into syntactically correct SQL queries for relational databases.

After training, the LoRA adapters were **merged ("fused")** into the base weights, so you only need this single checkpoint for inference (a toy illustration of fusing appears at the end of this card).

---

## 🗝️ Key Facts

| Property | Value |
|---|---|
| Base model | TinyLlama 1.1B Chat v1.0 |
| Task | Natural-Language → SQL generation |
| Fine-tuning method | Low-Rank Adaptation (LoRA) @ rank = 16 |
| Training framework | MLX 0.8 + PEFT |
| Hardware | MacBook Pro M4 Pro (20-core GPU) |
| Checkpoint size | 2.1 GB (fp16, fused) |
| License | Apache 2.0 |

---

## ✨ Intended Use

* **Interactive data exploration** inside BI notebooks or chatbots.
* **Customer-support analytics** — empower non-SQL users to ask free-form questions.
* **Education & demos** showing how LoRA + MLX enable rapid on-device fine-tuning.

The model was trained on synthetic NL-SQL pairs for demo purposes. **Do not** deploy it in production for mission-critical SQL generation without additional evaluation on your own schema and a security review.

---

## 💻 Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The model does not perform schema-linking, so always include the
# relevant schema in the prompt (see Limitations below).
prompt = """\
### Database schema
table orders(id, customer_id, total, created_at)
table customers(id, name, country)

### Question
List total sales per country ordered by total descending."""

inputs = tok(prompt, return_tensors="pt")
sql_out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(sql_out[0], skip_special_tokens=True))
```

---

## 🏋️‍♂️ Training Details

* **Data** – 10k synthetic NL/SQL pairs auto-generated from an open-domain schema list, then manually spot-checked for correctness.
* **Pre-processing** – schema and question paired using the *Text-to-SQL prompt* pattern (a reconstruction is sketched at the end of this card); SQL statements lower-cased; no anonymisation.
* **Hyper-parameters**
  * batch size = 32 (gradient accumulation = 4)
  * learning rate = 2e-4 (cosine schedule)
  * epochs = 3
  * LoRA rank = 16, α = 32
  * fp16 mixed precision

Total training time ≈ 5 minutes on Apple Silicon.

---

## 🌱 Environmental Impact

With roughly five minutes of training on a consumer laptop GPU, the energy footprint of this LoRA fine-tune is negligible compared to full fine-tuning on datacenter hardware.

---

## 🛠️ Limitations & Biases

* Trained on a synthetic, limited dataset → may under-perform on real production schemas.
* Does **not** perform schema-linking; you must include the relevant schema in the prompt.
* Generated SQL is not guaranteed to be safe; always validate queries before execution (see the validation sketch at the end of this card).

---

## ✍️ Citation

```
@misc{mohanan2024tinyllama_sql_lora,
  title  = {TinyLlama-1.1B-Chat-LoRA-Fused-v1.0},
  author = {Jerome Mohanan},
  note   = {Hugging Face repository: https://huggingface.co/jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0},
  year   = {2024}
}
```

---

## 📫 Contact

Questions or feedback? Ping **@jero2rome** on Hugging Face.
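
---

## 🍎 Running with MLX (sketch)

Because the model was trained with MLX, you may prefer native Apple-Silicon inference over the `transformers` path shown in Quick Start. Below is a minimal sketch using the `mlx-lm` package; it assumes `mlx-lm` is installed (`pip install mlx-lm`) and can load or convert this Hugging Face checkpoint on the fly. The exact `load`/`generate` signatures vary across `mlx-lm` versions, so treat this as illustrative rather than authoritative.

```python
# Hedged mlx-lm usage sketch -- API details differ between versions.
from mlx_lm import load, generate

model, tokenizer = load("jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0")

prompt = """\
### Database schema
table orders(id, customer_id, total, created_at)
table customers(id, name, country)

### Question
List total sales per country ordered by total descending."""

# generate() returns the decoded completion as a string.
print(generate(model, tokenizer, prompt=prompt, max_tokens=128))
```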
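
---

## 🧩 Prompt Pattern (sketch)

Training Details describe pairing schema and question with the *Text-to-SQL prompt* pattern and lower-casing the SQL targets. The sketch below reconstructs what that serialisation might look like; the function names and schema format are illustrative assumptions, not the actual training code.

```python
# Illustrative reconstruction of the Text-to-SQL prompt pattern described
# in Training Details. Names here are assumptions, not the exact pipeline.

def build_prompt(schema_lines: list[str], question: str) -> str:
    """Serialise a schema plus a question into the prompt format the model saw."""
    schema_block = "\n".join(schema_lines)
    return (
        "### Database schema\n"
        f"{schema_block}\n\n"
        "### Question\n"
        f"{question}"
    )

def build_target(sql: str) -> str:
    """SQL targets were lower-cased during pre-processing."""
    return sql.strip().lower()

example = build_prompt(
    ["table orders(id, customer_id, total, created_at)",
     "table customers(id, name, country)"],
    "List total sales per country ordered by total descending.",
)
print(example)
print(build_target(
    "SELECT c.country, SUM(o.total) AS total FROM orders o "
    "JOIN customers c ON o.customer_id = c.id "
    "GROUP BY c.country ORDER BY total DESC;"
))
```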
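
---

## 🔗 What "Fused" Means (sketch)

To make the "fused" in the model name concrete: LoRA trains two small low-rank matrices per adapted weight, and fusing materialises their product into the base weight once, so inference needs no separate adapter file. The toy NumPy sketch below uses the rank = 16, α = 32 settings from Training Details; the matrix shapes are illustrative, not the model's actual dimensions.

```python
# Toy illustration of LoRA fusing: the low-rank update B @ A is scaled by
# alpha / rank and added permanently into the base weight. Shapes are
# illustrative, not taken from the model.
import numpy as np

d_out, d_in, rank, alpha = 2048, 2048, 16, 32

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in)).astype(np.float16)  # frozen base weight
A = rng.standard_normal((rank, d_in)).astype(np.float16)   # trained LoRA A
B = np.zeros((d_out, rank), dtype=np.float16)              # trained LoRA B (zero-init)

# During training the effective weight is W + (alpha / rank) * B @ A.
# Fusing simply materialises that sum once:
W_fused = W + (alpha / rank) * (B @ A)

# With B still at its zero init, fusing is a no-op -- as at training step 0.
assert np.allclose(W, W_fused)
```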
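
---

## 🛡️ Validating Generated SQL (sketch)

As noted in Limitations, generated SQL is not guaranteed to be safe and must be validated before execution. One lightweight approach, sketched below with only the standard library, is to syntax-check the output against an empty in-memory SQLite copy of the schema and reject anything that is not a single read-only SELECT. This is an illustrative guardrail, not a production-grade defence, and the schema and helper names are assumptions.

```python
# Minimal guardrail sketch (standard library only): compile the model's
# output against an empty in-memory schema copy and reject anything that
# is not a single SELECT statement.
import sqlite3

SCHEMA = """
CREATE TABLE orders(id, customer_id, total, created_at);
CREATE TABLE customers(id, name, country);
"""

def is_safe_select(sql: str) -> bool:
    stmt = sql.strip().rstrip(";")
    if ";" in stmt or not stmt.lower().startswith("select"):
        return False  # multiple statements or a non-SELECT verb
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        conn.execute(f"EXPLAIN {stmt}")  # parses/compiles without running the query
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

print(is_safe_select(
    "select country, sum(total) from orders "
    "join customers on orders.customer_id = customers.id "
    "group by country order by 2 desc"
))  # True
print(is_safe_select("drop table orders"))  # False
```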