Enhance model card: Add GitHub link and usage example

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +44 -6
README.md CHANGED
@@ -1,20 +1,20 @@
1
  ---
2
- pipeline_tag: text-generation
3
  library_name: transformers
4
  license: cc-by-nc-4.0
 
5
  tags:
6
  - text-to-sql
7
  - reinforcement-learning
8
  ---
9
 
10
-
11
  # SLM-SQL: An Exploration of Small Language Models for Text-to-SQL
12
 
13
  ### Important Links
14
 
15
  📖[Arxiv Paper](https://arxiv.org/abs/2507.22478) |
16
- 🤗[HuggingFace](https://huggingface.co/collections/cycloneboy/slm-sql-688b02f99f958d7a417658dc) |
17
- 🤖[ModelScope](https://modelscope.cn/collections/SLM-SQL-624bb6a60e9643) |
 
18
 
19
  ## News
20
 
@@ -49,7 +49,7 @@ tags:
49
 
50
  <img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_bird_main.png" height="500" alt="slmsql_bird_main">
51
 
52
- <img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_spider_main.png" height="500" alt="slmsql_spider_main">
53
 
54
  Performance Comparison of different Text-to-SQL methods on BIRD dev and test dataset.
55
 
@@ -65,7 +65,7 @@ Performance Comparison of different Text-to-SQL methods on BIRD dev and test dat
65
  | SLM-SQL-Base-1.5B | Qwen2.5-Coder-1.5B-Instruct | SFT | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-Base-1.5B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-Base-1.5B) |
66
  | SLM-SQL-1.5B | Qwen2.5-Coder-1.5B-Instruct | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-1.5B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-1.5B) |
67
  | CscSQL-Merge-Qwen2.5-Coder-1.5B-Instruct | Qwen2.5-Coder-1.5B-Instruct | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-1.5B-Instruct) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-1.5B-Instruct) |
68
- | SLM-SQL-Base-0.6B | Qwen3-0.6B | SFT | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-Base-0.6B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-Base-0.6B) |
69
  | SLM-SQL-0.6B | Qwen3-0.6B | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-0.6B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-0.6B) |
70
  | SLM-SQL-Base-1.3B | deepseek-coder-1.3b-instruct | SFT | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-Base-1.3B ) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-Base-1.3B ) |
71
  | SLM-SQL-1.3B | deepseek-coder-1.3b-instruct | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-1.3B ) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-1.3B ) |
@@ -79,6 +79,44 @@ Performance Comparison of different Text-to-SQL methods on BIRD dev and test dat
79
  | SynsQL-Merge-Think-310k | [🤖 Modelscope](https://modelscope.cn/datasets/cycloneboy/SynsQL-Merge-Think-310k) | [🤗 HuggingFace](https://huggingface.co/datasets/cycloneboy/SynsQL-Merge-Think-310k) |
80
  | bird train and dev dataset | [🤖 Modelscope](https://modelscope.cn/datasets/cycloneboy/bird_train) | [🤗 HuggingFace](https://huggingface.co/datasets/cycloneboy/bird_train) |
81
 
82
  ## TODO
83
 
84
  - [ ] Release inference code
 
1
  ---
 
2
  library_name: transformers
3
  license: cc-by-nc-4.0
4
+ pipeline_tag: text-generation
5
  tags:
6
  - text-to-sql
7
  - reinforcement-learning
8
  ---
9
 
 
10
  # SLM-SQL: An Exploration of Small Language Models for Text-to-SQL
11
 
12
  ### Important Links
13
 
14
  📖[Arxiv Paper](https://arxiv.org/abs/2507.22478) |
15
+ 💻[GitHub Repository](https://github.com/CycloneBoy/slm_sql) |
16
+ 🤗[HuggingFace Collection](https://huggingface.co/collections/cycloneboy/slm-sql-688b02f99f958d7a417658dc) |
17
+ 🤖[ModelScope Collection](https://modelscope.cn/collections/SLM-SQL-624bb6a60e9643) |
18
 
19
  ## News
20
 
 
49
 
50
  <img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_bird_main.png" height="500" alt="slmsql_bird_main">
51
 
52
+ <img src="https://raw.githubusercontent.com/CycloneBoy/slm_sql/main/data/image/slmsql_spider_main.png" height="500" alt="slm_sql_spider_main">
53
 
54
  Performance Comparison of different Text-to-SQL methods on BIRD dev and test dataset.
55
 
 
65
  | SLM-SQL-Base-1.5B | Qwen2.5-Coder-1.5B-Instruct | SFT | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-Base-1.5B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-Base-1.5B) |
66
  | SLM-SQL-1.5B | Qwen2.5-Coder-1.5B-Instruct | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-1.5B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-1.5B) |
67
  | CscSQL-Merge-Qwen2.5-Coder-1.5B-Instruct | Qwen2.5-Coder-1.5B-Instruct | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-1.5B-Instruct) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/CscSQL-Merge-Qwen2.5-Coder-1.5B-Instruct) |
68
+ | SLM-SQL-Base-0.6B | Qwen3-0.6B | SFT | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-Base-0.6B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-Base-0.6B) |
69
  | SLM-SQL-0.6B | Qwen3-0.6B | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-0.6B) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-0.6B) |
70
  | SLM-SQL-Base-1.3B | deepseek-coder-1.3b-instruct | SFT | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-Base-1.3B ) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-Base-1.3B ) |
71
  | SLM-SQL-1.3B | deepseek-coder-1.3b-instruct | SFT + GRPO | [🤖 Modelscope](https://modelscope.cn/models/cycloneboy/SLM-SQL-1.3B ) | [🤗 HuggingFace](https://huggingface.co/cycloneboy/SLM-SQL-1.3B ) |
 
79
  | SynsQL-Merge-Think-310k | [🤖 Modelscope](https://modelscope.cn/datasets/cycloneboy/SynsQL-Merge-Think-310k) | [🤗 HuggingFace](https://huggingface.co/datasets/cycloneboy/SynsQL-Merge-Think-310k) |
80
  | bird train and dev dataset | [🤖 Modelscope](https://modelscope.cn/datasets/cycloneboy/bird_train) | [🤗 HuggingFace](https://huggingface.co/datasets/cycloneboy/bird_train) |
81
 
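+ The datasets above can also be pulled programmatically. A minimal sketch with the `datasets` library (the repository ID comes from the table; the default configuration and the `train` split are assumptions, so check the dataset card for the actual names):
+
+ ```python
+ from datasets import load_dataset
+
+ # Hypothetical usage; the available configs and splits are listed on the dataset card.
+ bird = load_dataset("cycloneboy/bird_train", split="train")
+ print(bird[0])
+ ```
+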
82
+ ## Usage
83
+
84
+ You can easily load the model and use it for text-to-SQL generation with the Hugging Face `transformers` library. Here is an example:
85
+
86
+ ```python
87
+ from transformers import AutoModelForCausalLM, AutoTokenizer
88
+ import torch
89
+
90
+ model_name = "cycloneboy/SLM-SQL-0.5B"
91
+ # Or choose another model from the table above, e.g., "cycloneboy/SLM-SQL-1.5B"
92
+
93
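+ # Load the tokenizer and model; device_map="auto" places the weights on available devices (requires the accelerate package)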
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
94
+ model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype=torch.bfloat16)
95
+
96
+ # Example prompt for text-to-SQL
97
+ # In practice, include your database schema (DDL) in the prompt so the model can ground the query;
98
+ # a schema-augmented variant is sketched after this example. The model may expect a specific
99
+ # prompt format from its training; refer to the original GitHub repository for details.
100
+ prompt = "Please give me the names of all employees who work in the 'Sales' department."
101
+
102
+ # Apply chat template if available, or just format the prompt directly
103
+ if hasattr(tokenizer, 'apply_chat_template') and tokenizer.chat_template is not None:
104
+     messages = [{"role": "user", "content": prompt}]
105
+     input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
106
+ else:
107
+     input_text = prompt
108
+
109
+ inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
110
+
111
+ # Generate SQL query
112
+ outputs = model.generate(**inputs, max_new_tokens=128, temperature=0.7, do_sample=True, top_p=0.95)
113
+
114
+ # Decode only the newly generated tokens (slice off the prompt) and print the SQL
115
+ generated_sql = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
116
+ print(f"Generated SQL:
117
+ {generated_sql}")
118
+ ```
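+
+ The prompt above is just a plain natural-language question. As a rough sketch of a schema-augmented prompt (not the official template; the exact format used during training is documented in the GitHub repository, and the `employees` table plus the "Database schema:"/"Question:" labels below are made up for illustration):
+
+ ```python
+ # Hypothetical prompt layout; adapt the DDL and wording to your database and to the repo's template.
+ schema = """CREATE TABLE employees (
+     id INTEGER PRIMARY KEY,
+     name TEXT,
+     department TEXT
+ );"""
+ question = "Please give me the names of all employees who work in the 'Sales' department."
+ prompt = f"Database schema:\n{schema}\n\nQuestion: {question}\nReturn a single SQL query."
+ ```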
119
+
120
  ## TODO
121
 
122
  - [ ] Release inference code