prokaggler7 committed
Commit 87d3f5e · verified · 1 Parent(s): 17d6012

Upload README.md with huggingface_hub

Files changed (1): README.md (+59 −0)
README.md ADDED
---
language: en
license: mit
tags:
- text-generation
- gpt2
- causal-lm
- shakespeare
- small-model
---

# 🧠 SLM-GPT2: Tiny Shakespeare GPT-2 Model

`SLM-GPT2` is a small GPT-2-style language model trained from scratch on the [Tiny Shakespeare dataset](https://huggingface.co/datasets/tiny_shakespeare). It is a toy model meant for educational purposes, experimentation, and understanding how transformer-based language models work.

---

## ✨ Model Details

- **Architecture**: GPT-2 (custom config)
- **Layers**: 4
- **Hidden size**: 256
- **Attention heads**: 4
- **Max sequence length**: 128
- **Vocabulary size**: same as the tokenizer (based on `distilgpt2` or custom)
- **Training epochs**: 3
- **Dataset**: [tiny_shakespeare](https://huggingface.co/datasets/tiny_shakespeare)

---
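
The hyperparameters above map directly onto a `transformers.GPT2Config`. The snippet below is a minimal sketch of instantiating an untrained model of this size; it assumes the standard 50,257-token GPT-2/`distilgpt2` vocabulary, since the card leaves the exact tokenizer open:

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Custom GPT-2 config mirroring the model card:
# 4 layers, hidden size 256, 4 attention heads, 128-token context.
# vocab_size=50257 is an assumption (the GPT-2/distilgpt2 vocabulary).
config = GPT2Config(
    vocab_size=50257,
    n_positions=128,
    n_embd=256,
    n_layer=4,
    n_head=4,
)

# Fresh (untrained) model with randomly initialized weights
model = GPT2LMHeadModel(config)
print(f"Parameters: {model.num_parameters():,}")
```

At this scale the model has on the order of tens of millions of parameters, most of them in the embedding table, which is why it trains in minutes even on modest hardware.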

## 🧪 Intended Use

- Educational demos
- Debugging/training pipeline validation
- Low-resource inference tests
- Not suitable for production or accurate text generation

---

## 🚫 Limitations

- Trained on a tiny dataset (~100 KB)
- Limited vocabulary and generalization
- Can generate incoherent or biased outputs
- Not safe for deployment in real-world applications

---

## 💻 How to Use

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Replace "your-username/slm-gpt2" with the actual repository id
model = AutoModelForCausalLM.from_pretrained("your-username/slm-gpt2")
tokenizer = AutoTokenizer.from_pretrained("your-username/slm-gpt2")

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
output = generator("To be or not to be", max_length=50)
print(output[0]["generated_text"])
```