michaelgathara committed
Commit 6def9c1 · verified · 1 Parent(s): 6bc1e8d

Update README.md

Files changed (1)
  1. README.md +58 -37
README.md CHANGED
@@ -1,50 +1,71 @@
  ---
- language: en
- license: mit
- tags:
- - pytorch
- - causal-lm
- - language-model
- - flash-attention
- ---
+ language: en
+ license: mit
+ tags:
+ - pytorch
+ - causal-lm
+ - language-model
+ - flash-attention
+ datasets:
+ - Salesforce/wikitext
+ pipeline_tag: question-answering
+ ---
+
+ # PurelyUnfunctionalAI/GibberishGPT
+
+ A lightweight decoder-only transformer language model trained with Flash Attention on the WikiText dataset.
 
- # PurelyUnfunctionalAI/GibberishGPT
+ ## Model Details
 
- This is a language model trained with Flash Attention. The model is based on a decoder-only transformer architecture.
+ - **Model Type:** Causal Language Model
+ - **Architecture:** Decoder-only Transformer
+ - **Embedding Size:** 512
+ - **Hidden Layers:** 8
+ - **Attention Heads:** 8
+ - **Context Length:** 512
+ - **Flash Attention:** Enabled
+ - **Training Data:** Salesforce/wikitext
 
- ## Model Details
+ ## Usage
 
- - **Model Type:** Causal Language Model
- - **Embedding Size:** 512
- - **Hidden Layers:** 8
- - **Attention Heads:** 8
- - **Context Length:** 512
- - **Flash Attention:** Enabled
+ ```python
+ import torch
+ import tiktoken
+ from transformers import AutoModelForCausalLM
 
- ## Usage
+ # Load the tokenizer
+ tokenizer = tiktoken.get_encoding("gpt2")
 
- ```python
- import tiktoken
- from transformers import AutoModelForCausalLM
+ # Load the model
+ model = AutoModelForCausalLM.from_pretrained("PurelyUnfunctionalAI/GibberishGPT")
 
- # Load the tokenizer
- tokenizer = tiktoken.get_encoding("gpt2")
+ # Encode input
+ input_text = "Your prompt here"
+ input_ids = tokenizer.encode(input_text)
+ input_tensor = torch.tensor([input_ids], dtype=torch.long)
 
- # Load the model
- model = AutoModelForCausalLM.from_pretrained("PurelyUnfunctionalAI/GibberishGPT")
+ # Generate
+ output = model.generate(input_tensor, max_length=100)
+ generated_text = tokenizer.decode(output[0].tolist())
+ print(generated_text)
+ ```
 
- # Encode input
- input_text = "Your prompt here"
- input_ids = tokenizer.encode(input_text)
- input_tensor = torch.tensor([input_ids], dtype=torch.long)
+ # Limitations
 
- # Generate
- output = model.generate(input_tensor, max_length=100)
- generated_text = tokenizer.decode(output[0].tolist())
- print(generated_text)
- ```
+ - The model has a context length of 512 tokens
+ - It was trained on WikiText data which may not cover specialized domains
+ - As a lightweight model, it may not perform as well as larger LLMs on complex tasks
 
- ## License
+ # Citation
+ If you use this model in your research, please cite:
 
- This model is available under the MIT License.
-
+ ```
+ @misc{GibberishGPT,
+ author = {Gathara, Michael},
+ title = {GibberishGPT: A Lightweight Language Model with Flash Attention},
+ year = {2025},
+ publisher = {GitHub},
+ journal = {GitHub repository},
+ howpublished = {\url{https://huggingface.co/PurelyUnfunctionalAI/GibberishGPT}}
+ }
+ ```
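
The updated README lists a 512-token context length under Limitations, and its usage snippet calls `generate` with `max_length=100`, which in `transformers` counts prompt tokens plus generated tokens. As a minimal sketch of how a caller might keep a long prompt inside that window, the helper below trims a list of token IDs before it is turned into a tensor. The name `fit_prompt`, the `max_new_tokens` default, and the choice to keep the most recent tokens are assumptions made for illustration; none of them come from the repository.

```python
CONTEXT_LENGTH = 512  # context length stated in the README


def fit_prompt(input_ids, max_new_tokens=64, context_length=CONTEXT_LENGTH):
    """Trim token IDs so the prompt plus new tokens fit in the context window.

    Hypothetical helper: it keeps the most recent tokens and drops the oldest,
    which is one reasonable policy among several.
    """
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    budget = context_length - max_new_tokens  # tokens left for the prompt
    return list(input_ids[-budget:])


# Stand-in for the output of tokenizer.encode(...) on a long prompt.
prompt_ids = list(range(600))
trimmed = fit_prompt(prompt_ids, max_new_tokens=100)
print(len(trimmed))  # 412
```

The trimmed list can then be wrapped as `torch.tensor([trimmed], dtype=torch.long)` as in the README snippet, with `max_length` set to `len(trimmed) + max_new_tokens` so the total stays within 512 tokens.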