---
language: en
license: mit
tags:
  - pytorch
  - causal-lm
  - language-model
  - flash-attention
datasets:
  - Salesforce/wikitext
pipeline_tag: text-generation
---

# PurelyUnfunctionalAI/GibberishGPT

A lightweight decoder-only transformer language model trained with Flash Attention on the WikiText dataset. It was built as an exercise in training LLMs and ML pipelines: the model does not produce coherent text, but it serves as a good starting point for learning about LLMs.

<a href="https://github.com/PUFAI/GibberishGPT"><img alt="GitHub" src="https://img.shields.io/badge/GitHub-Repo-blue?logo=github&style=flat-square"></a>

## Model Details

- **Model Type:** Causal Language Model
- **Architecture:** Decoder-only Transformer
- **Embedding Size:** 512
- **Hidden Layers:** 8
- **Attention Heads:** 8
- **Context Length:** 512
- **Flash Attention:** Enabled
- **Training Data:** Salesforce/wikitext
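
For orientation, these hyperparameters can be collected into a small configuration object. The sketch below is illustrative only: the class and field names are hypothetical and do not reflect the model's actual configuration code.

```python
from dataclasses import dataclass

@dataclass
class GibberishGPTConfig:
    # Hypothetical summary of the hyperparameters listed above
    vocab_size: int = 50257    # GPT-2 BPE vocabulary (tiktoken "gpt2")
    n_embd: int = 512          # embedding size
    n_layer: int = 8           # hidden (decoder) layers
    n_head: int = 8            # attention heads
    block_size: int = 512      # context length
    flash_attention: bool = True

config = GibberishGPTConfig()
assert config.n_embd % config.n_head == 0  # 512 / 8 = 64-dimensional heads
```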

## Usage

```python
import torch
import tiktoken
from transformers import AutoModelForCausalLM

# Load the tokenizer
tokenizer = tiktoken.get_encoding("gpt2")

# Load the model and switch to inference mode (disables dropout)
model = AutoModelForCausalLM.from_pretrained("PurelyUnfunctionalAI/GibberishGPT")
model.eval()

# Encode input with the GPT-2 BPE tokenizer
input_text = "Your prompt here"
input_ids = tokenizer.encode(input_text)
input_tensor = torch.tensor([input_ids], dtype=torch.long)

# Generate without tracking gradients
with torch.no_grad():
    output = model.generate(input_tensor, max_length=100)
generated_text = tokenizer.decode(output[0].tolist())
print(generated_text)
```
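
Greedy decoding (the default above) tends to loop. If the repository's model class follows the standard `transformers` generation API (an assumption, not something this card guarantees), sampling can be enabled with the usual `generate` arguments:

```python
# Sampling instead of greedy decoding; assumes the standard
# transformers GenerationMixin.generate() arguments are supported.
with torch.no_grad():
    output = model.generate(
        input_tensor,
        max_length=100,
        do_sample=True,    # sample from the distribution
        temperature=0.8,   # soften the logits
        top_k=50,          # restrict each step to the 50 most likely tokens
    )
print(tokenizer.decode(output[0].tolist()))
```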

## Limitations

- The model has a context length of 512 tokens, so longer prompts must be truncated (see the sketch below)
- It was trained on WikiText data, which may not cover specialized domains
- As a lightweight model, it does not perform as well as larger LLMs on complex tasks
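
Because of the 512-token window, the prompt plus the generated tokens must fit in the context length. A minimal truncation sketch, assuming the same GPT-2 tiktoken encoding used in the Usage section; `CONTEXT_LENGTH` is taken from the Model Details above:

```python
import tiktoken

CONTEXT_LENGTH = 512  # context length from Model Details

def truncate_prompt(text: str, max_new_tokens: int = 100) -> list[int]:
    """Keep only the most recent prompt tokens so prompt + generation fits the window."""
    enc = tiktoken.get_encoding("gpt2")
    ids = enc.encode(text)
    budget = max(CONTEXT_LENGTH - max_new_tokens, 1)
    return ids[-budget:]  # drop the oldest tokens if the prompt is too long
```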

## Citation

If you use this model in your research, please cite:

```
@misc{GibberishGPT,
  author = {Gathara, Michael and Menon, Vaishak and Liu, Jason},
  title = {GibberishGPT: A Lightweight Language Model with Flash Attention},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face model repository},
  howpublished = {\url{https://huggingface.co/PurelyUnfunctionalAI/GibberishGPT}}
}
```