---
language: en
license: mit
tags:
- pytorch
- causal-lm
- language-model
- flash-attention
datasets:
- Salesforce/wikitext
pipeline_tag: text-generation
---
# PurelyUnfunctionalAI/GibberishGPT
A lightweight decoder-only transformer language model trained with Flash Attention on the WikiText dataset. It was built as an exercise in training LLMs and building ML pipelines. The model does not produce coherent text, but it serves as a good starting point for learning more about LLMs.
<a href="https://github.com/PUFAI/GibberishGPT"> <img alt="GitHub" src="https://img.shields.io/badge/GitHub-Repo-blue?logo=github&style=flat-square"> </a>
## Model Details
- **Model Type:** Causal Language Model
- **Architecture:** Decoder-only Transformer
- **Embedding Size:** 512
- **Hidden Layers:** 8
- **Attention Heads:** 8
- **Context Length:** 512
- **Flash Attention:** Enabled
- **Training Data:** Salesforce/wikitext
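For orientation, the hyperparameters above describe a standard decoder stack. The sketch below is illustrative only: the class names, feed-forward width, and pre-norm layout are assumptions, not the repository's exact code. It uses PyTorch's `F.scaled_dot_product_attention`, which dispatches to a Flash Attention kernel on supported hardware:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hyperparameters from the list above; the module layout itself is an
# assumption for illustration, not the repository's exact code.
EMBED, HEADS, LAYERS = 512, 8, 8

class CausalSelfAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.qkv = nn.Linear(EMBED, 3 * EMBED)
        self.proj = nn.Linear(EMBED, EMBED)

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Split into heads: (B, T, C) -> (B, HEADS, T, C // HEADS)
        q, k, v = (t.view(B, T, HEADS, C // HEADS).transpose(1, 2) for t in (q, k, v))
        # scaled_dot_product_attention dispatches to a Flash Attention kernel
        # where available; is_causal=True applies the autoregressive mask
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.proj(y.transpose(1, 2).reshape(B, T, C))

class DecoderBlock(nn.Module):
    def __init__(self):
        super().__init__()
        self.ln1, self.ln2 = nn.LayerNorm(EMBED), nn.LayerNorm(EMBED)
        self.attn = CausalSelfAttention()
        self.mlp = nn.Sequential(
            nn.Linear(EMBED, 4 * EMBED), nn.GELU(), nn.Linear(4 * EMBED, EMBED)
        )

    def forward(self, x):
        x = x + self.attn(self.ln1(x))    # attention with residual connection
        return x + self.mlp(self.ln2(x))  # feed-forward with residual connection

# Stack of 8 decoder blocks, matching "Hidden Layers" above
blocks = nn.Sequential(*(DecoderBlock() for _ in range(LAYERS)))
```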
## Usage
```python
import torch
import tiktoken
from transformers import AutoModelForCausalLM

# Load the GPT-2 BPE tokenizer (the model uses tiktoken, not a Hugging Face tokenizer)
tokenizer = tiktoken.get_encoding("gpt2")

# Load the model
model = AutoModelForCausalLM.from_pretrained("PurelyUnfunctionalAI/GibberishGPT")

# Encode the input prompt as a batch of one sequence
input_text = "Your prompt here"
input_ids = tokenizer.encode(input_text)
input_tensor = torch.tensor([input_ids], dtype=torch.long)

# Generate up to 100 tokens and decode them back to text
output = model.generate(input_tensor, max_length=100)
generated_text = tokenizer.decode(output[0].tolist())
print(generated_text)
```
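Because the tokenizer is plain tiktoken rather than a Hugging Face tokenizer, nothing truncates long prompts automatically. Given the 512-token context length above, a simple guard is to keep only the most recent tokens (a minimal sketch, assuming the context length listed in Model Details):
```python
MAX_CONTEXT = 512  # "Context Length" from Model Details above

# Keep only the most recent tokens so the prompt fits in the context window
input_ids = tokenizer.encode(input_text)[-MAX_CONTEXT:]
```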
## Limitations
- The model has a context length of 512 tokens
- It was trained on WikiText data which may not cover specialized domains
- As a lightweight model, it may not perform as well as larger LLMs on complex tasks
## Citation
If you use this model in your research, please cite:
```
@misc{GibberishGPT,
  author = {Gathara, Michael and Menon, Vaishak and Liu, Jason},
  title = {GibberishGPT: A Lightweight Language Model with Flash Attention},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face model repository},
  howpublished = {\url{https://huggingface.co/PurelyUnfunctionalAI/GibberishGPT}}
}
```