PurelyUnfunctionalAI/GibberishGPT


Update README.md

Commit 6def9c1 (verified) · 1 parent: 6bc1e8d
michaelgathara committed on Mar 27

Files changed (1): README.md (+58 -37)

@@ -1,50 +1,71 @@
 ---
-        language: en
-        license: mit
-        tags:
-        - pytorch
-        - causal-lm
-        - language-model
-        - flash-attention
-        ---
-        # PurelyUnfunctionalAI/GibberishGPT
-        This is a language model trained with Flash Attention. The model is based on a decoder-only transformer architecture.
-        ## Model Details
-        - **Model Type:** Causal Language Model
-        - **Embedding Size:** 512
-        - **Hidden Layers:** 8
-        - **Attention Heads:** 8
-        - **Context Length:** 512
-        - **Flash Attention:** Enabled
-        ## Usage
-        ```python
-        import tiktoken
-        from transformers import AutoModelForCausalLM
-        # Load the tokenizer
-        tokenizer = tiktoken.get_encoding("gpt2")
-        # Load the model
-        model = AutoModelForCausalLM.from_pretrained("PurelyUnfunctionalAI/GibberishGPT")
-        # Encode input
-        input_text = "Your prompt here"
-        input_ids = tokenizer.encode(input_text)
-        input_tensor = torch.tensor([input_ids], dtype=torch.long)
-        # Generate
-        output = model.generate(input_tensor, max_length=100)
-        generated_text = tokenizer.decode(output[0].tolist())
-        print(generated_text)
-        ```
-        ## License
-        This model is available under the MIT License.

 ---
+language: en
+license: mit
+tags:
+  - pytorch
+  - causal-lm
+  - language-model
+  - flash-attention
+datasets:
+  - Salesforce/wikitext
+pipeline_tag: question-answering
+---
+# PurelyUnfunctionalAI/GibberishGPT
+A lightweight decoder-only transformer language model trained with Flash Attention on the WikiText dataset.
+## Model Details
+- **Model Type:** Causal Language Model
+- **Architecture:** Decoder-only Transformer
+- **Embedding Size:** 512
+- **Hidden Layers:** 8
+- **Attention Heads:** 8
+- **Context Length:** 512
+- **Flash Attention:** Enabled
+- **Training Data:** Salesforce/wikitext
+## Usage
+```python
+import torch
+import tiktoken
+from transformers import AutoModelForCausalLM
+# Load the tokenizer
+tokenizer = tiktoken.get_encoding("gpt2")
+# Load the model
+model = AutoModelForCausalLM.from_pretrained("PurelyUnfunctionalAI/GibberishGPT")
+# Encode input
+input_text = "Your prompt here"
+input_ids = tokenizer.encode(input_text)
+input_tensor = torch.tensor([input_ids], dtype=torch.long)
+# Generate
+output = model.generate(input_tensor, max_length=100)
+generated_text = tokenizer.decode(output[0].tolist())
+print(generated_text)
+```
+# Limitations
+- The model has a context length of 512 tokens
+- It was trained on WikiText data which may not cover specialized domains
+- As a lightweight model, it may not perform as well as larger LLMs on complex tasks
+# Citation
+If you use this model in your research, please cite:
+```
+@misc{GibberishGPT,
+  author = {Gathara, Michael},
+  title = {GibberishGPT: A Lightweight Language Model with Flash Attention},
+  year = {2025},
+  publisher = {GitHub},
+  journal = {GitHub repository},
+  howpublished = {\url{https://huggingface.co/PurelyUnfunctionalAI/GibberishGPT}}
+}
+```
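
The new Limitations section states a 512-token context length, but the usage snippet passes the encoded prompt to `model.generate` without checking its length. A minimal sketch of one way to keep a prompt inside that window follows. The helper `fit_prompt` and its `max_new_tokens` parameter are hypothetical names introduced here for illustration; they are not part of the model, the README, or any library. It assumes that dropping the oldest tokens is acceptable for the use case.

```python
# Hypothetical helper, not part of GibberishGPT or transformers.
CONTEXT_LENGTH = 512  # taken from the "Context Length" field in the model card


def fit_prompt(input_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Return the most recent tokens so that prompt + generation fits the window."""
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    # Tokens left for the prompt once room for generation is reserved.
    budget = context_length - max_new_tokens
    # Slicing from the end keeps the text closest to the generation point.
    return list(input_ids[-budget:])
```

In the README's usage example this could be applied to `input_ids` before building `input_tensor`, for example `input_ids = fit_prompt(input_ids, max_new_tokens=100)`. Whether left truncation suits a given prompt is a judgment call; prompts that open with instructions may need a different strategy.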