|
---
license: apache-2.0
datasets:
- Salesforce/wikitext
language:
- en
base_model:
- openai-community/gpt2
pipeline_tag: text-generation
library_name: transformers
tags:
- text-generation-inference
---
|
|
|
# **Gpt2-Wikitext-9180** |
|
|
|
> **Gpt2-Wikitext-9180**, fine-tuned from GPT-2, is a Transformer-based language model trained on a large English corpus (WikiText) using self-supervised learning. It was trained on raw, unlabeled text: inputs and labels were derived automatically from the text itself, with the model learning to predict the next word in a sentence. No manual annotation was involved, which allowed the model to leverage a large amount of publicly available data.
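
The training objective is standard causal language modeling: the labels are simply the input token ids, which the model shifts internally so that each position is trained to predict the following token. Below is a minimal sketch of that objective using this checkpoint; the example sentence is arbitrary.

```py
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model_name = "prithivMLmods/Gpt2-Wikitext-9180"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

# Labels are the input ids themselves; GPT2LMHeadModel shifts them internally
# so each position is scored on predicting the next token.
text = "The quick brown fox jumps over the lazy dog."
input_ids = tokenizer(text, return_tensors="pt").input_ids

with torch.no_grad():
    outputs = model(input_ids, labels=input_ids)

print(outputs.loss)  # cross-entropy of next-token prediction
```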
|
|
|
## Demo Inference |
|
|
|
```bash
pip install transformers torch
```
|
|
|
```py
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the fine-tuned GPT-2 model and tokenizer
model_name = "prithivMLmods/Gpt2-Wikitext-9180"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

# Set the model to evaluation mode
model.eval()
```
|
|
|
```py
def generate_text(prompt, max_length=100, temperature=0.8, top_k=50):
    # Encode the prompt and sample a continuation
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    output = model.generate(
        input_ids,
        max_length=max_length,
        temperature=temperature,
        top_k=top_k,
        pad_token_id=tokenizer.eos_token_id,
        do_sample=True
    )
    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
    return generated_text
```
|
|
|
```py
# Example prompt
prompt = "Once upon a time"
generated_text = generate_text(prompt, max_length=68)

# Print the generated text
print(generated_text)
```
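
For quick experiments, the same checkpoint can also be driven through the high-level `pipeline` API. The sampling settings below simply mirror the helper above and are suggestions rather than tuned values.

```py
from transformers import pipeline

generator = pipeline("text-generation", model="prithivMLmods/Gpt2-Wikitext-9180")

result = generator(
    "Once upon a time",
    max_length=68,
    do_sample=True,
    temperature=0.8,
    top_k=50,
)
print(result[0]["generated_text"])
```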
|
|
|
|
|
--- |
|
|
|
### **Intended Use Case** |
|
|
|
* **Text Generation**: Auto-completion, story generation, or dialogue simulation. |
|
* **Language Modeling**: Understanding language structure and context for downstream NLP tasks. |
|
* **Educational and Research Use**: Exploring fine-tuning techniques, language understanding, or benchmarking language models (see the fine-tuning sketch after this list).
|
* **Prototyping**: Quick deployment of language-based features in applications and interfaces. |
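
As a sketch of the fine-tuning use case referenced above, the snippet below continues training on WikiText-2 with the standard `Trainer` API. The dataset config, hyperparameters, and output directory are illustrative placeholders, not the recipe used to produce this checkpoint; it additionally requires `pip install datasets`.

```py
from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    GPT2Tokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "prithivMLmods/Gpt2-Wikitext-9180"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained(model_name)

# WikiText-2 as an example corpus; substitute any text dataset
raw = load_dataset("Salesforce/wikitext", "wikitext-2-raw-v1")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])
tokenized = tokenized.filter(lambda ex: len(ex["input_ids"]) > 0)  # drop empty lines

# mlm=False -> causal (next-token) language modeling labels
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="gpt2-wikitext-finetuned",  # placeholder output path
    per_device_train_batch_size=8,
    num_train_epochs=1,
    learning_rate=5e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```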
|
|
|
--- |
|
|
|
### **Limitations** |
|
|
|
* **Factual Inaccuracy**: May generate plausible-sounding but incorrect or outdated information. |
|
* **Bias and Toxicity**: Can reflect biases present in training data (e.g., stereotypes, offensive language). |
|
* **Context Length**: Limited context window (1,024 tokens) inherited from the GPT-2 architecture; see the truncation sketch after this list.
|
* **Not Real-Time Aware**: Lacks access to current events or updates beyond its training data. |
|
* **Lack of Understanding**: Generates text based on patterns, not genuine comprehension or reasoning. |
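
As a small illustration of the context-length limitation, prompts can be truncated at the model's configured window before generation. The oversized prompt below is only a placeholder.

```py
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model_name = "prithivMLmods/Gpt2-Wikitext-9180"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

max_ctx = model.config.n_positions  # 1024 tokens for the GPT-2 architecture
print(f"Maximum context length: {max_ctx} tokens")

# Placeholder text that exceeds the window; truncation keeps the first max_ctx tokens
long_prompt = "Once upon a time " * 500
inputs = tokenizer(long_prompt, return_tensors="pt", truncation=True, max_length=max_ctx)
print(inputs.input_ids.shape)  # sequence length is capped at max_ctx
```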