---
license: mit
base_model: JackFram/llama-68m
tags:
  - tiny-model
  - random-weights
  - testing
  - llama
---

Llama-3.3-Tiny-Instruct

This is a tiny, randomly initialized version of the JackFram/llama-68m model, created for testing and experimentation.

Model Details

  • Base model: JackFram/llama-68m
  • Seed: 42
  • Hidden size: 768
  • Number of layers: 2
  • Number of attention heads: 12
  • Vocabulary size: 32000
  • Max position embeddings: 2048
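
These details can be read back from the uploaded checkpoint's configuration. A minimal sketch, assuming the repo id used in the Usage section below:

from transformers import AutoConfig

# Inspect the architecture fields listed above
config = AutoConfig.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Classifier")
print(config.hidden_size, config.num_hidden_layers, config.num_attention_heads)
print(config.vocab_size, config.max_position_embeddings)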

Parameters

  • Total parameters: ~43,454,976
  • Trainable parameters: ~43,454,976
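
These counts can be checked by summing the parameter tensors of a loaded model. A minimal sketch, assuming the repo id from the Usage section and a causal-LM head:

from transformers import AutoModelForCausalLM

# Load the checkpoint and count total / trainable parameters
model = AutoModelForCausalLM.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Classifier")
total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Total: {total:,}  Trainable: {trainable:,}")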

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Classifier")
tokenizer = AutoTokenizer.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Classifier")

# Generate text (note: this model has random weights!)
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Important Notes

⚠️ This model has random weights and is not trained! It's designed for:

  • Testing model loading and inference pipelines
  • Benchmarking model architecture
  • Educational purposes
  • Rapid prototyping where actual model performance isn't needed

The model will generate random/nonsensical text since it hasn't been trained on any data.
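
For pipeline testing, a single forward pass is usually enough to confirm that shapes line up. A minimal sketch, with model and tokenizer loaded as in the Usage section above:

import torch

# Smoke test: one forward pass, then check that the logits cover the full vocabulary
inputs = tokenizer("pipeline smoke test", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.shape)  # (batch, sequence_length, 32000) for a causal LM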

Creation

This model was created using the upload_tiny_llama33.py script from the minimal-grpo-trainer repository.
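
The script itself is not reproduced here. As a rough sketch, a tiny random checkpoint like this can be produced by instantiating a small config with a fixed seed and pushing the untrained model; the intermediate size, the exact seeding strategy, and the destination repo id below are assumptions, not details taken from the actual script:

import torch
from transformers import AutoTokenizer, LlamaConfig, LlamaForCausalLM

torch.manual_seed(42)  # seed listed on this card; exact seeding strategy is an assumption

# Tiny config mirroring the Model Details section (intermediate_size is assumed)
config = LlamaConfig(
    hidden_size=768,
    num_hidden_layers=2,
    num_attention_heads=12,
    vocab_size=32000,
    max_position_embeddings=2048,
    intermediate_size=3072,
)
model = LlamaForCausalLM(config)  # random weights, no training

# Reuse the base model's tokenizer and push both to a placeholder repo id
tokenizer = AutoTokenizer.from_pretrained("JackFram/llama-68m")
model.push_to_hub("your-org/your-tiny-model")
tokenizer.push_to_hub("your-org/your-tiny-model")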