---
license: apache-2.0
tags:
- text-generation
- language-model
- causal-lm
- cosmicfish
- 90m
- transformer
- rope
- gqa
- swiglu
- rmsnorm
language: en
datasets:
- CosmicSet-2.0-mini
- akkiisfrommars/TreeCorpusCleanedmodel
model_type: CosmicFish
pipeline_tag: text-generation
---

# CosmicFish-90M

A 90M-parameter language model with modern architecture improvements, developed by Mistyoz AI.

## Quick Start

**The easiest way to chat with CosmicFish is our chat.py script:**

```bash
# Download the chat script from this repository
wget https://huggingface.co/MistyozAI/CosmicFish-90M/resolve/main/chat.py

# Install dependencies
pip install transformers huggingface-hub termcolor safetensors

# Run the chat interface (automatically downloads the model)
python chat.py
```

The `chat.py` script handles model loading and generation, and provides the best chat experience, with live streaming, repetition penalty, and conversation commands.

## Model Details

- **Parameters**: 91.6M
- **Architecture**: CosmicFish (RoPE, GQA, SwiGLU, RMSNorm)
- **Context Length**: 512 tokens
- **Vocabulary**: 50,257 tokens
- **Training Data**: CosmicSet 2.0 mini
- **Developer**: Mistyoz AI
- **Repository**: MistyozAI/CosmicFish-90M
- **Format**: Safetensors

## Usage

### Installation

```bash
pip install transformers huggingface-hub termcolor safetensors torch
```

### Downloading the Model

```python
from transformers import GPT2Tokenizer
from huggingface_hub import snapshot_download
from safetensors.torch import load_file
import torch
import json
import os

# Download model files from the Hugging Face Hub
cache_dir = snapshot_download(repo_id="MistyozAI/CosmicFish-90M")

# Load the tokenizer (CosmicFish uses the GPT-2 vocabulary)
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# Load config
with open(os.path.join(cache_dir, "config.json")) as f:
    config_dict = json.load(f)

# Load model weights from safetensors
state_dict = load_file(os.path.join(cache_dir, "model.safetensors"))

# Note: the full model class is available in the repository
print("Model downloaded and ready for use!")
```

### Advanced Generation with Repetition Penalty

```python
def generate_with_repetition_penalty(model, tokenizer, prompt, max_tokens=100,
                                     temperature=0.5, penalty=1.2):
    input_ids = torch.tensor(tokenizer.encode(prompt)).unsqueeze(0)
    generated = input_ids.clone()

    for _ in range(max_tokens):
        with torch.no_grad():
            # Crop the context to the model's 512-token limit
            logits, _ = model(generated[:, -512:])

        next_token_logits = logits[:, -1, :] / temperature

        # Apply repetition penalty: dampen logits of already-generated tokens
        if penalty > 1.0:
            for token_id in set(generated[0].tolist()):
                if next_token_logits[0, token_id] > 0:
                    next_token_logits[0, token_id] /= penalty
                else:
                    next_token_logits[0, token_id] *= penalty

        probs = torch.nn.functional.softmax(next_token_logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)

        if next_token.item() == tokenizer.eos_token_id:
            break

        generated = torch.cat([generated, next_token], dim=1)

    return tokenizer.decode(generated[0], skip_special_tokens=True)
```

### Loading Model with Safetensors

```python
from safetensors.torch import load_file
from modeling_cosmicfish import CosmicFish, CosmicConfig
import json
import os

def load_cosmicfish_model(model_path):
    # Load config
    with open(os.path.join(model_path, "config.json")) as f:
        config_dict = json.load(f)

    # Create model config
    config = CosmicConfig(
        vocab_size=config_dict["vocab_size"],
        block_size=config_dict["block_size"],
        n_layer=config_dict["n_layer"],
        n_head=config_dict["n_head"],
        n_embd=config_dict["n_embd"],
        bias=config_dict["bias"],
        dropout=0.0,
        use_rotary=config_dict["use_rotary"],
        use_swiglu=config_dict["use_swiglu"],
        use_gqa=config_dict["use_gqa"],
        n_query_groups=config_dict["n_query_groups"]
    )

    # Create model
    model = CosmicFish(config)

    # Load weights from safetensors (secure format)
    state_dict = load_file(os.path.join(model_path, "model.safetensors"))

    # Handle weight tying (lm_head.weight shares storage with transformer.wte.weight)
    if 'lm_head.weight' not in state_dict and 'transformer.wte.weight' in state_dict:
        state_dict['lm_head.weight'] = state_dict['transformer.wte.weight']

    model.load_state_dict(state_dict)
    model.eval()

    return model
```
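Putting the pieces together, here is a minimal end-to-end sketch. It assumes `modeling_cosmicfish.py` from this repository is importable, and it reuses `cache_dir` and `tokenizer` from the download snippet above; the prompt is just an illustrative example.

```python
# Minimal end-to-end sketch: load the model, then sample from it.
# Assumes cache_dir and tokenizer from the download snippet, plus
# load_cosmicfish_model and generate_with_repetition_penalty as defined above.
model = load_cosmicfish_model(cache_dir)

output = generate_with_repetition_penalty(
    model, tokenizer,
    "The universe is",       # example prompt
    max_tokens=50, temperature=0.7, penalty=1.2
)
print(output)
```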
### Chat Interface

```python
def chat_with_model():
    # Assumes model and tokenizer are already loaded (see above)
    conversation = []

    while True:
        user_input = input("You: ")
        if user_input.lower() in ['quit', 'exit']:
            break

        # Build the prompt from the conversation history
        context = "Below is a conversation between a human and an AI assistant.\n\n"
        for human, ai in conversation:
            context += f"Human: {human}\nAssistant: {ai}\n\n"
        context += f"Human: {user_input}\nAssistant:"

        # Generate response with repetition penalty
        response = generate_with_repetition_penalty(
            model, tokenizer, context,
            max_tokens=150, temperature=0.7, penalty=1.2
        )

        # Extract just the assistant's latest response
        response = response.split("Assistant:")[-1].split('\n')[0].strip()

        print(f"CosmicFish: {response}")
        conversation.append((user_input, response))

chat_with_model()
```

## Architecture

CosmicFish uses several modern improvements over the standard transformer:

- **RoPE (Rotary Position Embeddings)**: Encodes relative positions by rotating query/key vectors, rather than adding absolute position embeddings
- **GQA (Grouped-Query Attention)**: Reduces attention memory usage by sharing key/value heads across 4 query groups
- **SwiGLU**: A gated activation that typically outperforms ReLU/GELU in feed-forward blocks
- **RMSNorm**: Simpler, more stable normalization than LayerNorm (no mean centering or bias)

## Training

- **Dataset**: CosmicSet 2.0 mini
- **Sequence Length**: 512 tokens
- **Training Steps**: ~200K iterations
- **Hardware**: 1x NVIDIA A40

## Performance

- **Speed**: Varies by hardware (not benchmarked)
- **Memory**: ~256MB RAM
- **File Size**: 185MB
- **Loading**: Fast and secure with safetensors

## Limitations

- Small model size (90M parameters) may produce less accurate responses
- 512-token context limit
- English only
- Training data cutoff applies
- May generate incorrect information
- Cannot browse the internet or access real-time data

## License

Apache 2.0 - see the LICENSE file.

## Credit

If you use CosmicFish-90M, please credit Mistyoz AI.