---
license: apache-2.0
tags:
- text-generation
- language-model
- causal-lm
- cosmicfish
- 90m
- transformer
- rope
- gqa
- swiglu
- rmsnorm
language: en
datasets:
- CosmicSet-2.0-mini
- akkiisfrommars/TreeCorpusCleanedmodel
model_type: CosmicFish
pipeline_tag: text-generation
---

# CosmicFish-90M

A 90M-parameter language model with modern architectural improvements, developed by Mistyoz AI.

## Quick Start

**The easiest way to chat with CosmicFish is to use our `chat.py` script:**

```bash
# Download the chat script from this repository
wget https://huggingface.co/MistyozAI/CosmicFish-90M/resolve/main/chat.py

# Install dependencies
pip install transformers huggingface-hub termcolor safetensors

# Run the chat interface (automatically downloads model)
python chat.py
```

The `chat.py` script handles model loading and generation, and provides the best chat experience, with live streaming, a repetition penalty, and conversation commands.

## Model Details

- **Parameters**: 91.6M
- **Architecture**: CosmicFish (RoPE, GQA, SwiGLU, RMSNorm)
- **Context Length**: 512 tokens
- **Vocabulary**: 50,257 tokens
- **Training Data**: CosmicSet 2.0 mini
- **Developer**: Mistyoz AI
- **Repository**: MistyozAI/CosmicFish-90M
- **Format**: Safetensors

## Usage

### Installation

```bash
pip install transformers huggingface-hub termcolor safetensors torch
```

### Downloading the Model

```python
from transformers import GPT2Tokenizer
from huggingface_hub import snapshot_download
from safetensors.torch import load_file
import torch
import json
import os

# Download model from Hugging Face Hub
cache_dir = snapshot_download(repo_id="MistyozAI/CosmicFish-90M")

# Load tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# Load config
with open(os.path.join(cache_dir, "config.json")) as f:
    config_dict = json.load(f)

# Load model weights from safetensors
state_dict = load_file(os.path.join(cache_dir, "model.safetensors"))

# Note: Full model class available in the repository
print("Model downloaded and ready for use!")
```
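This snippet only fetches the weights. To actually run inference, instantiate the model class from `modeling_cosmicfish.py` in this repository, as shown in "Loading Model with Safetensors" below.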

### Advanced Generation with Repetition Penalty

```python
def generate_with_repetition_penalty(model, tokenizer, prompt, max_tokens=100, temperature=0.5, penalty=1.2):
    input_ids = torch.tensor(tokenizer.encode(prompt)).unsqueeze(0)
    generated = input_ids.clone()
    
    for _ in range(max_tokens):
        # Crop to the model's 512-token context window before each forward pass
        context = generated[:, -512:]
        with torch.no_grad():
            logits, _ = model(context)
        
        next_token_logits = logits[:, -1, :] / temperature
        
        # Apply repetition penalty: make tokens that already appeared less likely
        if penalty > 1.0:
            for token_id in set(generated[0].tolist()):
                if next_token_logits[0, token_id] > 0:
                    next_token_logits[0, token_id] /= penalty
                else:
                    next_token_logits[0, token_id] *= penalty
        
        # Sample the next token from the temperature-scaled distribution
        probs = torch.nn.functional.softmax(next_token_logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)
        
        if next_token.item() == tokenizer.eos_token_id:
            break
            
        generated = torch.cat([generated, next_token], dim=1)
    
    return tokenizer.decode(generated[0], skip_special_tokens=True)
```
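A quick usage example, assuming `model` and `tokenizer` have already been loaded (see the loading section below):

```python
prompt = "Human: What is the capital of France?\nAssistant:"
text = generate_with_repetition_penalty(model, tokenizer, prompt, max_tokens=50, temperature=0.7, penalty=1.2)
print(text)
```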

### Loading Model with Safetensors

```python
from safetensors.torch import load_file
from modeling_cosmicfish import CosmicFish, CosmicConfig
import json
import os

def load_cosmicfish_model(model_path):
    # Load config
    with open(os.path.join(model_path, "config.json")) as f:
        config_dict = json.load(f)
    
    # Create model config
    config = CosmicConfig(
        vocab_size=config_dict["vocab_size"],
        block_size=config_dict["block_size"], 
        n_layer=config_dict["n_layer"],
        n_head=config_dict["n_head"],
        n_embd=config_dict["n_embd"],
        bias=config_dict["bias"],
        dropout=0.0,
        use_rotary=config_dict["use_rotary"],
        use_swiglu=config_dict["use_swiglu"],
        use_gqa=config_dict["use_gqa"],
        n_query_groups=config_dict["n_query_groups"]
    )
    
    # Create model
    model = CosmicFish(config)
    
    # Load weights from safetensors (secure format)
    state_dict = load_file(os.path.join(model_path, "model.safetensors"))
    
    # Handle weight sharing (lm_head.weight shares with transformer.wte.weight)
    if 'lm_head.weight' not in state_dict and 'transformer.wte.weight' in state_dict:
        state_dict['lm_head.weight'] = state_dict['transformer.wte.weight']
    
    model.load_state_dict(state_dict)
    model.eval()
    
    return model
```
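Combined with the download step from earlier, loading the model looks like this:

```python
from huggingface_hub import snapshot_download

# Download (or reuse the cached copy of) the repository, then load the model
cache_dir = snapshot_download(repo_id="MistyozAI/CosmicFish-90M")
model = load_cosmicfish_model(cache_dir)
```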

### Chat Interface

```python
def chat_with_model():
    conversation = []
    
    while True:
        user_input = input("You: ")
        if user_input.lower() in ['quit', 'exit']:
            break
        
        context = "Below is a conversation between a human and an AI assistant.\n\n"
        for human, ai in conversation:
            context += f"Human: {human}\nAssistant: {ai}\n\n"
        context += f"Human: {user_input}\nAssistant:"
        
        # Generate response with repetition penalty
        response = generate_with_repetition_penalty(
            model, tokenizer, context, 
            max_tokens=150, temperature=0.7, penalty=1.2
        )
        
        # Extract just the assistant's response
        response = response.split("Assistant:")[-1].split('\n')[0].strip()
        print(f"CosmicFish: {response}")
        
        conversation.append((user_input, response))

chat_with_model()
```

## Architecture

CosmicFish uses several modern improvements over standard transformers, two of which are sketched in code after this list:

- **RoPE (Rotary Position Embeddings)**: Better position encoding than absolute positions
- **GQA (Grouped-Query Attention)**: Reduces memory usage with 4 query groups 
- **SwiGLU**: More effective activation function than ReLU/GELU
- **RMSNorm**: Simpler, more stable normalization than LayerNorm
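
As a rough illustration, here is a minimal PyTorch sketch of RMSNorm and a SwiGLU feed-forward block as they are typically defined. The layer names (`w_gate`, `w_up`, `w_down`) are illustrative; the actual CosmicFish implementation lives in `modeling_cosmicfish.py` and may differ in details.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Root-mean-square normalization: rescale by the RMS of the features.
    Unlike LayerNorm there is no mean subtraction and no bias term."""
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        inv_rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * inv_rms)

class SwiGLU(nn.Module):
    """SwiGLU feed-forward block: a SiLU (swish)-gated linear unit."""
    def __init__(self, dim, hidden_dim):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x):
        # Gate the up-projection with SiLU, then project back down
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))
```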

## Training

- **Dataset**: CosmicSet 2.0 mini
- **Sequence Length**: 512 tokens
- **Training Steps**: ~200K iterations
- **Hardware**: 1× NVIDIA A40

## Performance

- **Speed**: Varies by hardware (not benchmarked)
- **Memory**: ~256MB RAM
- **File Size**: 185MB
- **Loading**: Fast and secure with safetensors
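
As a sanity check on these figures: 91.6M parameters × 2 bytes ≈ 183 MB, which matches the 185 MB file size, so the checkpoint presumably stores weights in 16-bit precision.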

## Limitations

- Small model size (90M parameters) can lead to less accurate responses
- 512 token context limit
- English only
- Training data cutoff applies
- May generate incorrect information
- Cannot browse internet or access real-time data

## License

Apache 2.0 - see LICENSE file.

## Credit

If you use CosmicFish-90M, please credit Mistyoz AI.