---
license: apache-2.0
tags:
- text-generation
- language-model
- causal-lm
- cosmicfish
- 90m
- transformer
- rope
- gqa
- swiglu
- rmsnorm
language: en
datasets:
- CosmicSet-2.0-mini
- akkiisfrommars/TreeCorpusCleanedmodel
model_type: CosmicFish
pipeline_tag: text-generation
---

# CosmicFish-90M

A 90M-parameter language model with modern architecture improvements, developed by Mistyoz AI.

## Quick Start

**The easiest way to chat with CosmicFish is our chat.py script:**

```bash
# Download the chat script from this repository
wget https://huggingface.co/MistyozAI/CosmicFish-90M/resolve/main/chat.py

# Install dependencies
pip install transformers huggingface-hub termcolor safetensors

# Run the chat interface (automatically downloads the model)
python chat.py
```

The `chat.py` script handles model loading and generation, and provides the best chat experience, with live streaming, repetition penalty, and conversation commands.

## Model Details

- **Parameters**: 91.6M
- **Architecture**: CosmicFish (RoPE, GQA, SwiGLU, RMSNorm)
- **Context Length**: 512 tokens
- **Vocabulary**: 50,257 tokens
- **Training Data**: CosmicSet 2.0 mini
- **Developer**: Mistyoz AI
- **Repository**: MistyozAI/CosmicFish-90M
- **Format**: Safetensors

## Usage

### Installation

```bash
pip install transformers huggingface-hub termcolor safetensors torch
```

### Downloading the Model

```python
from transformers import GPT2Tokenizer
from huggingface_hub import snapshot_download
from safetensors.torch import load_file
import torch
import json
import os

# Download model files from the Hugging Face Hub
cache_dir = snapshot_download(repo_id="MistyozAI/CosmicFish-90M")

# Load the tokenizer (CosmicFish uses the GPT-2 vocabulary)
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# Load config
with open(os.path.join(cache_dir, "config.json")) as f:
    config_dict = json.load(f)

# Load model weights from safetensors
state_dict = load_file(os.path.join(cache_dir, "model.safetensors"))

# Note: the full model class is available in the repository
print("Model downloaded and ready for use!")
```

### Advanced Generation with Repetition Penalty

```python
def generate_with_repetition_penalty(model, tokenizer, prompt, max_tokens=100,
                                     temperature=0.5, penalty=1.2):
    input_ids = torch.tensor(tokenizer.encode(prompt)).unsqueeze(0)
    generated = input_ids.clone()

    for _ in range(max_tokens):
        with torch.no_grad():
            # Crop the context to the model's 512-token limit
            logits, _ = model(generated[:, -512:])

        next_token_logits = logits[:, -1, :] / temperature

        # Apply repetition penalty: dampen logits of already-generated tokens
        if penalty > 1.0:
            for token_id in set(generated[0].tolist()):
                if next_token_logits[0, token_id] > 0:
                    next_token_logits[0, token_id] /= penalty
                else:
                    next_token_logits[0, token_id] *= penalty

        probs = torch.nn.functional.softmax(next_token_logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)

        if next_token.item() == tokenizer.eos_token_id:
            break

        generated = torch.cat([generated, next_token], dim=1)

    return tokenizer.decode(generated[0], skip_special_tokens=True)
```

### Loading Model with Safetensors

```python
from safetensors.torch import load_file
from modeling_cosmicfish import CosmicFish, CosmicConfig
import json
import os

def load_cosmicfish_model(model_path):
    # Load config
    with open(os.path.join(model_path, "config.json")) as f:
        config_dict = json.load(f)

    # Create model config
    config = CosmicConfig(
        vocab_size=config_dict["vocab_size"],
        block_size=config_dict["block_size"],
        n_layer=config_dict["n_layer"],
        n_head=config_dict["n_head"],
        n_embd=config_dict["n_embd"],
        bias=config_dict["bias"],
        dropout=0.0,
        use_rotary=config_dict["use_rotary"],
        use_swiglu=config_dict["use_swiglu"],
        use_gqa=config_dict["use_gqa"],
        n_query_groups=config_dict["n_query_groups"]
    )

    # Create model
    model = CosmicFish(config)

    # Load weights from safetensors (secure format)
    state_dict = load_file(os.path.join(model_path, "model.safetensors"))

    # Handle weight tying (lm_head.weight shares storage with transformer.wte.weight)
    if 'lm_head.weight' not in state_dict and 'transformer.wte.weight' in state_dict:
        state_dict['lm_head.weight'] = state_dict['transformer.wte.weight']

    model.load_state_dict(state_dict)
    model.eval()

    return model
```
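Putting the pieces together, here is a minimal end-to-end sketch. It assumes `modeling_cosmicfish.py` from this repository is importable, and it reuses `cache_dir` and `tokenizer` from the download snippet above; the prompt is just an illustrative example.

```python
# Minimal end-to-end sketch: load the model, then sample from it.
# Assumes cache_dir and tokenizer from the download snippet, plus
# load_cosmicfish_model and generate_with_repetition_penalty as defined above.
model = load_cosmicfish_model(cache_dir)

output = generate_with_repetition_penalty(
    model, tokenizer,
    "The universe is",       # example prompt
    max_tokens=50, temperature=0.7, penalty=1.2
)
print(output)
```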
### Chat Interface

```python
def chat_with_model():
    # Assumes model and tokenizer are already loaded (see above)
    conversation = []

    while True:
        user_input = input("You: ")
        if user_input.lower() in ['quit', 'exit']:
            break

        # Build the prompt from the conversation history
        context = "Below is a conversation between a human and an AI assistant.\n\n"
        for human, ai in conversation:
            context += f"Human: {human}\nAssistant: {ai}\n\n"
        context += f"Human: {user_input}\nAssistant:"

        # Generate response with repetition penalty
        response = generate_with_repetition_penalty(
            model, tokenizer, context,
            max_tokens=150, temperature=0.7, penalty=1.2
        )

        # Extract just the assistant's latest response
        response = response.split("Assistant:")[-1].split('\n')[0].strip()

        print(f"CosmicFish: {response}")
        conversation.append((user_input, response))

chat_with_model()
```

## Architecture

CosmicFish uses several modern improvements over the standard transformer:

- **RoPE (Rotary Position Embeddings)**: Encodes relative positions by rotating query/key vectors, rather than adding absolute position embeddings
- **GQA (Grouped-Query Attention)**: Reduces attention memory usage by sharing key/value heads across 4 query groups
- **SwiGLU**: A gated activation that typically outperforms ReLU/GELU in feed-forward blocks
- **RMSNorm**: Simpler, more stable normalization than LayerNorm (no mean centering or bias)

## Training

- **Dataset**: CosmicSet 2.0 mini
- **Sequence Length**: 512 tokens
- **Training Steps**: ~200K iterations
- **Hardware**: 1x NVIDIA A40

## Performance

- **Speed**: Varies by hardware (not benchmarked)
- **Memory**: ~256MB RAM
- **File Size**: 185MB
- **Loading**: Fast and secure with safetensors

## Limitations

- Small model size (90M parameters) may produce less accurate responses
- 512-token context limit
- English only
- Training data cutoff applies
- May generate incorrect information
- Cannot browse the internet or access real-time data

## License

Apache 2.0 - see the LICENSE file.

## Credit

If you use CosmicFish-90M, please credit Mistyoz AI.