---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- code
- python
- maincoder
- code-generation
- reinforcement-learning
- mcpo
pipeline_tag: text-generation
base_model: Maincode/Maincoder-1B
---
<img src="https://huggingface.co/datasets/Maincode/assets/resolve/e51154e034201be1a5dad0e9c8de31d8b9f17643/maincoder_logo.png" alt="Maincoder logo" width="1250">

[**Maincoder-1B**](https://maincode.com/maincoder/) is a code-focused language model optimized for code generation and completion tasks. The model achieves strong performance on coding benchmarks while maintaining a compact size suitable for local deployment.

# Key Features

- **Code Generation**: Optimized for Python code completion and generation tasks.
- **Compact Size**: 1 billion parameters, lightweight enough to run on consumer hardware.
- **Deep Architecture**: Modern transformer architecture with RoPE embeddings, grouped-query attention, QK normalization and high depth-to-width ratio.
- **Advanced Data Mixing**: Pre-trained and mid-trained on custom data mixes developed for high-performance coding.
- **MCPO Algorithm**: Fine-tuned with a specialized reinforcement-learning policy optimization algorithm that improves training stability and accelerates convergence.
- **SOTA Performance**: State-of-the-art results on the Python coding benchmarks HumanEval, HumanEval+, and MBPP+.

# Benchmark Results

<img src="https://huggingface.co/datasets/Maincode/assets/resolve/main/performance_h.png" alt="Benchmark Performance Across Baseline LLMs" width="1050">

| Model | HumanEval | HumanEval+ | MBPP+ | MMLU | GSM8K |
|---|---:|---:|---:|---:|---:|
| [Maincode/Maincoder-1B](https://huggingface.co/Maincode/Maincoder-1B) | **0.7622** | **0.7256** | **0.7090** | 0.3054 | 0.2976 |
| [deepseek-ai/deepseek-coder-1.3b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-instruct) | 0.5610 | 0.5305 |  0.6217 | 0.2705 | 0.0413 |
| [HuggingFaceTB/SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B) | 0.5366 | 0.5000 | 0.6799 | **0.5928** | 0.5505 |
| [Qwen/Qwen2.5-Coder-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct) | 0.4634 | 0.4451 | 0.6561 | 0.4984 | 0.4944 |
| [Qwen/Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B) | 0.4024 | 0.3780 | 0.5582 | 0.5571 | **0.6865** |

# Model Overview

Maincoder uses a modern transformer decoder architecture with:

- **Rotary Position Embeddings**: RoPE with a base (theta) of 1,000,000.
- **RMSNorm**: Pre-normalization for stable training.
- **Grouped Query Attention**: 4:1 ratio of query to key-value heads.
- **QK Normalization**: RMSNorm applied to attention queries and keys.
- **SwiGLU MLP**: Gated linear units with SiLU activation.

| Attribute | Value |
|-----------|-------|
| Parameters | 1B |
| Hidden Size | 1536 |
| Layers | 32 |
| Attention Heads | 16 (4 KV heads) |
| Head Dimension | 96 |
| Vocabulary Size | 151,936 |
| Context Length | 2,048 |
| Precision | bfloat16 |
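
As a rough illustration of how these pieces fit together, the sketch below wires up grouped-query attention with QK normalization and a SwiGLU MLP in plain PyTorch, using the dimensions from the table above. This is an illustrative sketch, not Maincoder's actual implementation (which ships with the checkpoint via `trust_remote_code`); RoPE is omitted for brevity, the MLP intermediate size is a placeholder, and `nn.RMSNorm` requires PyTorch 2.4+.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Dimensions from the table above; the MLP intermediate size is a placeholder assumption.
HIDDEN, N_HEADS, N_KV_HEADS, HEAD_DIM = 1536, 16, 4, 96


class GQAWithQKNorm(nn.Module):
    """Grouped-query attention with RMSNorm applied to queries and keys (RoPE omitted)."""

    def __init__(self):
        super().__init__()
        self.q_proj = nn.Linear(HIDDEN, N_HEADS * HEAD_DIM, bias=False)
        self.k_proj = nn.Linear(HIDDEN, N_KV_HEADS * HEAD_DIM, bias=False)
        self.v_proj = nn.Linear(HIDDEN, N_KV_HEADS * HEAD_DIM, bias=False)
        self.o_proj = nn.Linear(N_HEADS * HEAD_DIM, HIDDEN, bias=False)
        self.q_norm = nn.RMSNorm(HEAD_DIM)  # QK normalization
        self.k_norm = nn.RMSNorm(HEAD_DIM)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.q_norm(self.q_proj(x).view(b, t, N_HEADS, HEAD_DIM)).transpose(1, 2)
        k = self.k_norm(self.k_proj(x).view(b, t, N_KV_HEADS, HEAD_DIM)).transpose(1, 2)
        v = self.v_proj(x).view(b, t, N_KV_HEADS, HEAD_DIM).transpose(1, 2)
        # 4:1 query-to-KV ratio: each KV head is shared by 4 query heads.
        k = k.repeat_interleave(N_HEADS // N_KV_HEADS, dim=1)
        v = v.repeat_interleave(N_HEADS // N_KV_HEADS, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))


class SwiGLU(nn.Module):
    """Gated MLP with SiLU activation."""

    def __init__(self, intermediate: int = 4096):
        super().__init__()
        self.gate = nn.Linear(HIDDEN, intermediate, bias=False)
        self.up = nn.Linear(HIDDEN, intermediate, bias=False)
        self.down = nn.Linear(intermediate, HIDDEN, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.gate(x)) * self.up(x))
```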

# Usage

### Installation

```bash
pip install transformers torch
```

### Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Maincode/Maincoder-1B",
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "Maincode/Maincoder-1B",
    trust_remote_code=True,
)

# Code completion example
prompt = '''def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number."""
'''

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.2,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
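
Alternatively, the high-level `pipeline` API wraps the same model and tokenizer in a single object. This is standard `transformers` usage rather than anything Maincoder-specific; the prompt and sampling settings below are arbitrary examples.

```python
from transformers import pipeline

# High-level alternative to the explicit model/tokenizer setup above.
generator = pipeline(
    "text-generation",
    model="Maincode/Maincoder-1B",
    trust_remote_code=True,
    device_map="auto",
)

prompt = "def is_prime(n: int) -> bool:\n"
result = generator(prompt, max_new_tokens=128, do_sample=True, temperature=0.2)
print(result[0]["generated_text"])
```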

### Code Completion

```python
# Function completion
prompt = '''def quicksort(arr: list) -> list:
    """Sort a list using the quicksort algorithm."""
'''

# Class completion
prompt = '''class BinarySearchTree:
    """A binary search tree implementation."""
    
    def __init__(self):
'''

# Algorithm implementation
prompt = '''def dijkstra(graph: dict, start: str, end: str) -> tuple:
    """Find the shortest path using Dijkstra's algorithm.
    
    Args:
        graph: Adjacency list representation of the graph
        start: Starting node
        end: Target node
    
    Returns:
        Tuple of (distance, path)
    """
'''
```
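
Completions can run past the function being written. One way to trim them, reusing the `model` and `tokenizer` from the Quick Start and the last `prompt` above, is a custom stopping criterion that halts when a new top-level definition begins. `StopOnSubstrings` and its stop strings are illustrative choices, not part of the model or library.

```python
from transformers import StoppingCriteria, StoppingCriteriaList


class StopOnSubstrings(StoppingCriteria):
    """Stop generation once any of the given substrings appears in the new text."""

    def __init__(self, tokenizer, prompt_len, stops):
        self.tokenizer = tokenizer
        self.prompt_len = prompt_len
        self.stops = stops

    def __call__(self, input_ids, scores, **kwargs):
        # Decode only the newly generated tokens and look for a stop marker.
        new_text = self.tokenizer.decode(input_ids[0, self.prompt_len:], skip_special_tokens=True)
        return any(stop in new_text for stop in self.stops)


inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
stops = StoppingCriteriaList([
    StopOnSubstrings(tokenizer, inputs["input_ids"].shape[1], ["\ndef ", "\nclass ", "\nif __name__"])
])
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=False,
    stopping_criteria=stops,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```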

# Additional Notes

## Reproducibility

<details>
<summary>Model evaluations were run on 8 AMD MI355X GPUs with EleutherAI's <a href="https://github.com/EleutherAI/lm-evaluation-harness">lm-evaluation-harness</a>.</summary>

```bash
docker run --rm -it \
  --device=/dev/kfd --device=/dev/dri --group-add=video \
  --ipc=host --security-opt seccomp=unconfined \
  -v $(pwd):/workspace -w /workspace \
  -e HF_TOKEN \
  -e PYTHONHASHSEED=0 \
  -e TORCH_DETERMINISTIC=1 \
  -e ROCBLAS_ATOMICS_MODE="0" \
  -e MIOPEN_FIND_MODE="1" \
  -e CUBLAS_WORKSPACE_CONFIG=":4096:8" \
  -e HF_ALLOW_CODE_EVAL="1" \
  rocm/pytorch:rocm7.1.1_ubuntu24.04_py3.12_pytorch_release_2.9.1 \
  bash -c 'pip install "lm_eval[hf]" && \
  accelerate launch -m lm_eval \
  --model hf --model_args "pretrained=Maincode/Maincoder-1B,trust_remote_code=True,dtype=float32" \
  --tasks humaneval,humaneval_plus,mbpp_plus,mmlu,gsm8k \
  --device cuda:0 --batch_size 32 --seed 42 \
  --confirm_run_unsafe_code'
```

</details>

## Limitations

- Context length is limited to 2,048 tokens; longer prompts must be truncated (see the sketch below)
- Primarily optimized for Python; performance may vary on other languages
- May generate code with bugs or security issues; always review generated code
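
A minimal way to handle the 2,048-token context limit, reusing the Quick Start `tokenizer` and `model`, is left-side truncation so the code nearest the completion point is kept; `long_prompt` and the token budget split below are hypothetical.

```python
# Keep the tail of an over-long prompt: the code nearest the cursor matters most.
max_new_tokens = 256
tokenizer.truncation_side = "left"
inputs = tokenizer(
    long_prompt,  # hypothetical over-long prompt string
    return_tensors="pt",
    truncation=True,
    max_length=2048 - max_new_tokens,  # leave room for the completion
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```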

<div style="margin-left:14px; border-left:4px solid #3b82f6; background:rgba(59,130,246,0.08); padding:8px 10px; border-radius:8px; font-size:0.92em; margin:10px 0;">
  <strong>Disclaimer</strong>: This model has <strong>not</strong> undergone any alignment or safety tuning (e.g., RLHF/RLAIF, DPO, or safety fine-tuning). Outputs may be unsafe or biased. Please use appropriate safeguards and evaluate carefully for your use case.
</div>

## License

This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).

## Citation

```bibtex
@misc{maincoder2025,
  title        = {Maincoder-1B: A High-Performance 1B Parameter Coding Model},
  author       = {Maincode Team},
  year         = {2025},
  organization = {Maincode},
  howpublished = {\url{https://huggingface.co/Maincode/Maincoder-1B}}
}
```

## Contact

For questions, issues, or collaboration inquiries, please visit [Maincode](https://maincode.com).