Crow-v1 Collection (2 items)
Tokenizer: Custom BPE tokenizer trained on paired code and docstrings.
Data: Functions and their natural-language descriptions extracted from GitHub repositories.
Masking strategy: Two-phase pretraining.
Pretraining hyperparameters:
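The card states the tokenizer is a custom BPE model trained on paired code and docstrings, but gives no recipe. A minimal sketch of how such a tokenizer could be trained with the Hugging Face `tokenizers` library (the vocabulary size, special tokens, and toy corpus below are illustrative assumptions, not the card's actual settings):

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# BPE model with a byte-level pre-tokenizer, a common choice for code.
tokenizer = Tokenizer(models.BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel()

# Hypothetical settings; the real vocab size and specials are not documented.
trainer = trainers.BpeTrainer(
    vocab_size=1000,
    special_tokens=["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"],
)

# Toy stand-in for the code/docstring pair corpus.
corpus = [
    "def add(a, b):\n    return a + b",
    "Return the sum of two numbers.",
]
tokenizer.train_from_iterator(corpus, trainer)

enc = tokenizer.encode("def add(a, b):")
print(enc.tokens)
```

In practice the iterator would stream function/docstring pairs from the extracted GitHub data rather than an in-memory list.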
Library: transformers
Base model: Shuu12121/CodeModernBERT-Crow-v1-Pre
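The card mentions a masking strategy but does not describe the two phases. As a point of reference only, a standard BERT-style dynamic masking step (an assumption about the general approach, not the card's specific scheme) can be sketched as:

```python
import random

MASK_ID = 4        # hypothetical [MASK] token id
VOCAB_SIZE = 1000  # hypothetical vocabulary size

def mask_tokens(ids, mask_prob=0.15, seed=0):
    """BERT-style masking: select ~15% of positions; of those, 80% become
    [MASK], 10% become a random token, 10% are left unchanged. Unselected
    positions get label -100, the value ignored by the MLM loss."""
    rng = random.Random(seed)
    inputs, labels = list(ids), [-100] * len(ids)
    for i, tok in enumerate(ids):
        if rng.random() < mask_prob:
            labels[i] = tok  # predict the original token here
            r = rng.random()
            if r < 0.8:
                inputs[i] = MASK_ID
            elif r < 0.9:
                inputs[i] = rng.randrange(VOCAB_SIZE)
            # else: keep the original token
    return inputs, labels

inputs, labels = mask_tokens(list(range(100, 200)))
```

A second pretraining phase typically reuses the same masking function with different data or sequence lengths; without details from the card, that part is left unspecified.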