---
datasets:
- togethercomputer/RedPajama-Data-V2
language:
- de
library_name: transformers
license: other
pipeline_tag: feature-extraction
tags:
- masked-lm
- long-context
base_model:
- LSX-UniWue/LLaMmlein_1B
---

# LLäMmlein2Vec 1B

LLäMmlein2Vec 1B is a German encoder language model derived from our German decoder-only model [LLäMmlein 1B](https://huggingface.co/LSX-UniWue/LLaMmlein_1B) via [LLM2Vec](https://github.com/McGill-NLP/llm2vec).
Find more details in our [preprint](https://arxiv.org/abs/2505.13136)!

We provide three transformed models:

* [LLäMmlein2Vec 7B](https://huggingface.co/LSX-UniWue/LLaMmlein2Vec_7B)
* [LLäMmlein2Vec 1B](https://huggingface.co/LSX-UniWue/LLaMmlein2Vec_1B) ← You are here
* [LLäMmlein2Vec 120M](https://huggingface.co/LSX-UniWue/LLaMmlein2Vec_120M)

### Usage

You can use LLäMmlein2Vec with the `llm2vec` library:

```python
import torch
from llm2vec import LLM2Vec

model_id = "LSX-UniWue/LLaMmlein2Vec_1B"

# Load the converted encoder; device_map and torch_dtype are passed through
# to the underlying Hugging Face model.
l2v = LLM2Vec.from_pretrained(
    model_id,
    device_map="cuda" if torch.cuda.is_available() else "cpu",
    torch_dtype=torch.bfloat16,
)
```
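
Once the model is loaded, you can embed German text with the wrapper's `encode` method. Below is a minimal sketch continuing from the snippet above; the example sentences are placeholders, and cosine similarity is just one way to compare the resulting vectors:

```python
import torch.nn.functional as F

# Illustrative German sentences (placeholders)
sentences = [
    "Würzburg liegt am Main.",
    "Die Stadt Würzburg befindet sich in Bayern.",
    "Heute regnet es den ganzen Tag.",
]

# encode() returns one embedding per input sentence
embeddings = l2v.encode(sentences)

# Compare the first sentence against the other two
sims = F.cosine_similarity(embeddings[0:1], embeddings[1:], dim=1)
print(sims)
```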

### License

We release the LLäMmlein2Vec models under a research-only RAIL-M license. See [license.md](./license.md) for details.