Nick Doiron
monsoon-nlp
37 followers · 72 following
https://mapmeld.com/plant-based-llms/
mapmeld
mapmeld.bsky.social
AI & ML interests
biology and multilingual models
Recent Activity
liked a model about 12 hours ago: jhu-clsp/mmBERT-base
upvoted an article about 12 hours ago: mmBERT: ModernBERT goes Multilingual
reacted to tomaarsen's post with ❤️ about 12 hours ago:
ModernBERT goes MULTILINGUAL! One of the most requested models I've seen, The Johns Hopkins University's CLSP has trained state-of-the-art massively multilingual encoders using the ModernBERT architecture: mmBERT.

Model details:
- 2 model sizes:
  - https://huggingface.co/jhu-clsp/mmBERT-small
  - https://huggingface.co/jhu-clsp/mmBERT-base
- Uses the ModernBERT architecture, but with the Gemma2 multilingual tokenizer (so: flash attention, alternating global/local attention, unpadding/sequence packing, etc.)
- Maximum sequence length of 8192 tokens, on the high end for encoders
- Trained on 1833 languages using DCLM, FineWeb2, and many more sources
- 3 training phases: 2.3T tokens pretraining on 60 languages, 600B tokens mid-training on 110 languages, and 100B tokens decay training on all 1833 languages.
- Both models are MIT Licensed, and the full datasets and intermediary checkpoints are also publicly released

Evaluation details:
- Very competitive with ModernBERT at equivalent sizes on English (GLUE, MTEB v2 English after finetuning)
- Consistently outperforms equivalently sized models on all Multilingual tasks (XTREME, classification, MTEB v2 Multilingual after finetuning)
- In short: beats commonly used multilingual base models like mDistilBERT, XLM-R (multilingual RoBERTa), multilingual MiniLM, etc.
- Additionally: the ModernBERT-based mmBERT is much faster than the alternatives due to its architectural benefits. Easily up to 2x throughput in common scenarios.

Check out the full blogpost with more details. It's super dense & gets straight to the point: https://huggingface.co/blog/mmbert

Based on these results, mmBERT should be the new go-to multilingual encoder base models at 300M and below.

Do note that the mmBERT models are "base" models, i.e. they're currently only trained to perform Mask Filling. They'll need to be finetuned for downstream tasks like semantic search, classification, clustering, etc.
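The three training phases quoted in the post can be added up to see how the token budget is split. The sketch below only restates the figures from the post (2.3T, 600B, and 100B tokens); the names `PHASES` and `summarize` are made up for illustration and do not come from the mmBERT release.

```python
# Figures copied from the post: (phase name, tokens in billions, languages covered).
# PHASES and summarize are illustrative names, not part of any mmBERT code.
PHASES = [
    ("pretraining", 2300, 60),
    ("mid-training", 600, 110),
    ("decay", 100, 1833),
]


def summarize(phases):
    """Return the total token count (in billions) and each phase's share of it."""
    total = sum(tokens for _, tokens, _ in phases)
    shares = {name: tokens / total for name, tokens, _ in phases}
    return total, shares


total, shares = summarize(PHASES)
print(f"total: {total / 1000:.1f}T tokens")
for name, tokens, languages in PHASES:
    print(f"{name:>12}: {tokens}B tokens, {languages} languages, {shares[name]:.1%} of total")
```

Read this way, the decay phase is the only one that covers all 1833 languages, and by these figures it accounts for the smallest part of the overall token budget.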
monsoon-nlp's models (44)
monsoon-nlp/dna-blockdiff-2 • Fill-Mask • 0.1B • Updated May 13 • 15
monsoon-nlp/dna-blockdiff-papaya • Fill-Mask • 0.1B • Updated Mar 31 • 8 • 1
monsoon-nlp/dna-blockdiff • Fill-Mask • 0.1B • Updated Mar 24 • 13
monsoon-nlp/gpt-nyc-nontoxic • Text Generation • 0.1B • Updated Mar 10 • 9
monsoon-nlp/dv-muril • Fill-Mask • 0.2B • Updated Mar 10 • 7
monsoon-nlp/dv-labse • Fill-Mask • 0.5B • Updated Mar 10 • 9
monsoon-nlp/byt5-basque • 0.3B • Updated Feb 24 • 10
monsoon-nlp/byt5-dv • 0.3B • Updated Feb 24 • 12
monsoon-nlp/dv-wave • 0.0B • Updated Feb 3 • 4 • 1
monsoon-nlp/byt5-base-dv • 0.7B • Updated Jan 21 • 7
monsoon-nlp/bangla-electra • 0.0B • Updated Dec 11, 2024 • 243 • 4
monsoon-nlp/codellama-abliterated-2xd • Text Generation • 7B • Updated Jul 26, 2024 • 5
monsoon-nlp/codellama-abliterated • Text Generation • 7B • Updated Jul 26, 2024 • 4 • 1
monsoon-nlp/protein-matryoshka-embeddings • Sentence Similarity • 0.4B • Updated Jul 19, 2024 • 12 • 7
monsoon-nlp/llama3-biotoken3pretrain-kaniwa • Updated May 25, 2024 • 5
monsoon-nlp/llama3-biotokenpretrain-kaniwa • Updated May 15, 2024 • 2
monsoon-nlp/llama3-dnapretrain-kaniwa • Updated Apr 26, 2024
monsoon-nlp/tinyllama-mixpretrain-uniprottune • Updated Apr 22, 2024 • 2
monsoon-nlp/nyc-savvy-llama2-7b-lora-adapter • Updated Apr 22, 2024
monsoon-nlp/tinyllama-mixpretrain-quinoa-sciphi • Text Generation • 1B • Updated Apr 22, 2024 • 5
monsoon-nlp/tinyllama-proteinpretrain-quinoa • Text Generation • 1B • Updated Apr 21, 2024 • 6
monsoon-nlp/BioMedGPT-16bit • Text Generation • 7B • Updated Apr 21, 2024 • 8
monsoon-nlp/gpt-nyc-small • Text Generation • 0.1B • Updated Apr 21, 2024 • 11
monsoon-nlp/eyegazer-vit-binary • Updated Nov 4, 2023 • 21 • 1
monsoon-nlp/eyegazer-vit-lora • Updated Nov 2, 2023 • 2
monsoon-nlp/mGPT-quantized • Text Generation • 1B • Updated Sep 20, 2023 • 11 • 1
monsoon-nlp/hindi-bert • Feature Extraction • 0.0B • Updated Sep 20, 2023 • 680 • 19
monsoon-nlp/tamillion • Feature Extraction • 0.1B • Updated Sep 20, 2023 • 23 • 2
monsoon-nlp/nyrkr-joker-llama • Text Generation • Updated Sep 5, 2023 • 8
monsoon-nlp/nyc-savvy-llama2-7b • Text Generation • Updated Sep 4, 2023 • 10 • 1