SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
This is a SetFit model that can be used for multilabel Text Classification. It uses sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 as the Sentence Transformer embedding model, with a OneVsRestClassifier instance as the classification head.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
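As a rough sketch of how these two stages come together in code (not the exact script used to train this model; the dataset and labels below are placeholders), the SetFit Trainer runs both stages in a single train() call:

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Placeholder few-shot dataset: raw text plus one-hot multilabel vectors.
train_dataset = Dataset.from_dict({
    "text": ["first example sentence", "second example sentence"],
    "label": [[1, 0, 0], [0, 1, 1]],
})

# "one-vs-rest" wraps the classification head in a OneVsRestClassifier,
# matching the head used by this model.
model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    multi_target_strategy="one-vs-rest",
)

args = TrainingArguments(batch_size=32, num_epochs=2)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()  # stage 1: contrastive fine-tuning of the body; stage 2: fit the head
```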
Model Details
Model Description
- Model Type: SetFit
- Sentence Transformer body: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
- Classification head: a OneVsRestClassifier instance
- Maximum Sequence Length: 128 tokens
Model Sources
- Repository: SetFit on GitHub
- Paper: Efficient Few-Shot Learning Without Prompts
- Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts
Evaluation
Metrics
Label | Accuracy |
---|---|
all | 0.3919 |
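The accuracy above is aggregated over all labels. One common convention for a multilabel classifier is subset (exact-match) accuracy, where a sample only counts as correct if every label is predicted correctly; whether that exact convention was used here is an assumption. A minimal sketch of that computation on placeholder data:

```python
from sklearn.metrics import accuracy_score

# Placeholder targets and predictions: one 0/1 column per label.
y_true = [[1, 0, 0], [0, 1, 1], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 1, 0]]

# Subset accuracy: a sample is correct only if all of its labels match.
print(accuracy_score(y_true, y_pred))  # 2 of 3 rows match exactly -> ~0.667
```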
Uses
Direct Use for Inference
First install the SetFit library:
pip install setfit
Then you can load this model and run inference.
from setfit import SetFitModel
# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("faodl/20250908_model_g20_multilabel_MiniLM-L12-all-labels")
# Run inference
preds = model("4.3.3 Strategies for Comprehensive Sexuality Education and (CSE) Youth-friendly Health Services 1. To promote volunteerism as a tool for fostering active participation of young people in national development; 5. To promote volunteerism as a tool for fostering active participation of young people in national development; 5.")
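Because the head is a OneVsRestClassifier, `preds` is a row of independent 0/1 decisions, one per label, rather than a single class id. If per-label scores are needed, `SetFitModel.predict_proba` exposes the head's probabilities; a minimal sketch (the input text is a placeholder):

```python
from setfit import SetFitModel

model = SetFitModel.from_pretrained("faodl/20250908_model_g20_multilabel_MiniLM-L12-all-labels")

# Per-label probabilities from the one-vs-rest head; one column per label.
probs = model.predict_proba(["a short policy paragraph to classify"])
print(probs)
```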
Training Details
Training Set Metrics
Training set | Min | Median | Max |
---|---|---|---|
Word count | 2 | 70.5122 | 1194 |
Training Hyperparameters
- batch_size: (32, 32)
- num_epochs: (2, 2)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 10
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
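For reference, these hyperparameters correspond to fields of SetFit's TrainingArguments. The sketch below shows an equivalent configuration (not the exact training script; distance_metric and margin only affect triplet-style losses and are shown at their listed values):

```python
from sentence_transformers.losses import (
    BatchHardTripletLossDistanceFunction,
    CosineSimilarityLoss,
)
from setfit import TrainingArguments

args = TrainingArguments(
    batch_size=(32, 32),                  # (embedding phase, classifier phase)
    num_epochs=(2, 2),
    max_steps=-1,
    sampling_strategy="oversampling",
    num_iterations=10,
    body_learning_rate=(2e-05, 2e-05),
    head_learning_rate=2e-05,
    loss=CosineSimilarityLoss,
    distance_metric=BatchHardTripletLossDistanceFunction.cosine_distance,
    margin=0.25,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)
```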
Training Results
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
0.0005 | 1 | 0.1435 | - |
0.0241 | 50 | 0.1438 | - |
0.0482 | 100 | 0.1239 | - |
0.0723 | 150 | 0.1073 | - |
0.0964 | 200 | 0.0992 | - |
0.1205 | 250 | 0.0883 | - |
0.1446 | 300 | 0.08 | - |
0.1687 | 350 | 0.0801 | - |
0.1928 | 400 | 0.073 | - |
0.2169 | 450 | 0.0647 | - |
0.2410 | 500 | 0.0549 | - |
0.2651 | 550 | 0.0575 | - |
0.2892 | 600 | 0.0544 | - |
0.3133 | 650 | 0.0523 | - |
0.3373 | 700 | 0.0506 | - |
0.3614 | 750 | 0.0467 | - |
0.3855 | 800 | 0.0443 | - |
0.4096 | 850 | 0.0385 | - |
0.4337 | 900 | 0.0425 | - |
0.4578 | 950 | 0.0412 | - |
0.4819 | 1000 | 0.036 | - |
0.5060 | 1050 | 0.0323 | - |
0.5301 | 1100 | 0.0352 | - |
0.5542 | 1150 | 0.0347 | - |
0.5783 | 1200 | 0.0319 | - |
0.6024 | 1250 | 0.0254 | - |
0.6265 | 1300 | 0.0291 | - |
0.6506 | 1350 | 0.0253 | - |
0.6747 | 1400 | 0.0283 | - |
0.6988 | 1450 | 0.0248 | - |
0.7229 | 1500 | 0.02 | - |
0.7470 | 1550 | 0.0249 | - |
0.7711 | 1600 | 0.0208 | - |
0.7952 | 1650 | 0.021 | - |
0.8193 | 1700 | 0.0238 | - |
0.8434 | 1750 | 0.0196 | - |
0.8675 | 1800 | 0.0213 | - |
0.8916 | 1850 | 0.0222 | - |
0.9157 | 1900 | 0.019 | - |
0.9398 | 1950 | 0.0226 | - |
0.9639 | 2000 | 0.0156 | - |
0.9880 | 2050 | 0.0193 | - |
1.0120 | 2100 | 0.016 | - |
1.0361 | 2150 | 0.019 | - |
1.0602 | 2200 | 0.0154 | - |
1.0843 | 2250 | 0.0136 | - |
1.1084 | 2300 | 0.014 | - |
1.1325 | 2350 | 0.0147 | - |
1.1566 | 2400 | 0.0126 | - |
1.1807 | 2450 | 0.0161 | - |
1.2048 | 2500 | 0.0123 | - |
1.2289 | 2550 | 0.0151 | - |
1.2530 | 2600 | 0.0123 | - |
1.2771 | 2650 | 0.0122 | - |
1.3012 | 2700 | 0.0084 | - |
1.3253 | 2750 | 0.0154 | - |
1.3494 | 2800 | 0.014 | - |
1.3735 | 2850 | 0.0124 | - |
1.3976 | 2900 | 0.0146 | - |
1.4217 | 2950 | 0.0103 | - |
1.4458 | 3000 | 0.0116 | - |
1.4699 | 3050 | 0.013 | - |
1.4940 | 3100 | 0.0104 | - |
1.5181 | 3150 | 0.0124 | - |
1.5422 | 3200 | 0.0127 | - |
1.5663 | 3250 | 0.0122 | - |
1.5904 | 3300 | 0.0092 | - |
1.6145 | 3350 | 0.0108 | - |
1.6386 | 3400 | 0.0121 | - |
1.6627 | 3450 | 0.0125 | - |
1.6867 | 3500 | 0.0162 | - |
1.7108 | 3550 | 0.0105 | - |
1.7349 | 3600 | 0.0133 | - |
1.7590 | 3650 | 0.0145 | - |
1.7831 | 3700 | 0.0113 | - |
1.8072 | 3750 | 0.009 | - |
1.8313 | 3800 | 0.0105 | - |
1.8554 | 3850 | 0.011 | - |
1.8795 | 3900 | 0.0087 | - |
1.9036 | 3950 | 0.0159 | - |
1.9277 | 4000 | 0.0101 | - |
1.9518 | 4050 | 0.0112 | - |
1.9759 | 4100 | 0.0111 | - |
2.0 | 4150 | 0.0124 | - |
Framework Versions
- Python: 3.12.11
- SetFit: 1.1.3
- Sentence Transformers: 5.1.0
- Transformers: 4.56.0
- PyTorch: 2.8.0+cu126
- Datasets: 4.0.0
- Tokenizers: 0.22.0
Citation
BibTeX
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}