---
library_name: transformers
license: apache-2.0
base_model: answerdotai/ModernBERT-base
tags:
- generated_from_trainer
model-index:
- name: modernbert-base-multi-head-values-context
  results: []
---


# modernbert-base-multi-head-values-context

This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base). The training dataset was not recorded by the Trainer.
At the final logged checkpoint (step 26845, epoch 17.5), it achieves the following results on the evaluation set:
- Loss: 0.2990
- Subset Accuracy: 0.2753
- F1 Macro: 0.3032
- F1 Micro: 0.3876
- Precision Macro: 0.4109
- Recall Macro: 0.2499
- ROC AUC: 0.7910
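
The card does not say how these metrics were computed. The sketch below shows the usual multi-label definitions of subset accuracy, macro F1, and micro F1, assuming binary indicator vectors per example and a score of 0 for labels with no positives. The function names and the toy data are illustrative only and are not taken from the training code.

```python
from typing import List, Tuple

Labels = List[List[int]]  # one 0/1 indicator vector per example


def subset_accuracy(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of examples whose whole label vector is predicted exactly."""
    exact = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return exact / len(y_true)


def label_counts(y_true: Labels, y_pred: Labels) -> List[Tuple[int, int, int]]:
    """Per-label (true positives, false positives, false negatives)."""
    counts = []
    for j in range(len(y_true[0])):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        counts.append((tp, fp, fn))
    return counts


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_macro(y_true: Labels, y_pred: Labels) -> float:
    """Unweighted mean of per-label F1, so rare labels count as much as common ones."""
    scores = [f1_from_counts(*c) for c in label_counts(y_true, y_pred)]
    return sum(scores) / len(scores)


def f1_micro(y_true: Labels, y_pred: Labels) -> float:
    """F1 over counts pooled across all labels, so frequent labels dominate."""
    counts = label_counts(y_true, y_pred)
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return f1_from_counts(tp, fp, fn)


# Toy data: 4 examples, 3 labels (illustrative, not from the evaluation set).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]]

print(subset_accuracy(y_true, y_pred))
print(f1_macro(y_true, y_pred))
print(f1_micro(y_true, y_pred))
```

Under these definitions, a macro precision well above macro recall, as reported here, is what one would expect from a model that predicts few positive labels per example.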

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 2
- eval_batch_size: 2
- seed: 2025
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- optimizer: AdamW (`OptimizerNames.ADAMW_TORCH`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 33
- mixed_precision_training: Native AMP
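
The relationship between these values can be worked through in a few lines. This is a sketch under stated assumptions: a single device (consistent with 2 × 8 = 16), 1534 optimizer steps per epoch (read from the results table, where step 1534 corresponds to epoch 1.0), warmup steps rounded up from the ratio, and a linear-warmup, linear-decay schedule. The constant and function names are illustrative.

```python
import math

LEARNING_RATE = 5e-6
PER_DEVICE_BATCH = 2
GRAD_ACCUM_STEPS = 8
WARMUP_RATIO = 0.01
NUM_EPOCHS = 33
STEPS_PER_EPOCH = 1534  # assumption: taken from the results table (epoch 1.0)

# Effective batch size per optimizer step, assuming one device.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS

# Planned optimizer steps and warmup length for the full schedule.
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / max(1, total_steps - warmup_steps)


print(effective_batch, total_steps, warmup_steps)
print(linear_lr(26845))  # learning rate at the last step logged in the results table
```

Because the results table ends near epoch 17.5 of a planned 33, the learning rate at the last logged step would still be well above zero under this schedule.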

### Training results

| Training Loss | Epoch   | Step  | Validation Loss | Subset Accuracy | F1 Macro | F1 Micro | Precision Macro | Recall Macro | ROC AUC |
|:-------------:|:-------:|:-----:|:---------------:|:---------------:|:--------:|:--------:|:---------------:|:------------:|:-------:|
| 2.5451        | 0.5002  | 767   | 0.2012          | 0.0027          | 0.0023   | 0.0050   | 0.0718          | 0.0012       | 0.6531  |
| 1.5075        | 1.0     | 1534  | 0.1838          | 0.0768          | 0.0569   | 0.1319   | 0.2330          | 0.0353       | 0.7437  |
| 1.4382        | 1.5002  | 2301  | 0.1781          | 0.1437          | 0.1318   | 0.2281   | 0.3534          | 0.0891       | 0.7792  |
| 1.3858        | 2.0     | 3068  | 0.1710          | 0.1680          | 0.1582   | 0.2615   | 0.4338          | 0.1091       | 0.7962  |
| 1.3157        | 2.5002  | 3835  | 0.1681          | 0.1822          | 0.1787   | 0.2796   | 0.4967          | 0.1267       | 0.8058  |
| 1.291         | 3.0     | 4602  | 0.1622          | 0.2229          | 0.2115   | 0.3291   | 0.6302          | 0.1523       | 0.8195  |
| 1.2388        | 3.5002  | 5369  | 0.1614          | 0.2026          | 0.2201   | 0.3082   | 0.6143          | 0.1536       | 0.8222  |
| 1.1993        | 4.0     | 6136  | 0.1583          | 0.2445          | 0.2454   | 0.3554   | 0.5956          | 0.1783       | 0.8291  |
| 1.1415        | 4.5002  | 6903  | 0.1608          | 0.2793          | 0.2883   | 0.3934   | 0.5614          | 0.2220       | 0.8288  |
| 1.1221        | 5.0     | 7670  | 0.1595          | 0.2384          | 0.2523   | 0.3533   | 0.5982          | 0.1761       | 0.8342  |
| 1.0726        | 5.5002  | 8437  | 0.1604          | 0.2727          | 0.2930   | 0.3906   | 0.5584          | 0.2178       | 0.8318  |
| 1.0381        | 6.0     | 9204  | 0.1629          | 0.2599          | 0.2693   | 0.3759   | 0.5421          | 0.2099       | 0.8315  |
| 0.9957        | 6.5002  | 9971  | 0.1662          | 0.2814          | 0.2856   | 0.4001   | 0.5380          | 0.2223       | 0.8300  |
| 0.9319        | 7.0     | 10738 | 0.1640          | 0.2604          | 0.2960   | 0.3820   | 0.5431          | 0.2201       | 0.8288  |
| 0.8279        | 7.5002  | 11505 | 0.1733          | 0.2788          | 0.2953   | 0.3939   | 0.5275          | 0.2307       | 0.8245  |
| 0.8365        | 8.0     | 12272 | 0.1742          | 0.2757          | 0.3004   | 0.3910   | 0.5030          | 0.2339       | 0.8218  |
| 0.7168        | 8.5002  | 13039 | 0.1810          | 0.2863          | 0.3063   | 0.4020   | 0.4589          | 0.2499       | 0.8202  |
| 0.7158        | 9.0     | 13806 | 0.1804          | 0.2758          | 0.3052   | 0.3910   | 0.4622          | 0.2392       | 0.8212  |
| 0.5827        | 9.5002  | 14573 | 0.1880          | 0.2878          | 0.3166   | 0.4034   | 0.4568          | 0.2584       | 0.8159  |
| 0.5958        | 10.0    | 15340 | 0.1906          | 0.2788          | 0.3114   | 0.3940   | 0.4912          | 0.2522       | 0.8134  |
| 0.4641        | 10.5002 | 16107 | 0.1978          | 0.2750          | 0.3104   | 0.3896   | 0.4505          | 0.2501       | 0.8106  |
| 0.4608        | 11.0    | 16874 | 0.2022          | 0.2724          | 0.3026   | 0.3880   | 0.4840          | 0.2470       | 0.8082  |
| 0.3546        | 11.5002 | 17641 | 0.2113          | 0.2773          | 0.3120   | 0.3922   | 0.4598          | 0.2556       | 0.8038  |
| 0.3575        | 12.0    | 18408 | 0.2133          | 0.2834          | 0.3092   | 0.3980   | 0.4361          | 0.2535       | 0.8045  |
| 0.2601        | 12.5002 | 19175 | 0.2226          | 0.2778          | 0.3104   | 0.3897   | 0.4274          | 0.2559       | 0.8003  |
| 0.258         | 13.0    | 19942 | 0.2275          | 0.2824          | 0.3176   | 0.3956   | 0.4188          | 0.2643       | 0.8003  |
| 0.1778        | 13.5002 | 20709 | 0.2375          | 0.2686          | 0.3035   | 0.3815   | 0.4103          | 0.2496       | 0.7994  |
| 0.1803        | 14.0    | 21476 | 0.2426          | 0.2713          | 0.3083   | 0.3865   | 0.4305          | 0.2522       | 0.7968  |
| 0.1233        | 14.5002 | 22243 | 0.2501          | 0.2781          | 0.3139   | 0.3906   | 0.4473          | 0.2592       | 0.7970  |
| 0.1197        | 15.0    | 23010 | 0.2566          | 0.2735          | 0.3081   | 0.3864   | 0.4231          | 0.2519       | 0.7950  |
| 0.0804        | 15.5002 | 23777 | 0.2653          | 0.2746          | 0.3065   | 0.3839   | 0.4267          | 0.2512       | 0.7941  |
| 0.0813        | 16.0    | 24544 | 0.2723          | 0.2740          | 0.3078   | 0.3861   | 0.4372          | 0.2505       | 0.7931  |
| 0.0548        | 16.5002 | 25311 | 0.2813          | 0.2776          | 0.3077   | 0.3922   | 0.4544          | 0.2500       | 0.7927  |
| 0.0535        | 17.0    | 26078 | 0.2882          | 0.2804          | 0.3093   | 0.3912   | 0.4497          | 0.2528       | 0.7914  |
| 0.0387        | 17.5002 | 26845 | 0.2990          | 0.2753          | 0.3032   | 0.3876   | 0.4109          | 0.2499       | 0.7910  |
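
The table shows validation loss reaching its minimum early and rising afterwards while training loss keeps falling, so the final checkpoint is not the best one by every metric. A minimal sketch of choosing a checkpoint per metric follows, using five rows copied from the table above. The `Row` type and `best_by` helper are illustrative names, not part of the training code.

```python
from typing import Callable, List, NamedTuple


class Row(NamedTuple):
    step: int
    epoch: float
    val_loss: float
    f1_macro: float
    roc_auc: float


# A subset of rows copied from the training results table.
rows: List[Row] = [
    Row(6136, 4.0, 0.1583, 0.2454, 0.8291),
    Row(7670, 5.0, 0.1595, 0.2523, 0.8342),
    Row(14573, 9.5002, 0.1880, 0.3166, 0.8159),
    Row(19942, 13.0, 0.2275, 0.3176, 0.8003),
    Row(26845, 17.5002, 0.2990, 0.3032, 0.7910),
]


def best_by(key: Callable[[Row], float], higher_is_better: bool = True) -> Row:
    """Return the row that optimizes the given metric."""
    return max(rows, key=key) if higher_is_better else min(rows, key=key)


lowest_loss = best_by(lambda r: r.val_loss, higher_is_better=False)
highest_f1 = best_by(lambda r: r.f1_macro)
highest_auc = best_by(lambda r: r.roc_auc)

print("lowest validation loss:", lowest_loss.step)
print("highest F1 macro:", highest_f1.step)
print("highest ROC AUC:", highest_auc.step)
```

The three selected rows are also the best across the full table for their respective metrics: validation loss is lowest at step 6136, ROC AUC peaks at step 7670, and macro F1 peaks at step 19942.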


### Framework versions

- Transformers 4.53.2
- Pytorch 2.6.0+cu124
- Datasets 2.14.4
- Tokenizers 0.21.2