Vinsuka committed on
Commit d52ad1b · verified · 1 Parent(s): 974f545

Upload folder using huggingface_hub

Files changed (46)
  1. 1_Pooling/config.json +10 -0
  2. README.md +772 -1
  3. checkpoint-144/1_Pooling/config.json +10 -0
  4. checkpoint-144/README.md +773 -0
  5. checkpoint-144/config.json +47 -0
  6. checkpoint-144/config_sentence_transformers.json +10 -0
  7. checkpoint-144/model.safetensors +3 -0
  8. checkpoint-144/modules.json +20 -0
  9. checkpoint-144/optimizer.pt +3 -0
  10. checkpoint-144/rng_state.pth +3 -0
  11. checkpoint-144/scheduler.pt +3 -0
  12. checkpoint-144/sentence_bert_config.json +4 -0
  13. checkpoint-144/special_tokens_map.json +37 -0
  14. checkpoint-144/tokenizer.json +0 -0
  15. checkpoint-144/tokenizer_config.json +945 -0
  16. checkpoint-144/trainer_state.json +478 -0
  17. checkpoint-144/training_args.bin +3 -0
  18. checkpoint-98/1_Pooling/config.json +10 -0
  19. checkpoint-98/README.md +763 -0
  20. checkpoint-98/config.json +47 -0
  21. checkpoint-98/config_sentence_transformers.json +10 -0
  22. checkpoint-98/modules.json +20 -0
  23. checkpoint-98/optimizer.pt +3 -0
  24. checkpoint-98/rng_state.pth +3 -0
  25. checkpoint-98/scheduler.pt +3 -0
  26. checkpoint-98/sentence_bert_config.json +4 -0
  27. checkpoint-98/special_tokens_map.json +37 -0
  28. checkpoint-98/tokenizer.json +0 -0
  29. checkpoint-98/tokenizer_config.json +945 -0
  30. checkpoint-98/trainer_state.json +332 -0
  31. checkpoint-98/training_args.bin +3 -0
  32. config.json +47 -0
  33. config_sentence_transformers.json +10 -0
  34. eval/Information-Retrieval_evaluation_dim_128_results.csv +4 -0
  35. eval/Information-Retrieval_evaluation_dim_256_results.csv +4 -0
  36. eval/Information-Retrieval_evaluation_dim_512_results.csv +4 -0
  37. eval/Information-Retrieval_evaluation_dim_64_results.csv +4 -0
  38. eval/Information-Retrieval_evaluation_dim_768_results.csv +4 -0
  39. metrics_comparison.txt +82 -0
  40. model.safetensors +3 -0
  41. modules.json +20 -0
  42. sentence_bert_config.json +4 -0
  43. special_tokens_map.json +37 -0
  44. tokenizer.json +0 -0
  45. tokenizer_config.json +945 -0
  46. training_args.bin +3 -0
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
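The pooling configuration above enables only `pooling_mode_mean_tokens`, so a sentence embedding is the average of its token embeddings. As a rough illustration, here is a minimal pure-Python sketch of masked mean pooling. The function name `mean_pool` and the toy inputs are assumptions made for this example; they are not part of the Sentence Transformers API, which operates on batched tensors.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [value / count for value in summed]


# Toy example: three 2-dimensional token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

In the real model the same averaging is applied to 768-dimensional token vectors, and the result is then passed through the `Normalize()` module listed in the architecture.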
README.md CHANGED
@@ -1,3 +1,774 @@
1
  ---
2
- license: mit
3
  ---
1
  ---
2
+ language:
3
+ - en
4
+ license: apache-2.0
5
+ tags:
6
+ - sentence-transformers
7
+ - sentence-similarity
8
+ - feature-extraction
9
+ - generated_from_trainer
10
+ - dataset_size:6190
11
+ - loss:MatryoshkaLoss
12
+ - loss:MultipleNegativesRankingLoss
13
+ base_model: nomic-ai/modernbert-embed-base
14
+ widget:
15
+ - source_sentence: What is the duration of the period mentioned in the text?
16
+ sentences:
17
+ - . The only excep Ɵon to the requirement that the plainƟff must be a lending i nsƟtuƟon
18
+ in order to invoke the provisions of the Act is contained in SecƟon 25, in terms
19
+ of which a person who inter alia knowingly draws a cheque which is subsequently
20
+ dishonoured by the bank for want of funds is guilty of an offence under the Act,
21
+ and proceedings can be insƟtuted against such person in the Magistrate’s
22
+ - '? The 1st question of law is formulated on the basis that , the 1st Defendant
23
+ is the licensee of the 2nd Defendant and therefore, the 1st Defendant cannot claim
24
+ prescriptive title to the subject matter'
25
+ - .50,000/ - (that is , a period of 36 months) but such “Facility” is subject to
26
+ review on 30 /09/2000”, (that is, a period of about only 5 months from the date
27
+ of P4)
28
+ - source_sentence: What is the purpose of the disposition of the property by Lanka
29
+ Tractors Limited as mentioned in the text?
30
+ sentences:
31
+ - . (3) is whether the said disposition of the property by Lanka Tractors Limited
32
+ was done with the sole object of defrauding its creditors. Section 348 of the
33
+ Companies Act which describes about Fraudulent reference would be relevant in
34
+ this regard
35
+ - . In the arbitration process, the Government is not involved; the court system
36
+ is not involved (except as provided for in the Act); the parties do not have to
37
+ rely on any Government institution for resolution of their dispute. Process of
38
+ conducting the arbitration, venue, time, mode of adducing evidence are all decided
39
+ by agreement of parties
40
+ - . This is broadly similar to the provision in the summary procedure on liquid
41
+ claims. The amendment in clause 8 of the Bill, repeals the defini Ɵon of the term
42
+ ‘debt’ in sec Ɵon 30. The subs Ɵtuted defini Ɵon excludes the words referred to
43
+ above which limit its applicability to money owed under a promise or agreement
44
+ which is in wri Ɵng
45
+ - source_sentence: What is one of the topics covered in the training program?
46
+ sentences:
47
+ - . The resul Ɵng posiƟon is that the court would not have any wri Ʃen evidence
48
+ of the commitment on the part of the debtor when it issues decree nisi in the
49
+ first instance
50
+ - '? Before this C ourt, there is no dispute on the manner in which the appellant
51
+ obtained the title of the land in question'
52
+ - . Detail reporting procedures to government of Sri Lanka’s contact points. - 4
53
+ Weeks Phase 3 Training of Port Facility Security Officers SATHSINDU/BAGNOLD undertakes
54
+ to design a training program and conducted aid program for up to ten persons.
55
+ • Understanding the reasons for the ISPS code • ISPS Code content and requirements.
56
+ • Understanding the ISPS Code
57
+ - source_sentence: What type of action was taken by the Divisional Secretary?
58
+ sentences:
59
+ - .2020 was also sent by the Divisional Secretary of Th amankaduwa imposing similar
60
+ restrictions as by the Polonnaruwa Pradeshiya Sabha
61
+ - . When Seylan Bank published the resolution of its board of directors which exercised
62
+ its powers of Parate Execution in the newspaper on 10th March 2006-, HNB had made
63
+ the application dated 21st March [SC Appeal No. 85A /2009 ] Page 6 of 25 2006
64
+ to the District Court of Colombo in terms of Sections 260, 261, 348, 359 and 352
65
+ of the Companies Act No
66
+ - . Having regard to the above -mentioned stipulated circumstances , I consider
67
+ the facts put forward for the appellant , seeking a reduction of sentence. The
68
+ offence was committed in 2004. The appellant had been in remand custody for more
69
+ than three years and the appell ant did not have any previous convictions
70
+ - source_sentence: What is described in Section 25 of the Arbitration Act?
71
+ sentences:
72
+ - . But where a matter is within the plenary jurisdiction of the Court if no objection
73
+ is taken, the Court will then have jurisdiction to proceed on with the matter
74
+ and make a valid order.” 14 31. Further , in the case of Don Tilakaratne v
75
+ - '. (3) The provision of subsections (1) and (2) shall apply only to the extent
76
+ agreed to by the parties. (4) The arbitral tribunal shall decide according to
77
+ considerations of general justice and fairness or trade usages only if the parties
78
+ have expressly authorised it to do so. Section 25 of the Arbitration Act describes
79
+ the form and content of the arbitral award as follows: 25'
80
+ - '. 9 and 10 based on the objection taken to them by the Counsel for HNB, despite
81
+ the fact that they did not arise from the pleadings, and were altogether inconsistent
82
+ with them, answered the afore-stated question of law (in respect of which this
83
+ Court had granted Leave to Appeal in that case) in the affirmative and in favour
84
+ of HNB, and stated as follows: “In conclusion, it needs to be emphasised'
85
+ pipeline_tag: sentence-similarity
86
+ library_name: sentence-transformers
87
+ metrics:
88
+ - cosine_accuracy@1
89
+ - cosine_accuracy@3
90
+ - cosine_accuracy@5
91
+ - cosine_accuracy@10
92
+ - cosine_precision@1
93
+ - cosine_precision@3
94
+ - cosine_precision@5
95
+ - cosine_precision@10
96
+ - cosine_recall@1
97
+ - cosine_recall@3
98
+ - cosine_recall@5
99
+ - cosine_recall@10
100
+ - cosine_ndcg@10
101
+ - cosine_mrr@10
102
+ - cosine_map@100
103
+ model-index:
104
+ - name: Fine-tuned with [QuicKB](https://github.com/ALucek/QuicKB)
105
+ results:
106
+ - task:
107
+ type: information-retrieval
108
+ name: Information Retrieval
109
+ dataset:
110
+ name: dim 768
111
+ type: dim_768
112
+ metrics:
113
+ - type: cosine_accuracy@1
114
+ value: 0.5741279069767442
115
+ name: Cosine Accuracy@1
116
+ - type: cosine_accuracy@3
117
+ value: 0.7616279069767442
118
+ name: Cosine Accuracy@3
119
+ - type: cosine_accuracy@5
120
+ value: 0.8197674418604651
121
+ name: Cosine Accuracy@5
122
+ - type: cosine_accuracy@10
123
+ value: 0.8851744186046512
124
+ name: Cosine Accuracy@10
125
+ - type: cosine_precision@1
126
+ value: 0.5741279069767442
127
+ name: Cosine Precision@1
128
+ - type: cosine_precision@3
129
+ value: 0.25387596899224807
130
+ name: Cosine Precision@3
131
+ - type: cosine_precision@5
132
+ value: 0.163953488372093
133
+ name: Cosine Precision@5
134
+ - type: cosine_precision@10
135
+ value: 0.0885174418604651
136
+ name: Cosine Precision@10
137
+ - type: cosine_recall@1
138
+ value: 0.5741279069767442
139
+ name: Cosine Recall@1
140
+ - type: cosine_recall@3
141
+ value: 0.7616279069767442
142
+ name: Cosine Recall@3
143
+ - type: cosine_recall@5
144
+ value: 0.8197674418604651
145
+ name: Cosine Recall@5
146
+ - type: cosine_recall@10
147
+ value: 0.8851744186046512
148
+ name: Cosine Recall@10
149
+ - type: cosine_ndcg@10
150
+ value: 0.7308126785084815
151
+ name: Cosine Ndcg@10
152
+ - type: cosine_mrr@10
153
+ value: 0.6812459625322997
154
+ name: Cosine Mrr@10
155
+ - type: cosine_map@100
156
+ value: 0.6852483059452662
157
+ name: Cosine Map@100
158
+ - task:
159
+ type: information-retrieval
160
+ name: Information Retrieval
161
+ dataset:
162
+ name: dim 512
163
+ type: dim_512
164
+ metrics:
165
+ - type: cosine_accuracy@1
166
+ value: 0.5741279069767442
167
+ name: Cosine Accuracy@1
168
+ - type: cosine_accuracy@3
169
+ value: 0.7630813953488372
170
+ name: Cosine Accuracy@3
171
+ - type: cosine_accuracy@5
172
+ value: 0.8212209302325582
173
+ name: Cosine Accuracy@5
174
+ - type: cosine_accuracy@10
175
+ value: 0.875
176
+ name: Cosine Accuracy@10
177
+ - type: cosine_precision@1
178
+ value: 0.5741279069767442
179
+ name: Cosine Precision@1
180
+ - type: cosine_precision@3
181
+ value: 0.2543604651162791
182
+ name: Cosine Precision@3
183
+ - type: cosine_precision@5
184
+ value: 0.16424418604651161
185
+ name: Cosine Precision@5
186
+ - type: cosine_precision@10
187
+ value: 0.0875
188
+ name: Cosine Precision@10
189
+ - type: cosine_recall@1
190
+ value: 0.5741279069767442
191
+ name: Cosine Recall@1
192
+ - type: cosine_recall@3
193
+ value: 0.7630813953488372
194
+ name: Cosine Recall@3
195
+ - type: cosine_recall@5
196
+ value: 0.8212209302325582
197
+ name: Cosine Recall@5
198
+ - type: cosine_recall@10
199
+ value: 0.875
200
+ name: Cosine Recall@10
201
+ - type: cosine_ndcg@10
202
+ value: 0.726227401269234
203
+ name: Cosine Ndcg@10
204
+ - type: cosine_mrr@10
205
+ value: 0.6782132475083055
206
+ name: Cosine Mrr@10
207
+ - type: cosine_map@100
208
+ value: 0.6827936993080407
209
+ name: Cosine Map@100
210
+ - task:
211
+ type: information-retrieval
212
+ name: Information Retrieval
213
+ dataset:
214
+ name: dim 256
215
+ type: dim_256
216
+ metrics:
217
+ - type: cosine_accuracy@1
218
+ value: 0.5552325581395349
219
+ name: Cosine Accuracy@1
220
+ - type: cosine_accuracy@3
221
+ value: 0.7281976744186046
222
+ name: Cosine Accuracy@3
223
+ - type: cosine_accuracy@5
224
+ value: 0.7921511627906976
225
+ name: Cosine Accuracy@5
226
+ - type: cosine_accuracy@10
227
+ value: 0.8619186046511628
228
+ name: Cosine Accuracy@10
229
+ - type: cosine_precision@1
230
+ value: 0.5552325581395349
231
+ name: Cosine Precision@1
232
+ - type: cosine_precision@3
233
+ value: 0.24273255813953487
234
+ name: Cosine Precision@3
235
+ - type: cosine_precision@5
236
+ value: 0.15843023255813954
237
+ name: Cosine Precision@5
238
+ - type: cosine_precision@10
239
+ value: 0.08619186046511627
240
+ name: Cosine Precision@10
241
+ - type: cosine_recall@1
242
+ value: 0.5552325581395349
243
+ name: Cosine Recall@1
244
+ - type: cosine_recall@3
245
+ value: 0.7281976744186046
246
+ name: Cosine Recall@3
247
+ - type: cosine_recall@5
248
+ value: 0.7921511627906976
249
+ name: Cosine Recall@5
250
+ - type: cosine_recall@10
251
+ value: 0.8619186046511628
252
+ name: Cosine Recall@10
253
+ - type: cosine_ndcg@10
254
+ value: 0.7077790398550751
255
+ name: Cosine Ndcg@10
256
+ - type: cosine_mrr@10
257
+ value: 0.6585646225544481
258
+ name: Cosine Mrr@10
259
+ - type: cosine_map@100
260
+ value: 0.6630890497309057
261
+ name: Cosine Map@100
262
+ - task:
263
+ type: information-retrieval
264
+ name: Information Retrieval
265
+ dataset:
266
+ name: dim 128
267
+ type: dim_128
268
+ metrics:
269
+ - type: cosine_accuracy@1
270
+ value: 0.49709302325581395
271
+ name: Cosine Accuracy@1
272
+ - type: cosine_accuracy@3
273
+ value: 0.6758720930232558
274
+ name: Cosine Accuracy@3
275
+ - type: cosine_accuracy@5
276
+ value: 0.7354651162790697
277
+ name: Cosine Accuracy@5
278
+ - type: cosine_accuracy@10
279
+ value: 0.8241279069767442
280
+ name: Cosine Accuracy@10
281
+ - type: cosine_precision@1
282
+ value: 0.49709302325581395
283
+ name: Cosine Precision@1
284
+ - type: cosine_precision@3
285
+ value: 0.22529069767441862
286
+ name: Cosine Precision@3
287
+ - type: cosine_precision@5
288
+ value: 0.14709302325581394
289
+ name: Cosine Precision@5
290
+ - type: cosine_precision@10
291
+ value: 0.08241279069767442
292
+ name: Cosine Precision@10
293
+ - type: cosine_recall@1
294
+ value: 0.49709302325581395
295
+ name: Cosine Recall@1
296
+ - type: cosine_recall@3
297
+ value: 0.6758720930232558
298
+ name: Cosine Recall@3
299
+ - type: cosine_recall@5
300
+ value: 0.7354651162790697
301
+ name: Cosine Recall@5
302
+ - type: cosine_recall@10
303
+ value: 0.8241279069767442
304
+ name: Cosine Recall@10
305
+ - type: cosine_ndcg@10
306
+ value: 0.6567813216281579
307
+ name: Cosine Ndcg@10
308
+ - type: cosine_mrr@10
309
+ value: 0.6037779162052417
310
+ name: Cosine Mrr@10
311
+ - type: cosine_map@100
312
+ value: 0.6090388181529673
313
+ name: Cosine Map@100
314
+ - task:
315
+ type: information-retrieval
316
+ name: Information Retrieval
317
+ dataset:
318
+ name: dim 64
319
+ type: dim_64
320
+ metrics:
321
+ - type: cosine_accuracy@1
322
+ value: 0.39680232558139533
323
+ name: Cosine Accuracy@1
324
+ - type: cosine_accuracy@3
325
+ value: 0.5581395348837209
326
+ name: Cosine Accuracy@3
327
+ - type: cosine_accuracy@5
328
+ value: 0.622093023255814
329
+ name: Cosine Accuracy@5
330
+ - type: cosine_accuracy@10
331
+ value: 0.7252906976744186
332
+ name: Cosine Accuracy@10
333
+ - type: cosine_precision@1
334
+ value: 0.39680232558139533
335
+ name: Cosine Precision@1
336
+ - type: cosine_precision@3
337
+ value: 0.18604651162790695
338
+ name: Cosine Precision@3
339
+ - type: cosine_precision@5
340
+ value: 0.12441860465116278
341
+ name: Cosine Precision@5
342
+ - type: cosine_precision@10
343
+ value: 0.07252906976744186
344
+ name: Cosine Precision@10
345
+ - type: cosine_recall@1
346
+ value: 0.39680232558139533
347
+ name: Cosine Recall@1
348
+ - type: cosine_recall@3
349
+ value: 0.5581395348837209
350
+ name: Cosine Recall@3
351
+ - type: cosine_recall@5
352
+ value: 0.622093023255814
353
+ name: Cosine Recall@5
354
+ - type: cosine_recall@10
355
+ value: 0.7252906976744186
356
+ name: Cosine Recall@10
357
+ - type: cosine_ndcg@10
358
+ value: 0.5513541983050395
359
+ name: Cosine Ndcg@10
360
+ - type: cosine_mrr@10
361
+ value: 0.497020348837209
362
+ name: Cosine Mrr@10
363
+ - type: cosine_map@100
364
+ value: 0.5050183064129367
365
+ name: Cosine Map@100
366
  ---
367
+
368
+ # Fine-tuned with [QuicKB](https://github.com/ALucek/QuicKB)
369
+
370
+ This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. Because it was trained with MatryoshkaLoss, its embeddings were also evaluated at 512, 256, 128, and 64 dimensions (see Evaluation).
371
+
372
+ ## Model Details
373
+
374
+ ### Model Description
375
+ - **Model Type:** Sentence Transformer
376
+ - **Base model:** [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) <!-- at revision d556a88e332558790b210f7bdbe87da2fa94a8d8 -->
377
+ - **Maximum Sequence Length:** 512 tokens
378
+ - **Output Dimensionality:** 768 dimensions
379
+ - **Similarity Function:** Cosine Similarity
380
+ <!-- - **Training Dataset:** Unknown -->
381
+ - **Language:** en
382
+ - **License:** apache-2.0
383
+
384
+ ### Model Sources
385
+
386
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
387
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
388
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
389
+
390
+ ### Full Model Architecture
391
+
392
+ ```
393
+ SentenceTransformer(
394
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: ModernBertModel
395
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
396
+ (2): Normalize()
397
+ )
398
+ ```
399
+
400
+ ## Usage
401
+
402
+ ### Direct Usage (Sentence Transformers)
403
+
404
+ First install the Sentence Transformers library:
405
+
406
+ ```bash
407
+ pip install -U sentence-transformers
408
+ ```
409
+
410
+ Then you can load this model and run inference.
411
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub ("sentence_transformers_model_id" is a placeholder for this repository's ID)
+ model = SentenceTransformer("sentence_transformers_model_id")
+ # Run inference on one query and two candidate passages
+ sentences = [
+     'What is described in Section 25 of the Arbitration Act?',
+     '. (3) The provision of subsections (1) and (2) shall apply only to the extent agreed to by the parties. (4) The arbitral tribunal shall decide according to considerations of general justice and fairness or trade usages only if the parties have expressly authorised it to do so. Section 25 of the Arbitration Act describes the form and content of the arbitral award as follows: 25',
+     '. 9 and 10 based on the objection taken to them by the Counsel for HNB, despite the fact that they did not arise from the pleadings, and were altogether inconsistent with them, answered the afore-stated question of law (in respect of which this Court had granted Leave to Appeal in that case) in the affirmative and in favour of HNB, and stated as follows: “In conclusion, it needs to be emphasised',
+ ]
+ embeddings = model.encode(sentences)  # NumPy array by default
+ print(embeddings.shape)
+ # (3, 768)
+
+ # Get the pairwise similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # torch.Size([3, 3])
430
+ ```
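Since the model card reports results for truncated embeddings (512, 256, 128 and 64 dimensions), it may help to see what truncation means in practice: keep the first `dim` components and rescale to unit length before comparing. The sketch below is pure Python with toy 4-dimensional vectors; `truncate_and_normalize` and `cosine` are hypothetical helper names introduced here, not library functions. To my knowledge, recent Sentence Transformers releases also accept a `truncate_dim` argument when constructing the model, which would do the truncation for you.

```python
import math
from typing import List


def truncate_and_normalize(embedding: List[float], dim: int) -> List[float]:
    """Keep the first `dim` components and rescale the result to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale
    return [x / norm for x in head]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 4-dimensional "embeddings" standing in for the 768-dimensional ones.
query = [0.6, 0.8, 0.0, 0.0]
passage = [0.8, 0.6, 0.0, 0.0]
q2 = truncate_and_normalize(query, 2)
p2 = truncate_and_normalize(passage, 2)
print(round(cosine(q2, p2), 4))  # 0.96
```

Smaller dimensions reduce storage and search cost; the evaluation table in this README indicates how much retrieval quality is given up at each size.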
431
+
432
+ <!--
433
+ ### Direct Usage (Transformers)
434
+
435
+ <details><summary>Click to see the direct usage in Transformers</summary>
436
+
437
+ </details>
438
+ -->
439
+
440
+ <!--
441
+ ### Downstream Usage (Sentence Transformers)
442
+
443
+ You can finetune this model on your own dataset.
444
+
445
+ <details><summary>Click to expand</summary>
446
+
447
+ </details>
448
+ -->
449
+
450
+ <!--
451
+ ### Out-of-Scope Use
452
+
453
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
454
+ -->
455
+
456
+ ## Evaluation
457
+
458
+ ### Metrics
459
+
460
+ #### Information Retrieval
461
+
462
+ * Datasets: `dim_768`, `dim_512`, `dim_256`, `dim_128` and `dim_64`
463
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
464
+
+ | Metric              | dim_768    | dim_512    | dim_256    | dim_128    | dim_64     |
+ |:--------------------|:-----------|:-----------|:-----------|:-----------|:-----------|
+ | cosine_accuracy@1   | 0.5741     | 0.5741     | 0.5552     | 0.4971     | 0.3968     |
+ | cosine_accuracy@3   | 0.7616     | 0.7631     | 0.7282     | 0.6759     | 0.5581     |
+ | cosine_accuracy@5   | 0.8198     | 0.8212     | 0.7922     | 0.7355     | 0.6221     |
+ | cosine_accuracy@10  | 0.8852     | 0.8750     | 0.8619     | 0.8241     | 0.7253     |
+ | cosine_precision@1  | 0.5741     | 0.5741     | 0.5552     | 0.4971     | 0.3968     |
+ | cosine_precision@3  | 0.2539     | 0.2544     | 0.2427     | 0.2253     | 0.1860     |
+ | cosine_precision@5  | 0.1640     | 0.1642     | 0.1584     | 0.1471     | 0.1244     |
+ | cosine_precision@10 | 0.0885     | 0.0875     | 0.0862     | 0.0824     | 0.0725     |
+ | cosine_recall@1     | 0.5741     | 0.5741     | 0.5552     | 0.4971     | 0.3968     |
+ | cosine_recall@3     | 0.7616     | 0.7631     | 0.7282     | 0.6759     | 0.5581     |
+ | cosine_recall@5     | 0.8198     | 0.8212     | 0.7922     | 0.7355     | 0.6221     |
+ | cosine_recall@10    | 0.8852     | 0.8750     | 0.8619     | 0.8241     | 0.7253     |
+ | **cosine_ndcg@10**  | **0.7308** | **0.7262** | **0.7078** | **0.6568** | **0.5514** |
+ | cosine_mrr@10       | 0.6812     | 0.6782     | 0.6586     | 0.6038     | 0.4970     |
+ | cosine_map@100      | 0.6852     | 0.6828     | 0.6631     | 0.6090     | 0.5050     |
482
+
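In the table above, accuracy@k and recall@k coincide and precision@k equals accuracy@k divided by k (for example, 0.7616 / 3 ≈ 0.2539). That pattern is what one would expect if every query has exactly one relevant passage. The sketch below computes the metrics under that assumption on made-up rankings; `retrieval_metrics` is a hypothetical helper written for this illustration and is not the `InformationRetrievalEvaluator` implementation.

```python
from typing import Dict, List


def retrieval_metrics(rankings: List[List[str]], relevant: List[str], k: int) -> Dict[str, float]:
    """Accuracy@k, precision@k, recall@k and MRR@k when each query has one relevant id."""
    hits = 0
    reciprocal_ranks = 0.0
    for ranked_ids, gold in zip(rankings, relevant):
        top_k = ranked_ids[:k]
        if gold in top_k:
            hits += 1
            reciprocal_ranks += 1.0 / (top_k.index(gold) + 1)
    n = len(relevant)
    accuracy = hits / n
    return {
        "accuracy": accuracy,        # share of queries with the gold passage in the top k
        "precision": accuracy / k,   # at most one of the k results can be relevant
        "recall": accuracy,          # the single relevant passage is either found or not
        "mrr": reciprocal_ranks / n,
    }


# Made-up rankings for three queries; the third query misses its gold passage.
rankings = [["d1", "d2", "d3"], ["d5", "d4", "d6"], ["d9", "d8", "d7"]]
relevant = ["d1", "d4", "d0"]
metrics_at_3 = retrieval_metrics(rankings, relevant, k=3)
print(metrics_at_3)
```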
483
+ <!--
484
+ ## Bias, Risks and Limitations
485
+
486
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
487
+ -->
488
+
489
+ <!--
490
+ ### Recommendations
491
+
492
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
493
+ -->
494
+
495
+ ## Training Details
496
+
497
+ ### Training Dataset
498
+
499
+ #### Unnamed Dataset
500
+
501
+ * Size: 6,190 training samples
502
+ * Columns: <code>anchor</code> and <code>positive</code>
503
+ * Approximate statistics based on the first 1000 samples:
504
+ | | anchor | positive |
505
+ |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
506
+ | type | string | string |
507
+ | details | <ul><li>min: 7 tokens</li><li>mean: 15.11 tokens</li><li>max: 32 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 69.53 tokens</li><li>max: 214 tokens</li></ul> |
508
+ * Samples:
509
+ | anchor | positive |
510
+ |:---------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
511
+ | <code>How must the District Court exercise its discretion?</code> | <code>imposition of ‘ a’ term; (5) It is not mandatory to impose security, as evinced by the use of the conjunction “or”; (6) In imposing terms, the District Court must be mindful of the objectives of the Act, and its discretion must be exercised judicially</code> |
512
+ | <code>What is the source of the observation made by Christian Appu?</code> | <code>. Christian Appu , (1895) 1 NLR 288 observed that , “possession is "disturbed" either by an action intended to remove the possessor from the land, or by acts which prevent the possessor from enjoying the free and full use of 12 the land of which he is in the course of acquiring the dominion, and which convert his continuous user into a disconnected and divided user ”</code> |
513
+ | <code>What must the defendant do regarding the plaintiff's claim?</code> | <code>. The Court of Appeal in Ramanayake v Sampath Bank Ltd and Others [(1993) 1 Sri LR 145 at page 153] has held that, “The defendant has to deal with the plaintiff’s claim on its merits; it is not competent for the defendant to merely set out technical objections. It is also incumbent on the defendant to reveal his defence, if he has any</code> |
514
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
515
+ ```json
+ {
+     "loss": "MultipleNegativesRankingLoss",
+     "matryoshka_dims": [
+         768,
+         512,
+         256,
+         128,
+         64
+     ],
+     "matryoshka_weights": [
+         1,
+         1,
+         1,
+         1,
+         1
+     ],
+     "n_dims_per_step": -1
+ }
534
+ ```
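The parameters above describe a MatryoshkaLoss that wraps MultipleNegativesRankingLoss and applies it at five embedding sizes with equal weights. A minimal pure-Python sketch of that idea follows: for each size, truncate the anchor and positive embeddings, compute an in-batch ranking loss where the other positives act as negatives, and add up the weighted results. The helper names, the toy vectors and the similarity scale of 20 are assumptions made for this example; the library's actual implementation works on tensors and may differ in detail.

```python
import math
from typing import List, Sequence

Vector = List[float]


def _cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def ranking_loss(anchors: List[Vector], positives: List[Vector], scale: float = 20.0) -> float:
    """In-batch cross-entropy: anchor i should score highest against positive i."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * _cosine(anchor, positive) for positive in positives]
        max_logit = max(logits)
        # Numerically stable log-sum-exp over all candidates in the batch.
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


def matryoshka_loss(anchors: List[Vector], positives: List[Vector],
                    dims: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted sum of the ranking loss on the first `dim` components, for each dim."""
    loss = 0.0
    for dim, weight in zip(dims, weights):
        truncated_anchors = [a[:dim] for a in anchors]
        truncated_positives = [p[:dim] for p in positives]
        loss += weight * ranking_loss(truncated_anchors, truncated_positives)
    return loss


# Toy batch of two (anchor, positive) pairs with 4-dimensional embeddings.
anchors = [[1.0, 0.0, 0.2, 0.0], [0.0, 1.0, 0.0, 0.2]]
positives = [[0.9, 0.1, 0.1, 0.0], [0.1, 0.9, 0.0, 0.1]]
total = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
print(total)
```

Training every prefix of the embedding with the same objective is what allows the shorter vectors to be used on their own at inference time.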
535
+
536
+ ### Training Hyperparameters
537
+ #### Non-Default Hyperparameters
538
+
+ - `eval_strategy`: epoch
+ - `per_device_train_batch_size`: 16
+ - `gradient_accumulation_steps`: 8
+ - `learning_rate`: 2e-05
+ - `lr_scheduler_type`: cosine
+ - `warmup_ratio`: 0.1
+ - `tf32`: True
+ - `load_best_model_at_end`: True
+ - `optim`: adamw_torch_fused
+ - `batch_sampler`: no_duplicates
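To make the learning-rate settings above concrete, the following sketch evaluates a schedule with linear warmup followed by half-cosine decay, using `learning_rate` 2e-05 and `warmup_ratio` 0.1. The total of 144 steps is taken from the final checkpoint name in this commit; the function name `lr_at_step` and the rounding of the warmup length (ceiling) are assumptions for illustration rather than a statement about the trainer's internals.

```python
import math


def lr_at_step(step: int, total_steps: int = 144, base_lr: float = 2e-5,
               warmup_ratio: float = 0.1) -> float:
    """Linear warmup to `base_lr`, then half-cosine decay towards zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # assumed rounding
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Learning rate at the start, at the end of warmup, mid-run and at the last step.
for step in (0, 15, 80, 144):
    print(step, lr_at_step(step))
```

Under these assumptions the rate peaks at 2e-05 after 15 steps and decays smoothly to zero by step 144.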
549
+
550
+ #### All Hyperparameters
551
+ <details><summary>Click to expand</summary>
552
+
553
+ - `overwrite_output_dir`: False
554
+ - `do_predict`: False
555
+ - `eval_strategy`: epoch
556
+ - `prediction_loss_only`: True
557
+ - `per_device_train_batch_size`: 16
558
+ - `per_device_eval_batch_size`: 8
559
+ - `per_gpu_train_batch_size`: None
560
+ - `per_gpu_eval_batch_size`: None
561
+ - `gradient_accumulation_steps`: 8
562
+ - `eval_accumulation_steps`: None
563
+ - `torch_empty_cache_steps`: None
564
+ - `learning_rate`: 2e-05
565
+ - `weight_decay`: 0.0
566
+ - `adam_beta1`: 0.9
567
+ - `adam_beta2`: 0.999
568
+ - `adam_epsilon`: 1e-08
569
+ - `max_grad_norm`: 1.0
570
+ - `num_train_epochs`: 3
571
+ - `max_steps`: -1
572
+ - `lr_scheduler_type`: cosine
573
+ - `lr_scheduler_kwargs`: {}
574
+ - `warmup_ratio`: 0.1
575
+ - `warmup_steps`: 0
576
+ - `log_level`: passive
577
+ - `log_level_replica`: warning
578
+ - `log_on_each_node`: True
579
+ - `logging_nan_inf_filter`: True
580
+ - `save_safetensors`: True
581
+ - `save_on_each_node`: False
582
+ - `save_only_model`: False
583
+ - `restore_callback_states_from_checkpoint`: False
584
+ - `no_cuda`: False
585
+ - `use_cpu`: False
586
+ - `use_mps_device`: False
587
+ - `seed`: 42
588
+ - `data_seed`: None
589
+ - `jit_mode_eval`: False
590
+ - `use_ipex`: False
591
+ - `bf16`: False
592
+ - `fp16`: False
593
+ - `fp16_opt_level`: O1
594
+ - `half_precision_backend`: auto
595
+ - `bf16_full_eval`: False
596
+ - `fp16_full_eval`: False
597
+ - `tf32`: True
598
+ - `local_rank`: 0
599
+ - `ddp_backend`: None
600
+ - `tpu_num_cores`: None
601
+ - `tpu_metrics_debug`: False
602
+ - `debug`: []
603
+ - `dataloader_drop_last`: False
604
+ - `dataloader_num_workers`: 0
605
+ - `dataloader_prefetch_factor`: None
606
+ - `past_index`: -1
607
+ - `disable_tqdm`: False
608
+ - `remove_unused_columns`: True
609
+ - `label_names`: None
610
+ - `load_best_model_at_end`: True
611
+ - `ignore_data_skip`: False
612
+ - `fsdp`: []
613
+ - `fsdp_min_num_params`: 0
614
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
615
+ - `fsdp_transformer_layer_cls_to_wrap`: None
616
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
617
+ - `deepspeed`: None
618
+ - `label_smoothing_factor`: 0.0
619
+ - `optim`: adamw_torch_fused
620
+ - `optim_args`: None
621
+ - `adafactor`: False
622
+ - `group_by_length`: False
623
+ - `length_column_name`: length
624
+ - `ddp_find_unused_parameters`: None
625
+ - `ddp_bucket_cap_mb`: None
626
+ - `ddp_broadcast_buffers`: False
627
+ - `dataloader_pin_memory`: True
628
+ - `dataloader_persistent_workers`: False
629
+ - `skip_memory_metrics`: True
630
+ - `use_legacy_prediction_loop`: False
631
+ - `push_to_hub`: False
632
+ - `resume_from_checkpoint`: None
633
+ - `hub_model_id`: None
634
+ - `hub_strategy`: every_save
635
+ - `hub_private_repo`: None
636
+ - `hub_always_push`: False
637
+ - `gradient_checkpointing`: False
638
+ - `gradient_checkpointing_kwargs`: None
639
+ - `include_inputs_for_metrics`: False
640
+ - `include_for_metrics`: []
641
+ - `eval_do_concat_batches`: True
642
+ - `fp16_backend`: auto
643
+ - `push_to_hub_model_id`: None
644
+ - `push_to_hub_organization`: None
645
+ - `mp_parameters`:
646
+ - `auto_find_batch_size`: False
647
+ - `full_determinism`: False
648
+ - `torchdynamo`: None
649
+ - `ray_scope`: last
650
+ - `ddp_timeout`: 1800
651
+ - `torch_compile`: False
652
+ - `torch_compile_backend`: None
653
+ - `torch_compile_mode`: None
654
+ - `dispatch_batches`: None
655
+ - `split_batches`: None
656
+ - `include_tokens_per_second`: False
657
+ - `include_num_input_tokens_seen`: False
658
+ - `neftune_noise_alpha`: None
659
+ - `optim_target_modules`: None
660
+ - `batch_eval_metrics`: False
661
+ - `eval_on_start`: False
662
+ - `use_liger_kernel`: False
663
+ - `eval_use_gather_object`: False
664
+ - `average_tokens_across_devices`: False
665
+ - `prompts`: None
666
+ - `batch_sampler`: no_duplicates
667
+ - `multi_dataset_batch_sampler`: proportional
668
+
669
+ </details>
670
+
671
+ ### Training Logs
672
+ | Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
673
+ |:-------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
674
+ | 0.1034 | 5 | 29.8712 | - | - | - | - | - |
675
+ | 0.2067 | 10 | 26.1323 | - | - | - | - | - |
676
+ | 0.3101 | 15 | 17.8585 | - | - | - | - | - |
677
+ | 0.4134 | 20 | 14.0232 | - | - | - | - | - |
678
+ | 0.5168 | 25 | 11.6897 | - | - | - | - | - |
679
+ | 0.6202 | 30 | 10.8431 | - | - | - | - | - |
680
+ | 0.7235 | 35 | 9.264 | - | - | - | - | - |
681
+ | 0.8269 | 40 | 11.2186 | - | - | - | - | - |
682
+ | 0.9302 | 45 | 9.9143 | - | - | - | - | - |
683
+ | 1.0 | 49 | - | 0.7134 | 0.7110 | 0.6902 | 0.6341 | 0.5282 |
684
+ | 1.0207 | 50 | 7.2581 | - | - | - | - | - |
685
+ | 1.1240 | 55 | 6.066 | - | - | - | - | - |
686
+ | 1.2274 | 60 | 6.3626 | - | - | - | - | - |
687
+ | 1.3307 | 65 | 6.8135 | - | - | - | - | - |
688
+ | 1.4341 | 70 | 5.5556 | - | - | - | - | - |
689
+ | 1.5375 | 75 | 6.0144 | - | - | - | - | - |
690
+ | 1.6408 | 80 | 6.1965 | - | - | - | - | - |
691
+ | 1.7442 | 85 | 5.596 | - | - | - | - | - |
692
+ | 1.8475 | 90 | 6.631 | - | - | - | - | - |
693
+ | 1.9509 | 95 | 6.3319 | - | - | - | - | - |
694
+ | **2.0** | **98** | **-** | **0.7331** | **0.7304** | **0.7074** | **0.6569** | **0.5477** |
695
+ | 2.0413 | 100 | 4.7382 | - | - | - | - | - |
696
+ | 2.1447 | 105 | 4.1516 | - | - | - | - | - |
697
+ | 2.2481 | 110 | 4.3517 | - | - | - | - | - |
698
+ | 2.3514 | 115 | 3.7044 | - | - | - | - | - |
699
+ | 2.4548 | 120 | 4.1593 | - | - | - | - | - |
700
+ | 2.5581 | 125 | 4.8081 | - | - | - | - | - |
701
+ | 2.6615 | 130 | 3.908 | - | - | - | - | - |
702
+ | 2.7649 | 135 | 3.7684 | - | - | - | - | - |
703
+ | 2.8682 | 140 | 3.8927 | - | - | - | - | - |
704
+ | 2.9509 | 144 | - | 0.7308 | 0.7262 | 0.7078 | 0.6568 | 0.5514 |
705
+
706
+ * The bold row denotes the saved checkpoint.
707
+
708
+ ### Framework Versions
709
+ - Python: 3.13.3
710
+ - Sentence Transformers: 3.4.0
711
+ - Transformers: 4.48.1
712
+ - PyTorch: 2.6.0+cu126
713
+ - Accelerate: 1.3.0
714
+ - Datasets: 3.2.0
715
+ - Tokenizers: 0.21.1
716
+
717
+ ## Citation
718
+
719
+ ### BibTeX
720
+
721
+ #### Sentence Transformers
722
+ ```bibtex
723
+ @inproceedings{reimers-2019-sentence-bert,
724
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
725
+ author = "Reimers, Nils and Gurevych, Iryna",
726
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
727
+ month = "11",
728
+ year = "2019",
729
+ publisher = "Association for Computational Linguistics",
730
+ url = "https://arxiv.org/abs/1908.10084",
731
+ }
732
+ ```
733
+
734
+ #### MatryoshkaLoss
735
+ ```bibtex
736
+ @misc{kusupati2024matryoshka,
737
+ title={Matryoshka Representation Learning},
738
+ author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
739
+ year={2024},
740
+ eprint={2205.13147},
741
+ archivePrefix={arXiv},
742
+ primaryClass={cs.LG}
743
+ }
744
+ ```
745
+
746
+ #### MultipleNegativesRankingLoss
747
+ ```bibtex
748
+ @misc{henderson2017efficient,
749
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
750
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
751
+ year={2017},
752
+ eprint={1705.00652},
753
+ archivePrefix={arXiv},
754
+ primaryClass={cs.CL}
755
+ }
756
+ ```
757
+
758
+ <!--
759
+ ## Glossary
760
+
761
+ *Clearly define terms in order to be accessible across audiences.*
762
+ -->
763
+
764
+ <!--
765
+ ## Model Card Authors
766
+
767
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
768
+ -->
769
+
770
+ <!--
771
+ ## Model Card Contact
772
+
773
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
774
+ -->
checkpoint-144/1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "word_embedding_dimension": 768,
3
+ "pooling_mode_cls_token": false,
4
+ "pooling_mode_mean_tokens": true,
5
+ "pooling_mode_max_tokens": false,
6
+ "pooling_mode_mean_sqrt_len_tokens": false,
7
+ "pooling_mode_weightedmean_tokens": false,
8
+ "pooling_mode_lasttoken": false,
9
+ "include_prompt": true
10
+ }
checkpoint-144/README.md ADDED
@@ -0,0 +1,773 @@
1
+ ---
2
+ language:
3
+ - en
4
+ license: apache-2.0
5
+ tags:
6
+ - sentence-transformers
7
+ - sentence-similarity
8
+ - feature-extraction
9
+ - generated_from_trainer
10
+ - dataset_size:6190
11
+ - loss:MatryoshkaLoss
12
+ - loss:MultipleNegativesRankingLoss
13
+ base_model: nomic-ai/modernbert-embed-base
14
+ widget:
15
+ - source_sentence: What is the duration of the period mentioned in the text?
16
+ sentences:
17
+ - . The only excep Ɵon to the requirement that the plainƟff must be a lending i nsƟtuƟon
18
+ in order to invoke the provisions of the Act is contained in SecƟon 25, in terms
19
+ of which a person who inter alia knowingly draws a cheque which is subsequently
20
+ dishonoured by the bank for want of funds is guilty of an offence under the Act,
21
+ and proceedings can be insƟtuted against such person in the Magistrate’s
22
+ - '? The 1st question of law is formulated on the basis that , the 1st Defendant
23
+ is the licensee of the 2nd Defendant and therefore, the 1st Defendant cannot claim
24
+ prescriptive title to the subject matter'
25
+ - .50,000/ - (that is , a period of 36 months) but such “Facility” is subject to
26
+ review on 30 /09/2000”, (that is, a period of about only 5 months from the date
27
+ of P4)
28
+ - source_sentence: What is the purpose of the disposition of the property by Lanka
29
+ Tractors Limited as mentioned in the text?
30
+ sentences:
31
+ - . (3) is whether the said disposition of the property by Lanka Tractors Limited
32
+ was done with the sole object of defrauding its creditors. Section 348 of the
33
+ Companies Act which describes about Fraudulent reference would be relevant in
34
+ this regard
35
+ - . In the arbitration process, the Government is not involved; the court system
36
+ is not involved (except as provided for in the Act); the parties do not have to
37
+ rely on any Government institution for resolution of their dispute. Process of
38
+ conducting the arbitration, venue, time, mode of adducing evidence are all decided
39
+ by agreement of parties
40
+ - . This is broadly similar to the provision in the summary procedure on liquid
41
+ claims. The amendment in clause 8 of the Bill, repeals the defini Ɵon of the term
42
+ ‘debt’ in sec Ɵon 30. The subs Ɵtuted defini Ɵon excludes the words referred to
43
+ above which limit its applicability to money owed under a promise or agreement
44
+ which is in wri Ɵng
45
+ - source_sentence: What is one of the topics covered in the training program?
46
+ sentences:
47
+ - . The resul Ɵng posiƟon is that the court would not have any wri Ʃen evidence
48
+ of the commitment on the part of the debtor when it issues decree nisi in the
49
+ first instance
50
+ - '? Before this C ourt, there is no dispute on the manner in which the appellant
51
+ obtained the title of the land in question'
52
+ - . Detail reporting procedures to government of Sri Lanka’s contact points. - 4
53
+ Weeks Phase 3 Training of Port Facility Security Officers SATHSINDU/BAGNOLD undertakes
54
+ to design a training program and conducted aid program for up to ten persons.
55
+ • Understanding the reasons for the ISPS code • ISPS Code content and requirements.
56
+ • Understanding the ISPS Code
57
+ - source_sentence: What type of action was taken by the Divisional Secretary?
58
+ sentences:
59
+ - .2020 was also sent by the Divisional Secretary of Th amankaduwa imposing similar
60
+ restrictions as by the Polonnaruwa Pradeshiya Sabha
61
+ - . When Seylan Bank published the resolution of its board of directors which exercised
62
+ its powers of Parate Execution in the newspaper on 10th March 2006-, HNB had made
63
+ the application dated 21st March [SC Appeal No. 85A /2009 ] Page 6 of 25 2006
64
+ to the District Court of Colombo in terms of Sections 260, 261, 348, 359 and 352
65
+ of the Companies Act No
66
+ - . Having regard to the above -mentioned stipulated circumstances , I consider
67
+ the facts put forward for the appellant , seeking a reduction of sentence. The
68
+ offence was committed in 2004. The appellant had been in remand custody for more
69
+ than three years and the appell ant did not have any previous convictions
70
+ - source_sentence: What is described in Section 25 of the Arbitration Act?
71
+ sentences:
72
+ - . But where a matter is within the plenary jurisdiction of the Court if no objection
73
+ is taken, the Court will then have jurisdiction to proceed on with the matter
74
+ and make a valid order.” 14 31. Further , in the case of Don Tilakaratne v
75
+ - '. (3) The provision of subsections (1) and (2) shall apply only to the extent
76
+ agreed to by the parties. (4) The arbitral tribunal shall decide according to
77
+ considerations of general justice and fairness or trade usages only if the parties
78
+ have expressly authorised it to do so. Section 25 of the Arbitration Act describes
79
+ the form and content of the arbitral award as follows: 25'
80
+ - '. 9 and 10 based on the objection taken to them by the Counsel for HNB, despite
81
+ the fact that they did not arise from the pleadings, and were altogether inconsistent
82
+ with them, answered the afore-stated question of law (in respect of which this
83
+ Court had granted Leave to Appeal in that case) in the affirmative and in favour
84
+ of HNB, and stated as follows: “In conclusion, it needs to be emphasised'
85
+ pipeline_tag: sentence-similarity
86
+ library_name: sentence-transformers
87
+ metrics:
88
+ - cosine_accuracy@1
89
+ - cosine_accuracy@3
90
+ - cosine_accuracy@5
91
+ - cosine_accuracy@10
92
+ - cosine_precision@1
93
+ - cosine_precision@3
94
+ - cosine_precision@5
95
+ - cosine_precision@10
96
+ - cosine_recall@1
97
+ - cosine_recall@3
98
+ - cosine_recall@5
99
+ - cosine_recall@10
100
+ - cosine_ndcg@10
101
+ - cosine_mrr@10
102
+ - cosine_map@100
103
+ model-index:
104
+ - name: Fine-tuned with [QuicKB](https://github.com/ALucek/QuicKB)
105
+ results:
106
+ - task:
107
+ type: information-retrieval
108
+ name: Information Retrieval
109
+ dataset:
110
+ name: dim 768
111
+ type: dim_768
112
+ metrics:
113
+ - type: cosine_accuracy@1
114
+ value: 0.5741279069767442
115
+ name: Cosine Accuracy@1
116
+ - type: cosine_accuracy@3
117
+ value: 0.7616279069767442
118
+ name: Cosine Accuracy@3
119
+ - type: cosine_accuracy@5
120
+ value: 0.8197674418604651
121
+ name: Cosine Accuracy@5
122
+ - type: cosine_accuracy@10
123
+ value: 0.8851744186046512
124
+ name: Cosine Accuracy@10
125
+ - type: cosine_precision@1
126
+ value: 0.5741279069767442
127
+ name: Cosine Precision@1
128
+ - type: cosine_precision@3
129
+ value: 0.25387596899224807
130
+ name: Cosine Precision@3
131
+ - type: cosine_precision@5
132
+ value: 0.163953488372093
133
+ name: Cosine Precision@5
134
+ - type: cosine_precision@10
135
+ value: 0.0885174418604651
136
+ name: Cosine Precision@10
137
+ - type: cosine_recall@1
138
+ value: 0.5741279069767442
139
+ name: Cosine Recall@1
140
+ - type: cosine_recall@3
141
+ value: 0.7616279069767442
142
+ name: Cosine Recall@3
143
+ - type: cosine_recall@5
144
+ value: 0.8197674418604651
145
+ name: Cosine Recall@5
146
+ - type: cosine_recall@10
147
+ value: 0.8851744186046512
148
+ name: Cosine Recall@10
149
+ - type: cosine_ndcg@10
150
+ value: 0.7308126785084815
151
+ name: Cosine Ndcg@10
152
+ - type: cosine_mrr@10
153
+ value: 0.6812459625322997
154
+ name: Cosine Mrr@10
155
+ - type: cosine_map@100
156
+ value: 0.6852483059452662
157
+ name: Cosine Map@100
158
+ - task:
159
+ type: information-retrieval
160
+ name: Information Retrieval
161
+ dataset:
162
+ name: dim 512
163
+ type: dim_512
164
+ metrics:
165
+ - type: cosine_accuracy@1
166
+ value: 0.5741279069767442
167
+ name: Cosine Accuracy@1
168
+ - type: cosine_accuracy@3
169
+ value: 0.7630813953488372
170
+ name: Cosine Accuracy@3
171
+ - type: cosine_accuracy@5
172
+ value: 0.8212209302325582
173
+ name: Cosine Accuracy@5
174
+ - type: cosine_accuracy@10
175
+ value: 0.875
176
+ name: Cosine Accuracy@10
177
+ - type: cosine_precision@1
178
+ value: 0.5741279069767442
179
+ name: Cosine Precision@1
180
+ - type: cosine_precision@3
181
+ value: 0.2543604651162791
182
+ name: Cosine Precision@3
183
+ - type: cosine_precision@5
184
+ value: 0.16424418604651161
185
+ name: Cosine Precision@5
186
+ - type: cosine_precision@10
187
+ value: 0.0875
188
+ name: Cosine Precision@10
189
+ - type: cosine_recall@1
190
+ value: 0.5741279069767442
191
+ name: Cosine Recall@1
192
+ - type: cosine_recall@3
193
+ value: 0.7630813953488372
194
+ name: Cosine Recall@3
195
+ - type: cosine_recall@5
196
+ value: 0.8212209302325582
197
+ name: Cosine Recall@5
198
+ - type: cosine_recall@10
199
+ value: 0.875
200
+ name: Cosine Recall@10
201
+ - type: cosine_ndcg@10
202
+ value: 0.726227401269234
203
+ name: Cosine Ndcg@10
204
+ - type: cosine_mrr@10
205
+ value: 0.6782132475083055
206
+ name: Cosine Mrr@10
207
+ - type: cosine_map@100
208
+ value: 0.6827936993080407
209
+ name: Cosine Map@100
210
+ - task:
211
+ type: information-retrieval
212
+ name: Information Retrieval
213
+ dataset:
214
+ name: dim 256
215
+ type: dim_256
216
+ metrics:
217
+ - type: cosine_accuracy@1
218
+ value: 0.5552325581395349
219
+ name: Cosine Accuracy@1
220
+ - type: cosine_accuracy@3
221
+ value: 0.7281976744186046
222
+ name: Cosine Accuracy@3
223
+ - type: cosine_accuracy@5
224
+ value: 0.7921511627906976
225
+ name: Cosine Accuracy@5
226
+ - type: cosine_accuracy@10
227
+ value: 0.8619186046511628
228
+ name: Cosine Accuracy@10
229
+ - type: cosine_precision@1
230
+ value: 0.5552325581395349
231
+ name: Cosine Precision@1
232
+ - type: cosine_precision@3
233
+ value: 0.24273255813953487
234
+ name: Cosine Precision@3
235
+ - type: cosine_precision@5
236
+ value: 0.15843023255813954
237
+ name: Cosine Precision@5
238
+ - type: cosine_precision@10
239
+ value: 0.08619186046511627
240
+ name: Cosine Precision@10
241
+ - type: cosine_recall@1
242
+ value: 0.5552325581395349
243
+ name: Cosine Recall@1
244
+ - type: cosine_recall@3
245
+ value: 0.7281976744186046
246
+ name: Cosine Recall@3
247
+ - type: cosine_recall@5
248
+ value: 0.7921511627906976
249
+ name: Cosine Recall@5
250
+ - type: cosine_recall@10
251
+ value: 0.8619186046511628
252
+ name: Cosine Recall@10
253
+ - type: cosine_ndcg@10
254
+ value: 0.7077790398550751
255
+ name: Cosine Ndcg@10
256
+ - type: cosine_mrr@10
257
+ value: 0.6585646225544481
258
+ name: Cosine Mrr@10
259
+ - type: cosine_map@100
260
+ value: 0.6630890497309057
261
+ name: Cosine Map@100
262
+ - task:
263
+ type: information-retrieval
264
+ name: Information Retrieval
265
+ dataset:
266
+ name: dim 128
267
+ type: dim_128
268
+ metrics:
269
+ - type: cosine_accuracy@1
270
+ value: 0.49709302325581395
271
+ name: Cosine Accuracy@1
272
+ - type: cosine_accuracy@3
273
+ value: 0.6758720930232558
274
+ name: Cosine Accuracy@3
275
+ - type: cosine_accuracy@5
276
+ value: 0.7354651162790697
277
+ name: Cosine Accuracy@5
278
+ - type: cosine_accuracy@10
279
+ value: 0.8241279069767442
280
+ name: Cosine Accuracy@10
281
+ - type: cosine_precision@1
282
+ value: 0.49709302325581395
283
+ name: Cosine Precision@1
284
+ - type: cosine_precision@3
285
+ value: 0.22529069767441862
286
+ name: Cosine Precision@3
287
+ - type: cosine_precision@5
288
+ value: 0.14709302325581394
289
+ name: Cosine Precision@5
290
+ - type: cosine_precision@10
291
+ value: 0.08241279069767442
292
+ name: Cosine Precision@10
293
+ - type: cosine_recall@1
294
+ value: 0.49709302325581395
295
+ name: Cosine Recall@1
296
+ - type: cosine_recall@3
297
+ value: 0.6758720930232558
298
+ name: Cosine Recall@3
299
+ - type: cosine_recall@5
300
+ value: 0.7354651162790697
301
+ name: Cosine Recall@5
302
+ - type: cosine_recall@10
303
+ value: 0.8241279069767442
304
+ name: Cosine Recall@10
305
+ - type: cosine_ndcg@10
306
+ value: 0.6567813216281579
307
+ name: Cosine Ndcg@10
308
+ - type: cosine_mrr@10
309
+ value: 0.6037779162052417
310
+ name: Cosine Mrr@10
311
+ - type: cosine_map@100
312
+ value: 0.6090388181529673
313
+ name: Cosine Map@100
314
+ - task:
315
+ type: information-retrieval
316
+ name: Information Retrieval
317
+ dataset:
318
+ name: dim 64
319
+ type: dim_64
320
+ metrics:
321
+ - type: cosine_accuracy@1
322
+ value: 0.39680232558139533
323
+ name: Cosine Accuracy@1
324
+ - type: cosine_accuracy@3
325
+ value: 0.5581395348837209
326
+ name: Cosine Accuracy@3
327
+ - type: cosine_accuracy@5
328
+ value: 0.622093023255814
329
+ name: Cosine Accuracy@5
330
+ - type: cosine_accuracy@10
331
+ value: 0.7252906976744186
332
+ name: Cosine Accuracy@10
333
+ - type: cosine_precision@1
334
+ value: 0.39680232558139533
335
+ name: Cosine Precision@1
336
+ - type: cosine_precision@3
337
+ value: 0.18604651162790695
338
+ name: Cosine Precision@3
339
+ - type: cosine_precision@5
340
+ value: 0.12441860465116278
341
+ name: Cosine Precision@5
342
+ - type: cosine_precision@10
343
+ value: 0.07252906976744186
344
+ name: Cosine Precision@10
345
+ - type: cosine_recall@1
346
+ value: 0.39680232558139533
347
+ name: Cosine Recall@1
348
+ - type: cosine_recall@3
349
+ value: 0.5581395348837209
350
+ name: Cosine Recall@3
351
+ - type: cosine_recall@5
352
+ value: 0.622093023255814
353
+ name: Cosine Recall@5
354
+ - type: cosine_recall@10
355
+ value: 0.7252906976744186
356
+ name: Cosine Recall@10
357
+ - type: cosine_ndcg@10
358
+ value: 0.5513541983050395
359
+ name: Cosine Ndcg@10
360
+ - type: cosine_mrr@10
361
+ value: 0.497020348837209
362
+ name: Cosine Mrr@10
363
+ - type: cosine_map@100
364
+ value: 0.5050183064129367
365
+ name: Cosine Map@100
366
+ ---
367
+
368
+ # Fine-tuned with [QuicKB](https://github.com/ALucek/QuicKB)
369
+
370
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
371
+
372
+ ## Model Details
373
+
374
+ ### Model Description
375
+ - **Model Type:** Sentence Transformer
376
+ - **Base model:** [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) <!-- at revision d556a88e332558790b210f7bdbe87da2fa94a8d8 -->
377
+ - **Maximum Sequence Length:** 512 tokens
378
+ - **Output Dimensionality:** 768 dimensions
379
+ - **Similarity Function:** Cosine Similarity
380
+ <!-- - **Training Dataset:** Unknown -->
381
+ - **Language:** en
382
+ - **License:** apache-2.0
383
+
384
+ ### Model Sources
385
+
386
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
387
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
388
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
389
+
390
+ ### Full Model Architecture
391
+
392
+ ```
393
+ SentenceTransformer(
394
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: ModernBertModel
395
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
396
+ (2): Normalize()
397
+ )
398
+ ```
399
+
400
+ ## Usage
401
+
402
+ ### Direct Usage (Sentence Transformers)
403
+
404
+ First install the Sentence Transformers library:
405
+
406
+ ```bash
407
+ pip install -U sentence-transformers
408
+ ```
409
+
410
+ Then you can load this model and run inference.
411
+ ```python
412
+ from sentence_transformers import SentenceTransformer
413
+
414
+ # Download from the 🤗 Hub
415
+ model = SentenceTransformer("sentence_transformers_model_id")
416
+ # Run inference
417
+ sentences = [
418
+ 'What is described in Section 25 of the Arbitration Act?',
419
+ '. (3) The provision of subsections (1) and (2) shall apply only to the extent agreed to by the parties. (4) The arbitral tribunal shall decide according to considerations of general justice and fairness or trade usages only if the parties have expressly authorised it to do so. Section 25 of the Arbitration Act describes the form and content of the arbitral award as follows: 25',
420
+ '. 9 and 10 based on the objection taken to them by the Counsel for HNB, despite the fact that they did not arise from the pleadings, and were altogether inconsistent with them, answered the afore-stated question of law (in respect of which this Court had granted Leave to Appeal in that case) in the affirmative and in favour of HNB, and stated as follows: “In conclusion, it needs to be emphasised',
421
+ ]
422
+ embeddings = model.encode(sentences)
423
+ print(embeddings.shape)
424
+ # (3, 768)
425
+
426
+ # Get the similarity scores for the embeddings
427
+ similarities = model.similarity(embeddings, embeddings)
428
+ print(similarities.shape)
429
+ # torch.Size([3, 3])
430
+ ```
431
+
432
+ <!--
433
+ ### Direct Usage (Transformers)
434
+
435
+ <details><summary>Click to see the direct usage in Transformers</summary>
436
+
437
+ </details>
438
+ -->
439
+
440
+ <!--
441
+ ### Downstream Usage (Sentence Transformers)
442
+
443
+ You can finetune this model on your own dataset.
444
+
445
+ <details><summary>Click to expand</summary>
446
+
447
+ </details>
448
+ -->
449
+
450
+ <!--
451
+ ### Out-of-Scope Use
452
+
453
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
454
+ -->
455
+
456
+ ## Evaluation
457
+
458
+ ### Metrics
459
+
460
+ #### Information Retrieval
461
+
462
+ * Datasets: `dim_768`, `dim_512`, `dim_256`, `dim_128` and `dim_64`
463
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
464
+
465
+ | Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
466
+ |:--------------------|:-----------|:-----------|:-----------|:-----------|:-----------|
467
+ | cosine_accuracy@1 | 0.5741 | 0.5741 | 0.5552 | 0.4971 | 0.3968 |
468
+ | cosine_accuracy@3 | 0.7616 | 0.7631 | 0.7282 | 0.6759 | 0.5581 |
469
+ | cosine_accuracy@5 | 0.8198 | 0.8212 | 0.7922 | 0.7355 | 0.6221 |
470
+ | cosine_accuracy@10 | 0.8852 | 0.875 | 0.8619 | 0.8241 | 0.7253 |
471
+ | cosine_precision@1 | 0.5741 | 0.5741 | 0.5552 | 0.4971 | 0.3968 |
472
+ | cosine_precision@3 | 0.2539 | 0.2544 | 0.2427 | 0.2253 | 0.186 |
473
+ | cosine_precision@5 | 0.164 | 0.1642 | 0.1584 | 0.1471 | 0.1244 |
474
+ | cosine_precision@10 | 0.0885 | 0.0875 | 0.0862 | 0.0824 | 0.0725 |
475
+ | cosine_recall@1 | 0.5741 | 0.5741 | 0.5552 | 0.4971 | 0.3968 |
476
+ | cosine_recall@3 | 0.7616 | 0.7631 | 0.7282 | 0.6759 | 0.5581 |
477
+ | cosine_recall@5 | 0.8198 | 0.8212 | 0.7922 | 0.7355 | 0.6221 |
478
+ | cosine_recall@10 | 0.8852 | 0.875 | 0.8619 | 0.8241 | 0.7253 |
479
+ | **cosine_ndcg@10** | **0.7308** | **0.7262** | **0.7078** | **0.6568** | **0.5514** |
480
+ | cosine_mrr@10 | 0.6812 | 0.6782 | 0.6586 | 0.6038 | 0.497 |
481
+ | cosine_map@100 | 0.6852 | 0.6828 | 0.6631 | 0.609 | 0.505 |
482
+
483
+ <!--
484
+ ## Bias, Risks and Limitations
485
+
486
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
487
+ -->
488
+
489
+ <!--
490
+ ### Recommendations
491
+
492
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
493
+ -->
494
+
495
+ ## Training Details
496
+
497
+ ### Training Dataset
498
+
499
+ #### Unnamed Dataset
500
+
501
+ * Size: 6,190 training samples
502
+ * Columns: <code>anchor</code> and <code>positive</code>
503
+ * Approximate statistics based on the first 1000 samples:
504
+ | | anchor | positive |
505
+ |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
506
+ | type | string | string |
507
+ | details | <ul><li>min: 7 tokens</li><li>mean: 15.11 tokens</li><li>max: 32 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 69.53 tokens</li><li>max: 214 tokens</li></ul> |
508
+ * Samples:
509
+ | anchor | positive |
510
+ |:---------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
511
+ | <code>How must the District Court exercise its discretion?</code> | <code>imposition of ‘ a’ term; (5) It is not mandatory to impose security, as evinced by the use of the conjunction “or”; (6) In imposing terms, the District Court must be mindful of the objectives of the Act, and its discretion must be exercised judicially</code> |
512
+ | <code>What is the source of the observation made by Christian Appu?</code> | <code>. Christian Appu , (1895) 1 NLR 288 observed that , “possession is "disturbed" either by an action intended to remove the possessor from the land, or by acts which prevent the possessor from enjoying the free and full use of 12 the land of which he is in the course of acquiring the dominion, and which convert his continuous user into a disconnected and divided user ”</code> |
513
+ | <code>What must the defendant do regarding the plaintiff's claim?</code> | <code>. The Court of Appeal in Ramanayake v Sampath Bank Ltd and Others [(1993) 1 Sri LR 145 at page 153] has held that, “The defendant has to deal with the plaintiff’s claim on its merits; it is not competent for the defendant to merely set out technical objections. It is also incumbent on the defendant to reveal his defence, if he has any</code> |
514
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
515
+ ```json
516
+ {
517
+ "loss": "MultipleNegativesRankingLoss",
518
+ "matryoshka_dims": [
519
+ 768,
520
+ 512,
521
+ 256,
522
+ 128,
523
+ 64
524
+ ],
525
+ "matryoshka_weights": [
526
+ 1,
527
+ 1,
528
+ 1,
529
+ 1,
530
+ 1
531
+ ],
532
+ "n_dims_per_step": -1
533
+ }
534
+ ```
535
+
536
+ ### Training Hyperparameters
537
+ #### Non-Default Hyperparameters
538
+
539
+ - `eval_strategy`: epoch
540
+ - `per_device_train_batch_size`: 16
541
+ - `gradient_accumulation_steps`: 8
542
+ - `learning_rate`: 2e-05
543
+ - `lr_scheduler_type`: cosine
544
+ - `warmup_ratio`: 0.1
545
+ - `tf32`: True
546
+ - `load_best_model_at_end`: True
547
+ - `optim`: adamw_torch_fused
548
+ - `batch_sampler`: no_duplicates
549
+
550
+ #### All Hyperparameters
551
+ <details><summary>Click to expand</summary>
552
+
553
+ - `overwrite_output_dir`: False
554
+ - `do_predict`: False
555
+ - `eval_strategy`: epoch
556
+ - `prediction_loss_only`: True
557
+ - `per_device_train_batch_size`: 16
558
+ - `per_device_eval_batch_size`: 8
559
+ - `per_gpu_train_batch_size`: None
560
+ - `per_gpu_eval_batch_size`: None
561
+ - `gradient_accumulation_steps`: 8
562
+ - `eval_accumulation_steps`: None
563
+ - `torch_empty_cache_steps`: None
564
+ - `learning_rate`: 2e-05
565
+ - `weight_decay`: 0.0
566
+ - `adam_beta1`: 0.9
567
+ - `adam_beta2`: 0.999
568
+ - `adam_epsilon`: 1e-08
569
+ - `max_grad_norm`: 1.0
570
+ - `num_train_epochs`: 3
571
+ - `max_steps`: -1
572
+ - `lr_scheduler_type`: cosine
573
+ - `lr_scheduler_kwargs`: {}
574
+ - `warmup_ratio`: 0.1
575
+ - `warmup_steps`: 0
576
+ - `log_level`: passive
577
+ - `log_level_replica`: warning
578
+ - `log_on_each_node`: True
579
+ - `logging_nan_inf_filter`: True
580
+ - `save_safetensors`: True
581
+ - `save_on_each_node`: False
582
+ - `save_only_model`: False
583
+ - `restore_callback_states_from_checkpoint`: False
584
+ - `no_cuda`: False
585
+ - `use_cpu`: False
586
+ - `use_mps_device`: False
587
+ - `seed`: 42
588
+ - `data_seed`: None
589
+ - `jit_mode_eval`: False
590
+ - `use_ipex`: False
591
+ - `bf16`: False
592
+ - `fp16`: False
593
+ - `fp16_opt_level`: O1
594
+ - `half_precision_backend`: auto
595
+ - `bf16_full_eval`: False
596
+ - `fp16_full_eval`: False
597
+ - `tf32`: True
598
+ - `local_rank`: 0
599
+ - `ddp_backend`: None
600
+ - `tpu_num_cores`: None
601
+ - `tpu_metrics_debug`: False
602
+ - `debug`: []
603
+ - `dataloader_drop_last`: False
604
+ - `dataloader_num_workers`: 0
605
+ - `dataloader_prefetch_factor`: None
606
+ - `past_index`: -1
607
+ - `disable_tqdm`: False
608
+ - `remove_unused_columns`: True
609
+ - `label_names`: None
610
+ - `load_best_model_at_end`: True
611
+ - `ignore_data_skip`: False
612
+ - `fsdp`: []
613
+ - `fsdp_min_num_params`: 0
614
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
615
+ - `fsdp_transformer_layer_cls_to_wrap`: None
616
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
617
+ - `deepspeed`: None
618
+ - `label_smoothing_factor`: 0.0
619
+ - `optim`: adamw_torch_fused
620
+ - `optim_args`: None
621
+ - `adafactor`: False
622
+ - `group_by_length`: False
623
+ - `length_column_name`: length
624
+ - `ddp_find_unused_parameters`: None
625
+ - `ddp_bucket_cap_mb`: None
626
+ - `ddp_broadcast_buffers`: False
627
+ - `dataloader_pin_memory`: True
628
+ - `dataloader_persistent_workers`: False
629
+ - `skip_memory_metrics`: True
630
+ - `use_legacy_prediction_loop`: False
631
+ - `push_to_hub`: False
632
+ - `resume_from_checkpoint`: None
633
+ - `hub_model_id`: None
634
+ - `hub_strategy`: every_save
635
+ - `hub_private_repo`: None
636
+ - `hub_always_push`: False
637
+ - `gradient_checkpointing`: False
638
+ - `gradient_checkpointing_kwargs`: None
639
+ - `include_inputs_for_metrics`: False
640
+ - `include_for_metrics`: []
641
+ - `eval_do_concat_batches`: True
642
+ - `fp16_backend`: auto
643
+ - `push_to_hub_model_id`: None
644
+ - `push_to_hub_organization`: None
645
+ - `mp_parameters`:
646
+ - `auto_find_batch_size`: False
647
+ - `full_determinism`: False
648
+ - `torchdynamo`: None
649
+ - `ray_scope`: last
650
+ - `ddp_timeout`: 1800
651
+ - `torch_compile`: False
652
+ - `torch_compile_backend`: None
653
+ - `torch_compile_mode`: None
654
+ - `dispatch_batches`: None
655
+ - `split_batches`: None
656
+ - `include_tokens_per_second`: False
657
+ - `include_num_input_tokens_seen`: False
658
+ - `neftune_noise_alpha`: None
659
+ - `optim_target_modules`: None
660
+ - `batch_eval_metrics`: False
661
+ - `eval_on_start`: False
662
+ - `use_liger_kernel`: False
663
+ - `eval_use_gather_object`: False
664
+ - `average_tokens_across_devices`: False
665
+ - `prompts`: None
666
+ - `batch_sampler`: no_duplicates
667
+ - `multi_dataset_batch_sampler`: proportional
668
+
669
+ </details>
670
+
671
+ ### Training Logs
+ | Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
+ |:------:|:----:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
+ | 0.1034 | 5 | 29.8712 | - | - | - | - | - |
+ | 0.2067 | 10 | 26.1323 | - | - | - | - | - |
+ | 0.3101 | 15 | 17.8585 | - | - | - | - | - |
+ | 0.4134 | 20 | 14.0232 | - | - | - | - | - |
+ | 0.5168 | 25 | 11.6897 | - | - | - | - | - |
+ | 0.6202 | 30 | 10.8431 | - | - | - | - | - |
+ | 0.7235 | 35 | 9.264 | - | - | - | - | - |
+ | 0.8269 | 40 | 11.2186 | - | - | - | - | - |
+ | 0.9302 | 45 | 9.9143 | - | - | - | - | - |
+ | 1.0 | 49 | - | 0.7134 | 0.7110 | 0.6902 | 0.6341 | 0.5282 |
+ | 1.0207 | 50 | 7.2581 | - | - | - | - | - |
+ | 1.1240 | 55 | 6.066 | - | - | - | - | - |
+ | 1.2274 | 60 | 6.3626 | - | - | - | - | - |
+ | 1.3307 | 65 | 6.8135 | - | - | - | - | - |
+ | 1.4341 | 70 | 5.5556 | - | - | - | - | - |
+ | 1.5375 | 75 | 6.0144 | - | - | - | - | - |
+ | 1.6408 | 80 | 6.1965 | - | - | - | - | - |
+ | 1.7442 | 85 | 5.596 | - | - | - | - | - |
+ | 1.8475 | 90 | 6.631 | - | - | - | - | - |
+ | 1.9509 | 95 | 6.3319 | - | - | - | - | - |
+ | 2.0 | 98 | - | 0.7331 | 0.7304 | 0.7074 | 0.6569 | 0.5477 |
+ | 2.0413 | 100 | 4.7382 | - | - | - | - | - |
+ | 2.1447 | 105 | 4.1516 | - | - | - | - | - |
+ | 2.2481 | 110 | 4.3517 | - | - | - | - | - |
+ | 2.3514 | 115 | 3.7044 | - | - | - | - | - |
+ | 2.4548 | 120 | 4.1593 | - | - | - | - | - |
+ | 2.5581 | 125 | 4.8081 | - | - | - | - | - |
+ | 2.6615 | 130 | 3.908 | - | - | - | - | - |
+ | 2.7649 | 135 | 3.7684 | - | - | - | - | - |
+ | 2.8682 | 140 | 3.8927 | - | - | - | - | - |
+ | 2.9509 | 144 | - | 0.7308 | 0.7262 | 0.7078 | 0.6568 | 0.5514 |
705
+
706
+
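The evaluation rows in the Training Logs table can be related to checkpoint selection (`load_best_model_at_end` is `True` in the hyperparameters above). The sketch below picks the step with the highest NDCG@10 for a chosen dimension. Treating `dim_128_cosine_ndcg@10` as the selection metric is an assumption: it is inferred from `trainer_state.json` in this commit, which records a `best_metric` of about 0.6569 and `checkpoint-98` as the best checkpoint. `EVAL_ROWS` and `best_step` are illustrative names, not part of any library.

```python
# Evaluation rows copied from the Training Logs table:
# step -> {embedding dimension: cosine NDCG@10}.
EVAL_ROWS = {
    49: {768: 0.7134, 512: 0.7110, 256: 0.6902, 128: 0.6341, 64: 0.5282},
    98: {768: 0.7331, 512: 0.7304, 256: 0.7074, 128: 0.6569, 64: 0.5477},
    144: {768: 0.7308, 512: 0.7262, 256: 0.7078, 128: 0.6568, 64: 0.5514},
}


def best_step(rows, dim):
    """Return the evaluation step with the highest NDCG@10 at the given dimension."""
    return max(rows, key=lambda step: rows[step][dim])


# Assumption: the 128-dimension metric drives checkpoint selection.
print("best step at dim 128:", best_step(EVAL_ROWS, 128))

# Other dimensions would not all agree on the same checkpoint.
for dim in (768, 512, 256, 64):
    print(f"best step at dim {dim}:", best_step(EVAL_ROWS, dim))
```

The choice of metric matters here: the 768-, 512-, and 128-dimension columns peak at step 98, while the 256- and 64-dimension columns are marginally higher at step 144.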
707
+ ### Framework Versions
708
+ - Python: 3.13.3
709
+ - Sentence Transformers: 3.4.0
710
+ - Transformers: 4.48.1
711
+ - PyTorch: 2.6.0+cu126
712
+ - Accelerate: 1.3.0
713
+ - Datasets: 3.2.0
714
+ - Tokenizers: 0.21.1
715
+
716
+ ## Citation
717
+
718
+ ### BibTeX
719
+
720
+ #### Sentence Transformers
721
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
732
+
733
+ #### MatryoshkaLoss
734
+ ```bibtex
+ @misc{kusupati2024matryoshka,
+     title={Matryoshka Representation Learning},
+     author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
+     year={2024},
+     eprint={2205.13147},
+     archivePrefix={arXiv},
+     primaryClass={cs.LG}
+ }
+ ```
744
+
745
+ #### MultipleNegativesRankingLoss
746
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
756
+
757
+ <!--
758
+ ## Glossary
759
+
760
+ *Clearly define terms in order to be accessible across audiences.*
761
+ -->
762
+
763
+ <!--
764
+ ## Model Card Authors
765
+
766
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
767
+ -->
768
+
769
+ <!--
770
+ ## Model Card Contact
771
+
772
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
773
+ -->
checkpoint-144/config.json ADDED
@@ -0,0 +1,47 @@
+ {
+   "_name_or_path": "nomic-ai/modernbert-embed-base",
+   "architectures": [
+     "ModernBertModel"
+   ],
+   "attention_bias": false,
+   "attention_dropout": 0.0,
+   "bos_token_id": 50281,
+   "classifier_activation": "gelu",
+   "classifier_bias": false,
+   "classifier_dropout": 0.0,
+   "classifier_pooling": "mean",
+   "cls_token_id": 50281,
+   "decoder_bias": true,
+   "deterministic_flash_attn": false,
+   "embedding_dropout": 0.0,
+   "eos_token_id": 50282,
+   "global_attn_every_n_layers": 3,
+   "global_rope_theta": 160000.0,
+   "gradient_checkpointing": false,
+   "hidden_activation": "gelu",
+   "hidden_size": 768,
+   "initializer_cutoff_factor": 2.0,
+   "initializer_range": 0.02,
+   "intermediate_size": 1152,
+   "layer_norm_eps": 1e-05,
+   "local_attention": 128,
+   "local_rope_theta": 10000.0,
+   "max_position_embeddings": 8192,
+   "mlp_bias": false,
+   "mlp_dropout": 0.0,
+   "model_type": "modernbert",
+   "norm_bias": false,
+   "norm_eps": 1e-05,
+   "num_attention_heads": 12,
+   "num_hidden_layers": 22,
+   "pad_token_id": 50283,
+   "position_embedding_type": "absolute",
+   "reference_compile": false,
+   "repad_logits_with_grad": false,
+   "sep_token_id": 50282,
+   "sparse_pred_ignore_index": -100,
+   "sparse_prediction": false,
+   "torch_dtype": "float32",
+   "transformers_version": "4.48.1",
+   "vocab_size": 50368
+ }
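As a hedged illustration of what `global_attn_every_n_layers`, `num_hidden_layers`, and `local_attention` imply together, the sketch below lists which layers would use global attention. It assumes the convention that a layer is global when its index is divisible by `global_attn_every_n_layers` and local otherwise; that rule is an assumption about the ModernBERT implementation, not something stated in this config, and `attention_layout` is a hypothetical helper.

```python
# Values copied from checkpoint-144/config.json.
NUM_HIDDEN_LAYERS = 22
GLOBAL_ATTN_EVERY_N_LAYERS = 3
LOCAL_ATTENTION = 128
HIDDEN_SIZE = 768
NUM_ATTENTION_HEADS = 12


def attention_layout(num_layers, every_n):
    """Label each layer 'global' or 'local'.

    Assumption: layer i is global when i % every_n == 0, local otherwise.
    """
    return ["global" if i % every_n == 0 else "local" for i in range(num_layers)]


layout = attention_layout(NUM_HIDDEN_LAYERS, GLOBAL_ATTN_EVERY_N_LAYERS)
global_layers = [i for i, kind in enumerate(layout) if kind == "global"]
head_dim = HIDDEN_SIZE // NUM_ATTENTION_HEADS

print("global attention layers:", global_layers)
print("local attention layers:", layout.count("local"),
      f"(window size {LOCAL_ATTENTION} tokens)")
print("per-head dimension:", head_dim)
```

Under that assumption, roughly one layer in three attends over the full sequence, and the rest are restricted to the `local_attention` window.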
checkpoint-144/config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.4.0",
+     "transformers": "4.48.1",
+     "pytorch": "2.6.0+cu126"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
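Since `similarity_fn_name` is `cosine` and the README evaluates the model at 768, 512, 256, 128, and 64 dimensions, a minimal sketch of Matryoshka-style truncation may help: keep the first `dim` components, re-normalize, then compare with cosine similarity. The toy 8-dimensional vectors stand in for real 768-dimensional embeddings, and `truncate_embedding` and `cosine` are illustrative helpers rather than library functions.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and re-normalize the result."""
    return l2_normalize(vec[:dim])


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


# Toy embeddings (assumed values, for illustration only).
query = [0.6, 0.2, -0.1, 0.3, 0.05, -0.02, 0.01, 0.0]
doc = [0.5, 0.3, -0.2, 0.2, -0.04, 0.03, 0.0, 0.01]

for dim in (8, 4, 2):
    q = truncate_embedding(query, dim)
    d = truncate_embedding(doc, dim)
    print(f"dim={dim}: cosine={cosine(q, d):.4f}")
```

Re-normalizing after truncation matters because a prefix of a unit vector is generally no longer unit length. The NDCG@10 columns in the Training Logs suggest how much retrieval quality is traded away at each smaller dimension.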
checkpoint-144/model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:933e1165769b2fa8192e53dc4bc0eaa315668b69973fcbc3392c0e6392741b60
3
+ size 596070136
checkpoint-144/modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
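The three modules above run in sequence: the Transformer produces per-token vectors, Pooling reduces them to one vector, and Normalize scales that vector to unit length. The sketch below imitates the last two stages on toy data. Mean pooling is an assumption here, because the contents of `1_Pooling/config.json` are not shown in this section, and `mean_pool` and `normalize` are illustrative stand-ins for the library modules.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose attention-mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, mask in zip(token_vectors, attention_mask) if mask == 1]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def normalize(vec):
    """Scale a vector to unit L2 norm (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


# Toy stand-in for the Transformer output: three tokens, the last one is padding.
token_vectors = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
attention_mask = [1, 1, 0]

pooled = mean_pool(token_vectors, attention_mask)
embedding = normalize(pooled)

print("pooled:", pooled)
print("embedding norm:", math.sqrt(sum(x * x for x in embedding)))
```

Because the final module normalizes the output, a dot product between two embeddings equals their cosine similarity.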
checkpoint-144/optimizer.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e2bc221e51e7012e5e055b627331c612d195bd121b41fa6ed028ffc174646eed
3
+ size 1192228922
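The two LFS pointer sizes allow a quick sanity check. With `torch_dtype` set to `float32` (4 bytes per value), the `model.safetensors` size gives a rough parameter count, and an AdamW-style optimizer (`optim` is `adamw_torch_fused`) keeps two moment tensors per parameter, so `optimizer.pt` should be about twice the model size. The sketch below does that arithmetic. It ignores file headers and metadata, so the parameter count is a slight overestimate.

```python
# Sizes in bytes, copied from the Git LFS pointers in this commit.
MODEL_BYTES = 596_070_136        # checkpoint-144/model.safetensors
OPTIMIZER_BYTES = 1_192_228_922  # checkpoint-144/optimizer.pt
BYTES_PER_FLOAT32 = 4

# Upper bound on the parameter count: file header bytes are counted as weights.
approx_params = MODEL_BYTES // BYTES_PER_FLOAT32

# AdamW stores first- and second-moment estimates, so a ratio near 2 is expected.
ratio = OPTIMIZER_BYTES / MODEL_BYTES

print(f"approximate parameters: {approx_params / 1e6:.0f}M")
print(f"optimizer/model size ratio: {ratio:.2f}")
```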
checkpoint-144/rng_state.pth ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f545530eb7b4e974806bee09240eef32d4d60ecadb04421eae44d00374477000
3
+ size 14244
checkpoint-144/scheduler.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2480ba04f11f6e263878353ec7c33776d19276a8903eae1e7af8298228a2dd1e
3
+ size 1064
checkpoint-144/sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
checkpoint-144/special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
checkpoint-144/tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
checkpoint-144/tokenizer_config.json ADDED
@@ -0,0 +1,945 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "|||IP_ADDRESS|||",
5
+ "lstrip": false,
6
+ "normalized": true,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": false
10
+ },
11
+ "1": {
12
+ "content": "<|padding|>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "50254": {
20
+ "content": " ",
21
+ "lstrip": false,
22
+ "normalized": true,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": false
26
+ },
27
+ "50255": {
28
+ "content": " ",
29
+ "lstrip": false,
30
+ "normalized": true,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": false
34
+ },
35
+ "50256": {
36
+ "content": " ",
37
+ "lstrip": false,
38
+ "normalized": true,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": false
42
+ },
43
+ "50257": {
44
+ "content": " ",
45
+ "lstrip": false,
46
+ "normalized": true,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": false
50
+ },
51
+ "50258": {
52
+ "content": " ",
53
+ "lstrip": false,
54
+ "normalized": true,
55
+ "rstrip": false,
56
+ "single_word": false,
57
+ "special": false
58
+ },
59
+ "50259": {
60
+ "content": " ",
61
+ "lstrip": false,
62
+ "normalized": true,
63
+ "rstrip": false,
64
+ "single_word": false,
65
+ "special": false
66
+ },
67
+ "50260": {
68
+ "content": " ",
69
+ "lstrip": false,
70
+ "normalized": true,
71
+ "rstrip": false,
72
+ "single_word": false,
73
+ "special": false
74
+ },
75
+ "50261": {
76
+ "content": " ",
77
+ "lstrip": false,
78
+ "normalized": true,
79
+ "rstrip": false,
80
+ "single_word": false,
81
+ "special": false
82
+ },
83
+ "50262": {
84
+ "content": " ",
85
+ "lstrip": false,
86
+ "normalized": true,
87
+ "rstrip": false,
88
+ "single_word": false,
89
+ "special": false
90
+ },
91
+ "50263": {
92
+ "content": " ",
93
+ "lstrip": false,
94
+ "normalized": true,
95
+ "rstrip": false,
96
+ "single_word": false,
97
+ "special": false
98
+ },
99
+ "50264": {
100
+ "content": " ",
101
+ "lstrip": false,
102
+ "normalized": true,
103
+ "rstrip": false,
104
+ "single_word": false,
105
+ "special": false
106
+ },
107
+ "50265": {
108
+ "content": " ",
109
+ "lstrip": false,
110
+ "normalized": true,
111
+ "rstrip": false,
112
+ "single_word": false,
113
+ "special": false
114
+ },
115
+ "50266": {
116
+ "content": " ",
117
+ "lstrip": false,
118
+ "normalized": true,
119
+ "rstrip": false,
120
+ "single_word": false,
121
+ "special": false
122
+ },
123
+ "50267": {
124
+ "content": " ",
125
+ "lstrip": false,
126
+ "normalized": true,
127
+ "rstrip": false,
128
+ "single_word": false,
129
+ "special": false
130
+ },
131
+ "50268": {
132
+ "content": " ",
133
+ "lstrip": false,
134
+ "normalized": true,
135
+ "rstrip": false,
136
+ "single_word": false,
137
+ "special": false
138
+ },
139
+ "50269": {
140
+ "content": " ",
141
+ "lstrip": false,
142
+ "normalized": true,
143
+ "rstrip": false,
144
+ "single_word": false,
145
+ "special": false
146
+ },
147
+ "50270": {
148
+ "content": " ",
149
+ "lstrip": false,
150
+ "normalized": true,
151
+ "rstrip": false,
152
+ "single_word": false,
153
+ "special": false
154
+ },
155
+ "50271": {
156
+ "content": " ",
157
+ "lstrip": false,
158
+ "normalized": true,
159
+ "rstrip": false,
160
+ "single_word": false,
161
+ "special": false
162
+ },
163
+ "50272": {
164
+ "content": " ",
165
+ "lstrip": false,
166
+ "normalized": true,
167
+ "rstrip": false,
168
+ "single_word": false,
169
+ "special": false
170
+ },
171
+ "50273": {
172
+ "content": " ",
173
+ "lstrip": false,
174
+ "normalized": true,
175
+ "rstrip": false,
176
+ "single_word": false,
177
+ "special": false
178
+ },
179
+ "50274": {
180
+ "content": " ",
181
+ "lstrip": false,
182
+ "normalized": true,
183
+ "rstrip": false,
184
+ "single_word": false,
185
+ "special": false
186
+ },
187
+ "50275": {
188
+ "content": " ",
189
+ "lstrip": false,
190
+ "normalized": true,
191
+ "rstrip": false,
192
+ "single_word": false,
193
+ "special": false
194
+ },
195
+ "50276": {
196
+ "content": " ",
197
+ "lstrip": false,
198
+ "normalized": true,
199
+ "rstrip": false,
200
+ "single_word": false,
201
+ "special": false
202
+ },
203
+ "50277": {
204
+ "content": "|||EMAIL_ADDRESS|||",
205
+ "lstrip": false,
206
+ "normalized": true,
207
+ "rstrip": false,
208
+ "single_word": false,
209
+ "special": false
210
+ },
211
+ "50278": {
212
+ "content": "|||PHONE_NUMBER|||",
213
+ "lstrip": false,
214
+ "normalized": true,
215
+ "rstrip": false,
216
+ "single_word": false,
217
+ "special": false
218
+ },
219
+ "50279": {
220
+ "content": "<|endoftext|>",
221
+ "lstrip": false,
222
+ "normalized": false,
223
+ "rstrip": false,
224
+ "single_word": false,
225
+ "special": true
226
+ },
227
+ "50280": {
228
+ "content": "[UNK]",
229
+ "lstrip": false,
230
+ "normalized": false,
231
+ "rstrip": false,
232
+ "single_word": false,
233
+ "special": true
234
+ },
235
+ "50281": {
236
+ "content": "[CLS]",
237
+ "lstrip": false,
238
+ "normalized": false,
239
+ "rstrip": false,
240
+ "single_word": false,
241
+ "special": true
242
+ },
243
+ "50282": {
244
+ "content": "[SEP]",
245
+ "lstrip": false,
246
+ "normalized": false,
247
+ "rstrip": false,
248
+ "single_word": false,
249
+ "special": true
250
+ },
251
+ "50283": {
252
+ "content": "[PAD]",
253
+ "lstrip": false,
254
+ "normalized": false,
255
+ "rstrip": false,
256
+ "single_word": false,
257
+ "special": true
258
+ },
259
+ "50284": {
260
+ "content": "[MASK]",
261
+ "lstrip": true,
262
+ "normalized": false,
263
+ "rstrip": false,
264
+ "single_word": false,
265
+ "special": true
266
+ },
267
+ "50285": {
268
+ "content": "[unused0]",
269
+ "lstrip": false,
270
+ "normalized": true,
271
+ "rstrip": false,
272
+ "single_word": false,
273
+ "special": false
274
+ },
275
+ "50286": {
276
+ "content": "[unused1]",
277
+ "lstrip": false,
278
+ "normalized": true,
279
+ "rstrip": false,
280
+ "single_word": false,
281
+ "special": false
282
+ },
283
+ "50287": {
284
+ "content": "[unused2]",
285
+ "lstrip": false,
286
+ "normalized": true,
287
+ "rstrip": false,
288
+ "single_word": false,
289
+ "special": false
290
+ },
291
+ "50288": {
292
+ "content": "[unused3]",
293
+ "lstrip": false,
294
+ "normalized": true,
295
+ "rstrip": false,
296
+ "single_word": false,
297
+ "special": false
298
+ },
299
+ "50289": {
300
+ "content": "[unused4]",
301
+ "lstrip": false,
302
+ "normalized": true,
303
+ "rstrip": false,
304
+ "single_word": false,
305
+ "special": false
306
+ },
307
+ "50290": {
308
+ "content": "[unused5]",
309
+ "lstrip": false,
310
+ "normalized": true,
311
+ "rstrip": false,
312
+ "single_word": false,
313
+ "special": false
314
+ },
315
+ "50291": {
316
+ "content": "[unused6]",
317
+ "lstrip": false,
318
+ "normalized": true,
319
+ "rstrip": false,
320
+ "single_word": false,
321
+ "special": false
322
+ },
323
+ "50292": {
324
+ "content": "[unused7]",
325
+ "lstrip": false,
326
+ "normalized": true,
327
+ "rstrip": false,
328
+ "single_word": false,
329
+ "special": false
330
+ },
331
+ "50293": {
332
+ "content": "[unused8]",
333
+ "lstrip": false,
334
+ "normalized": true,
335
+ "rstrip": false,
336
+ "single_word": false,
337
+ "special": false
338
+ },
339
+ "50294": {
340
+ "content": "[unused9]",
341
+ "lstrip": false,
342
+ "normalized": true,
343
+ "rstrip": false,
344
+ "single_word": false,
345
+ "special": false
346
+ },
347
+ "50295": {
348
+ "content": "[unused10]",
349
+ "lstrip": false,
350
+ "normalized": true,
351
+ "rstrip": false,
352
+ "single_word": false,
353
+ "special": false
354
+ },
355
+ "50296": {
356
+ "content": "[unused11]",
357
+ "lstrip": false,
358
+ "normalized": true,
359
+ "rstrip": false,
360
+ "single_word": false,
361
+ "special": false
362
+ },
363
+ "50297": {
364
+ "content": "[unused12]",
365
+ "lstrip": false,
366
+ "normalized": true,
367
+ "rstrip": false,
368
+ "single_word": false,
369
+ "special": false
370
+ },
371
+ "50298": {
372
+ "content": "[unused13]",
373
+ "lstrip": false,
374
+ "normalized": true,
375
+ "rstrip": false,
376
+ "single_word": false,
377
+ "special": false
378
+ },
379
+ "50299": {
380
+ "content": "[unused14]",
381
+ "lstrip": false,
382
+ "normalized": true,
383
+ "rstrip": false,
384
+ "single_word": false,
385
+ "special": false
386
+ },
387
+ "50300": {
388
+ "content": "[unused15]",
389
+ "lstrip": false,
390
+ "normalized": true,
391
+ "rstrip": false,
392
+ "single_word": false,
393
+ "special": false
394
+ },
395
+ "50301": {
396
+ "content": "[unused16]",
397
+ "lstrip": false,
398
+ "normalized": true,
399
+ "rstrip": false,
400
+ "single_word": false,
401
+ "special": false
402
+ },
403
+ "50302": {
404
+ "content": "[unused17]",
405
+ "lstrip": false,
406
+ "normalized": true,
407
+ "rstrip": false,
408
+ "single_word": false,
409
+ "special": false
410
+ },
411
+ "50303": {
412
+ "content": "[unused18]",
413
+ "lstrip": false,
414
+ "normalized": true,
415
+ "rstrip": false,
416
+ "single_word": false,
417
+ "special": false
418
+ },
419
+ "50304": {
420
+ "content": "[unused19]",
421
+ "lstrip": false,
422
+ "normalized": true,
423
+ "rstrip": false,
424
+ "single_word": false,
425
+ "special": false
426
+ },
427
+ "50305": {
428
+ "content": "[unused20]",
429
+ "lstrip": false,
430
+ "normalized": true,
431
+ "rstrip": false,
432
+ "single_word": false,
433
+ "special": false
434
+ },
435
+ "50306": {
436
+ "content": "[unused21]",
437
+ "lstrip": false,
438
+ "normalized": true,
439
+ "rstrip": false,
440
+ "single_word": false,
441
+ "special": false
442
+ },
443
+ "50307": {
444
+ "content": "[unused22]",
445
+ "lstrip": false,
446
+ "normalized": true,
447
+ "rstrip": false,
448
+ "single_word": false,
449
+ "special": false
450
+ },
451
+ "50308": {
452
+ "content": "[unused23]",
453
+ "lstrip": false,
454
+ "normalized": true,
455
+ "rstrip": false,
456
+ "single_word": false,
457
+ "special": false
458
+ },
459
+ "50309": {
460
+ "content": "[unused24]",
461
+ "lstrip": false,
462
+ "normalized": true,
463
+ "rstrip": false,
464
+ "single_word": false,
465
+ "special": false
466
+ },
467
+ "50310": {
468
+ "content": "[unused25]",
469
+ "lstrip": false,
470
+ "normalized": true,
471
+ "rstrip": false,
472
+ "single_word": false,
473
+ "special": false
474
+ },
475
+ "50311": {
476
+ "content": "[unused26]",
477
+ "lstrip": false,
478
+ "normalized": true,
479
+ "rstrip": false,
480
+ "single_word": false,
481
+ "special": false
482
+ },
483
+ "50312": {
484
+ "content": "[unused27]",
485
+ "lstrip": false,
486
+ "normalized": true,
487
+ "rstrip": false,
488
+ "single_word": false,
489
+ "special": false
490
+ },
491
+ "50313": {
492
+ "content": "[unused28]",
493
+ "lstrip": false,
494
+ "normalized": true,
495
+ "rstrip": false,
496
+ "single_word": false,
497
+ "special": false
498
+ },
499
+ "50314": {
500
+ "content": "[unused29]",
501
+ "lstrip": false,
502
+ "normalized": true,
503
+ "rstrip": false,
504
+ "single_word": false,
505
+ "special": false
506
+ },
507
+ "50315": {
508
+ "content": "[unused30]",
509
+ "lstrip": false,
510
+ "normalized": true,
511
+ "rstrip": false,
512
+ "single_word": false,
513
+ "special": false
514
+ },
515
+ "50316": {
516
+ "content": "[unused31]",
517
+ "lstrip": false,
518
+ "normalized": true,
519
+ "rstrip": false,
520
+ "single_word": false,
521
+ "special": false
522
+ },
523
+ "50317": {
524
+ "content": "[unused32]",
525
+ "lstrip": false,
526
+ "normalized": true,
527
+ "rstrip": false,
528
+ "single_word": false,
529
+ "special": false
530
+ },
531
+ "50318": {
532
+ "content": "[unused33]",
533
+ "lstrip": false,
534
+ "normalized": true,
535
+ "rstrip": false,
536
+ "single_word": false,
537
+ "special": false
538
+ },
539
+ "50319": {
540
+ "content": "[unused34]",
541
+ "lstrip": false,
542
+ "normalized": true,
543
+ "rstrip": false,
544
+ "single_word": false,
545
+ "special": false
546
+ },
547
+ "50320": {
548
+ "content": "[unused35]",
549
+ "lstrip": false,
550
+ "normalized": true,
551
+ "rstrip": false,
552
+ "single_word": false,
553
+ "special": false
554
+ },
555
+ "50321": {
556
+ "content": "[unused36]",
557
+ "lstrip": false,
558
+ "normalized": true,
559
+ "rstrip": false,
560
+ "single_word": false,
561
+ "special": false
562
+ },
563
+ "50322": {
564
+ "content": "[unused37]",
565
+ "lstrip": false,
566
+ "normalized": true,
567
+ "rstrip": false,
568
+ "single_word": false,
569
+ "special": false
570
+ },
571
+ "50323": {
572
+ "content": "[unused38]",
573
+ "lstrip": false,
574
+ "normalized": true,
575
+ "rstrip": false,
576
+ "single_word": false,
577
+ "special": false
578
+ },
579
+ "50324": {
580
+ "content": "[unused39]",
581
+ "lstrip": false,
582
+ "normalized": true,
583
+ "rstrip": false,
584
+ "single_word": false,
585
+ "special": false
586
+ },
587
+ "50325": {
588
+ "content": "[unused40]",
589
+ "lstrip": false,
590
+ "normalized": true,
591
+ "rstrip": false,
592
+ "single_word": false,
593
+ "special": false
594
+ },
595
+ "50326": {
596
+ "content": "[unused41]",
597
+ "lstrip": false,
598
+ "normalized": true,
599
+ "rstrip": false,
600
+ "single_word": false,
601
+ "special": false
602
+ },
603
+ "50327": {
604
+ "content": "[unused42]",
605
+ "lstrip": false,
606
+ "normalized": true,
607
+ "rstrip": false,
608
+ "single_word": false,
609
+ "special": false
610
+ },
611
+ "50328": {
612
+ "content": "[unused43]",
613
+ "lstrip": false,
614
+ "normalized": true,
615
+ "rstrip": false,
616
+ "single_word": false,
617
+ "special": false
618
+ },
619
+ "50329": {
620
+ "content": "[unused44]",
621
+ "lstrip": false,
622
+ "normalized": true,
623
+ "rstrip": false,
624
+ "single_word": false,
625
+ "special": false
626
+ },
627
+ "50330": {
628
+ "content": "[unused45]",
629
+ "lstrip": false,
630
+ "normalized": true,
631
+ "rstrip": false,
632
+ "single_word": false,
633
+ "special": false
634
+ },
635
+ "50331": {
636
+ "content": "[unused46]",
637
+ "lstrip": false,
638
+ "normalized": true,
639
+ "rstrip": false,
640
+ "single_word": false,
641
+ "special": false
642
+ },
643
+ "50332": {
644
+ "content": "[unused47]",
645
+ "lstrip": false,
646
+ "normalized": true,
647
+ "rstrip": false,
648
+ "single_word": false,
649
+ "special": false
650
+ },
651
+ "50333": {
652
+ "content": "[unused48]",
653
+ "lstrip": false,
654
+ "normalized": true,
655
+ "rstrip": false,
656
+ "single_word": false,
657
+ "special": false
658
+ },
659
+ "50334": {
660
+ "content": "[unused49]",
661
+ "lstrip": false,
662
+ "normalized": true,
663
+ "rstrip": false,
664
+ "single_word": false,
665
+ "special": false
666
+ },
667
+ "50335": {
668
+ "content": "[unused50]",
669
+ "lstrip": false,
670
+ "normalized": true,
671
+ "rstrip": false,
672
+ "single_word": false,
673
+ "special": false
674
+ },
675
+ "50336": {
676
+ "content": "[unused51]",
677
+ "lstrip": false,
678
+ "normalized": true,
679
+ "rstrip": false,
680
+ "single_word": false,
681
+ "special": false
682
+ },
683
+ "50337": {
684
+ "content": "[unused52]",
685
+ "lstrip": false,
686
+ "normalized": true,
687
+ "rstrip": false,
688
+ "single_word": false,
689
+ "special": false
690
+ },
691
+ "50338": {
692
+ "content": "[unused53]",
693
+ "lstrip": false,
694
+ "normalized": true,
695
+ "rstrip": false,
696
+ "single_word": false,
697
+ "special": false
698
+ },
699
+ "50339": {
700
+ "content": "[unused54]",
701
+ "lstrip": false,
702
+ "normalized": true,
703
+ "rstrip": false,
704
+ "single_word": false,
705
+ "special": false
706
+ },
707
+ "50340": {
708
+ "content": "[unused55]",
709
+ "lstrip": false,
710
+ "normalized": true,
711
+ "rstrip": false,
712
+ "single_word": false,
713
+ "special": false
714
+ },
715
+ "50341": {
716
+ "content": "[unused56]",
717
+ "lstrip": false,
718
+ "normalized": true,
719
+ "rstrip": false,
720
+ "single_word": false,
721
+ "special": false
722
+ },
723
+ "50342": {
724
+ "content": "[unused57]",
725
+ "lstrip": false,
726
+ "normalized": true,
727
+ "rstrip": false,
728
+ "single_word": false,
729
+ "special": false
730
+ },
731
+ "50343": {
732
+ "content": "[unused58]",
733
+ "lstrip": false,
734
+ "normalized": true,
735
+ "rstrip": false,
736
+ "single_word": false,
737
+ "special": false
738
+ },
739
+ "50344": {
740
+ "content": "[unused59]",
741
+ "lstrip": false,
742
+ "normalized": true,
743
+ "rstrip": false,
744
+ "single_word": false,
745
+ "special": false
746
+ },
747
+ "50345": {
748
+ "content": "[unused60]",
749
+ "lstrip": false,
750
+ "normalized": true,
751
+ "rstrip": false,
752
+ "single_word": false,
753
+ "special": false
754
+ },
755
+ "50346": {
756
+ "content": "[unused61]",
757
+ "lstrip": false,
758
+ "normalized": true,
759
+ "rstrip": false,
760
+ "single_word": false,
761
+ "special": false
762
+ },
763
+ "50347": {
764
+ "content": "[unused62]",
765
+ "lstrip": false,
766
+ "normalized": true,
767
+ "rstrip": false,
768
+ "single_word": false,
769
+ "special": false
770
+ },
771
+ "50348": {
772
+ "content": "[unused63]",
773
+ "lstrip": false,
774
+ "normalized": true,
775
+ "rstrip": false,
776
+ "single_word": false,
777
+ "special": false
778
+ },
779
+ "50349": {
780
+ "content": "[unused64]",
781
+ "lstrip": false,
782
+ "normalized": true,
783
+ "rstrip": false,
784
+ "single_word": false,
785
+ "special": false
786
+ },
787
+ "50350": {
788
+ "content": "[unused65]",
789
+ "lstrip": false,
790
+ "normalized": true,
791
+ "rstrip": false,
792
+ "single_word": false,
793
+ "special": false
794
+ },
795
+ "50351": {
796
+ "content": "[unused66]",
797
+ "lstrip": false,
798
+ "normalized": true,
799
+ "rstrip": false,
800
+ "single_word": false,
801
+ "special": false
802
+ },
803
+ "50352": {
804
+ "content": "[unused67]",
805
+ "lstrip": false,
806
+ "normalized": true,
807
+ "rstrip": false,
808
+ "single_word": false,
809
+ "special": false
810
+ },
811
+ "50353": {
812
+ "content": "[unused68]",
813
+ "lstrip": false,
814
+ "normalized": true,
815
+ "rstrip": false,
816
+ "single_word": false,
817
+ "special": false
818
+ },
819
+ "50354": {
820
+ "content": "[unused69]",
821
+ "lstrip": false,
822
+ "normalized": true,
823
+ "rstrip": false,
824
+ "single_word": false,
825
+ "special": false
826
+ },
827
+ "50355": {
828
+ "content": "[unused70]",
829
+ "lstrip": false,
830
+ "normalized": true,
831
+ "rstrip": false,
832
+ "single_word": false,
833
+ "special": false
834
+ },
835
+ "50356": {
836
+ "content": "[unused71]",
837
+ "lstrip": false,
838
+ "normalized": true,
839
+ "rstrip": false,
840
+ "single_word": false,
841
+ "special": false
842
+ },
843
+ "50357": {
844
+ "content": "[unused72]",
845
+ "lstrip": false,
846
+ "normalized": true,
847
+ "rstrip": false,
848
+ "single_word": false,
849
+ "special": false
850
+ },
851
+ "50358": {
852
+ "content": "[unused73]",
853
+ "lstrip": false,
854
+ "normalized": true,
855
+ "rstrip": false,
856
+ "single_word": false,
857
+ "special": false
858
+ },
859
+ "50359": {
860
+ "content": "[unused74]",
861
+ "lstrip": false,
862
+ "normalized": true,
863
+ "rstrip": false,
864
+ "single_word": false,
865
+ "special": false
866
+ },
867
+ "50360": {
868
+ "content": "[unused75]",
869
+ "lstrip": false,
870
+ "normalized": true,
871
+ "rstrip": false,
872
+ "single_word": false,
873
+ "special": false
874
+ },
875
+ "50361": {
876
+ "content": "[unused76]",
877
+ "lstrip": false,
878
+ "normalized": true,
879
+ "rstrip": false,
880
+ "single_word": false,
881
+ "special": false
882
+ },
883
+ "50362": {
884
+ "content": "[unused77]",
885
+ "lstrip": false,
886
+ "normalized": true,
887
+ "rstrip": false,
888
+ "single_word": false,
889
+ "special": false
890
+ },
891
+ "50363": {
892
+ "content": "[unused78]",
893
+ "lstrip": false,
894
+ "normalized": true,
895
+ "rstrip": false,
896
+ "single_word": false,
897
+ "special": false
898
+ },
899
+ "50364": {
900
+ "content": "[unused79]",
901
+ "lstrip": false,
902
+ "normalized": true,
903
+ "rstrip": false,
904
+ "single_word": false,
905
+ "special": false
906
+ },
907
+ "50365": {
908
+ "content": "[unused80]",
909
+ "lstrip": false,
910
+ "normalized": true,
911
+ "rstrip": false,
912
+ "single_word": false,
913
+ "special": false
914
+ },
915
+ "50366": {
916
+ "content": "[unused81]",
917
+ "lstrip": false,
918
+ "normalized": true,
919
+ "rstrip": false,
920
+ "single_word": false,
921
+ "special": false
922
+ },
923
+ "50367": {
924
+ "content": "[unused82]",
925
+ "lstrip": false,
926
+ "normalized": true,
927
+ "rstrip": false,
928
+ "single_word": false,
929
+ "special": false
930
+ }
931
+ },
932
+ "clean_up_tokenization_spaces": true,
933
+ "cls_token": "[CLS]",
934
+ "extra_special_tokens": {},
935
+ "mask_token": "[MASK]",
936
+ "model_input_names": [
937
+ "input_ids",
938
+ "attention_mask"
939
+ ],
940
+ "model_max_length": 8192,
941
+ "pad_token": "[PAD]",
942
+ "sep_token": "[SEP]",
943
+ "tokenizer_class": "PreTrainedTokenizerFast",
944
+ "unk_token": "[UNK]"
945
+ }
checkpoint-144/trainer_state.json ADDED
@@ -0,0 +1,478 @@
1
+ {
2
+ "best_metric": 0.6569451636973174,
3
+ "best_model_checkpoint": "./output/modernbert_quickb\\checkpoint-98",
4
+ "epoch": 2.950904392764858,
5
+ "eval_steps": 500,
6
+ "global_step": 144,
7
+ "is_hyper_param_search": false,
8
+ "is_local_process_zero": true,
9
+ "is_world_process_zero": true,
10
+ "log_history": [
11
+ {
12
+ "epoch": 0.10335917312661498,
13
+ "grad_norm": 319.4144592285156,
14
+ "learning_rate": 6.666666666666667e-06,
15
+ "loss": 29.8712,
16
+ "step": 5
17
+ },
18
+ {
19
+ "epoch": 0.20671834625322996,
20
+ "grad_norm": 169.99363708496094,
21
+ "learning_rate": 1.3333333333333333e-05,
22
+ "loss": 26.1323,
23
+ "step": 10
24
+ },
25
+ {
26
+ "epoch": 0.31007751937984496,
27
+ "grad_norm": 103.89845275878906,
28
+ "learning_rate": 2e-05,
29
+ "loss": 17.8585,
30
+ "step": 15
31
+ },
32
+ {
33
+ "epoch": 0.4134366925064599,
34
+ "grad_norm": 86.83181762695312,
35
+ "learning_rate": 1.9925955354920265e-05,
36
+ "loss": 14.0232,
37
+ "step": 20
38
+ },
39
+ {
40
+ "epoch": 0.5167958656330749,
41
+ "grad_norm": 102.58989715576172,
42
+ "learning_rate": 1.9704917941574053e-05,
43
+ "loss": 11.6897,
44
+ "step": 25
45
+ },
46
+ {
47
+ "epoch": 0.6201550387596899,
48
+ "grad_norm": 80.32222747802734,
49
+ "learning_rate": 1.9340161087325483e-05,
50
+ "loss": 10.8431,
51
+ "step": 30
52
+ },
53
+ {
54
+ "epoch": 0.7235142118863049,
55
+ "grad_norm": 155.523681640625,
56
+ "learning_rate": 1.8837086450537195e-05,
57
+ "loss": 9.264,
58
+ "step": 35
59
+ },
60
+ {
61
+ "epoch": 0.8268733850129198,
62
+ "grad_norm": 110.1160888671875,
63
+ "learning_rate": 1.820314402779511e-05,
64
+ "loss": 11.2186,
65
+ "step": 40
66
+ },
67
+ {
68
+ "epoch": 0.9302325581395349,
69
+ "grad_norm": 105.31327819824219,
70
+ "learning_rate": 1.7447721827437822e-05,
71
+ "loss": 9.9143,
72
+ "step": 45
73
+ },
74
+ {
75
+ "epoch": 1.0,
76
+ "eval_dim_128_cosine_accuracy@1": 0.47674418604651164,
77
+ "eval_dim_128_cosine_accuracy@10": 0.7979651162790697,
78
+ "eval_dim_128_cosine_accuracy@3": 0.6584302325581395,
79
+ "eval_dim_128_cosine_accuracy@5": 0.7151162790697675,
80
+ "eval_dim_128_cosine_map@100": 0.5882659880635713,
81
+ "eval_dim_128_cosine_mrr@10": 0.5821745801033589,
82
+ "eval_dim_128_cosine_ndcg@10": 0.6341456264396685,
83
+ "eval_dim_128_cosine_precision@1": 0.47674418604651164,
84
+ "eval_dim_128_cosine_precision@10": 0.07979651162790696,
85
+ "eval_dim_128_cosine_precision@3": 0.2194767441860465,
86
+ "eval_dim_128_cosine_precision@5": 0.14302325581395348,
87
+ "eval_dim_128_cosine_recall@1": 0.47674418604651164,
88
+ "eval_dim_128_cosine_recall@10": 0.7979651162790697,
89
+ "eval_dim_128_cosine_recall@3": 0.6584302325581395,
90
+ "eval_dim_128_cosine_recall@5": 0.7151162790697675,
91
+ "eval_dim_256_cosine_accuracy@1": 0.5392441860465116,
92
+ "eval_dim_256_cosine_accuracy@10": 0.8473837209302325,
93
+ "eval_dim_256_cosine_accuracy@3": 0.7136627906976745,
94
+ "eval_dim_256_cosine_accuracy@5": 0.7645348837209303,
95
+ "eval_dim_256_cosine_map@100": 0.6454694833433305,
96
+ "eval_dim_256_cosine_mrr@10": 0.640224137135474,
97
+ "eval_dim_256_cosine_ndcg@10": 0.6901657805517794,
98
+ "eval_dim_256_cosine_precision@1": 0.5392441860465116,
99
+ "eval_dim_256_cosine_precision@10": 0.08473837209302325,
100
+ "eval_dim_256_cosine_precision@3": 0.23788759689922478,
101
+ "eval_dim_256_cosine_precision@5": 0.15290697674418602,
102
+ "eval_dim_256_cosine_recall@1": 0.5392441860465116,
103
+ "eval_dim_256_cosine_recall@10": 0.8473837209302325,
104
+ "eval_dim_256_cosine_recall@3": 0.7136627906976745,
105
+ "eval_dim_256_cosine_recall@5": 0.7645348837209303,
106
+ "eval_dim_512_cosine_accuracy@1": 0.5450581395348837,
107
+ "eval_dim_512_cosine_accuracy@10": 0.8808139534883721,
108
+ "eval_dim_512_cosine_accuracy@3": 0.7412790697674418,
109
+ "eval_dim_512_cosine_accuracy@5": 0.7994186046511628,
110
+ "eval_dim_512_cosine_map@100": 0.6607688904855182,
111
+ "eval_dim_512_cosine_mrr@10": 0.6569952011812478,
112
+ "eval_dim_512_cosine_ndcg@10": 0.7109774036246727,
113
+ "eval_dim_512_cosine_precision@1": 0.5450581395348837,
114
+ "eval_dim_512_cosine_precision@10": 0.0880813953488372,
115
+ "eval_dim_512_cosine_precision@3": 0.24709302325581395,
116
+ "eval_dim_512_cosine_precision@5": 0.15988372093023254,
117
+ "eval_dim_512_cosine_recall@1": 0.5450581395348837,
118
+ "eval_dim_512_cosine_recall@10": 0.8808139534883721,
119
+ "eval_dim_512_cosine_recall@3": 0.7412790697674418,
120
+ "eval_dim_512_cosine_recall@5": 0.7994186046511628,
121
+ "eval_dim_64_cosine_accuracy@1": 0.37790697674418605,
122
+ "eval_dim_64_cosine_accuracy@10": 0.6976744186046512,
123
+ "eval_dim_64_cosine_accuracy@3": 0.5421511627906976,
124
+ "eval_dim_64_cosine_accuracy@5": 0.5915697674418605,
125
+ "eval_dim_64_cosine_map@100": 0.48337713890549217,
126
+ "eval_dim_64_cosine_mrr@10": 0.4751395810262089,
127
+ "eval_dim_64_cosine_ndcg@10": 0.5281507062500577,
128
+ "eval_dim_64_cosine_precision@1": 0.37790697674418605,
129
+ "eval_dim_64_cosine_precision@10": 0.06976744186046512,
130
+ "eval_dim_64_cosine_precision@3": 0.18071705426356585,
131
+ "eval_dim_64_cosine_precision@5": 0.11831395348837208,
132
+ "eval_dim_64_cosine_recall@1": 0.37790697674418605,
133
+ "eval_dim_64_cosine_recall@10": 0.6976744186046512,
134
+ "eval_dim_64_cosine_recall@3": 0.5421511627906976,
135
+ "eval_dim_64_cosine_recall@5": 0.5915697674418605,
136
+ "eval_dim_768_cosine_accuracy@1": 0.5508720930232558,
137
+ "eval_dim_768_cosine_accuracy@10": 0.876453488372093,
138
+ "eval_dim_768_cosine_accuracy@3": 0.7427325581395349,
139
+ "eval_dim_768_cosine_accuracy@5": 0.8037790697674418,
140
+ "eval_dim_768_cosine_map@100": 0.6655205749696947,
141
+ "eval_dim_768_cosine_mrr@10": 0.6611186092654114,
142
+ "eval_dim_768_cosine_ndcg@10": 0.7133959457013295,
143
+ "eval_dim_768_cosine_precision@1": 0.5508720930232558,
144
+ "eval_dim_768_cosine_precision@10": 0.0876453488372093,
145
+ "eval_dim_768_cosine_precision@3": 0.24757751937984498,
146
+ "eval_dim_768_cosine_precision@5": 0.16075581395348837,
147
+ "eval_dim_768_cosine_recall@1": 0.5508720930232558,
148
+ "eval_dim_768_cosine_recall@10": 0.876453488372093,
149
+ "eval_dim_768_cosine_recall@3": 0.7427325581395349,
150
+ "eval_dim_768_cosine_recall@5": 0.8037790697674418,
151
+ "eval_runtime": 502.5762,
152
+ "eval_samples_per_second": 0.0,
153
+ "eval_sequential_score": 0.5281507062500577,
154
+ "eval_steps_per_second": 0.0,
155
+ "step": 49
156
+ },
157
+ {
158
+ "epoch": 1.020671834625323,
159
+ "grad_norm": 110.23486328125,
160
+ "learning_rate": 1.658200684320748e-05,
161
+ "loss": 7.2581,
162
+ "step": 50
163
+ },
164
+ {
165
+ "epoch": 1.124031007751938,
166
+ "grad_norm": 66.97112274169922,
167
+ "learning_rate": 1.5618819386853607e-05,
168
+ "loss": 6.066,
169
+ "step": 55
170
+ },
171
+ {
172
+ "epoch": 1.227390180878553,
173
+ "grad_norm": 76.00322723388672,
174
+ "learning_rate": 1.4572423233046386e-05,
175
+ "loss": 6.3626,
176
+ "step": 60
177
+ },
178
+ {
179
+ "epoch": 1.330749354005168,
180
+ "grad_norm": 62.68793487548828,
181
+ "learning_rate": 1.3458314388150115e-05,
182
+ "loss": 6.8135,
183
+ "step": 65
184
+ },
185
+ {
186
+ "epoch": 1.4341085271317828,
187
+ "grad_norm": 69.59709930419922,
188
+ "learning_rate": 1.2292991610964902e-05,
189
+ "loss": 5.5556,
190
+ "step": 70
191
+ },
192
+ {
193
+ "epoch": 1.5374677002583979,
194
+ "grad_norm": 143.95458984375,
195
+ "learning_rate": 1.1093712083778748e-05,
196
+ "loss": 6.0144,
197
+ "step": 75
198
+ },
199
+ {
200
+ "epoch": 1.6408268733850129,
201
+ "grad_norm": 71.6358413696289,
202
+ "learning_rate": 9.878235851980027e-06,
203
+ "loss": 6.1965,
204
+ "step": 80
205
+ },
206
+ {
207
+ "epoch": 1.744186046511628,
208
+ "grad_norm": 77.96407318115234,
209
+ "learning_rate": 8.664562816806022e-06,
210
+ "loss": 5.596,
211
+ "step": 85
212
+ },
213
+ {
214
+ "epoch": 1.847545219638243,
215
+ "grad_norm": 86.09881591796875,
216
+ "learning_rate": 7.470666176083193e-06,
217
+ "loss": 6.631,
218
+ "step": 90
219
+ },
220
+ {
221
+ "epoch": 1.950904392764858,
222
+ "grad_norm": 80.9687271118164,
223
+ "learning_rate": 6.314226260416383e-06,
224
+ "loss": 6.3319,
225
+ "step": 95
226
+ },
227
+ {
228
+ "epoch": 2.0,
229
+ "eval_dim_128_cosine_accuracy@1": 0.498546511627907,
230
+ "eval_dim_128_cosine_accuracy@10": 0.8226744186046512,
231
+ "eval_dim_128_cosine_accuracy@3": 0.6700581395348837,
232
+ "eval_dim_128_cosine_accuracy@5": 0.7369186046511628,
233
+ "eval_dim_128_cosine_map@100": 0.6099407824679366,
234
+ "eval_dim_128_cosine_mrr@10": 0.6045300387596899,
235
+ "eval_dim_128_cosine_ndcg@10": 0.6569451636973174,
236
+ "eval_dim_128_cosine_precision@1": 0.498546511627907,
237
+ "eval_dim_128_cosine_precision@10": 0.08226744186046511,
238
+ "eval_dim_128_cosine_precision@3": 0.22335271317829455,
239
+ "eval_dim_128_cosine_precision@5": 0.14738372093023255,
240
+ "eval_dim_128_cosine_recall@1": 0.498546511627907,
241
+ "eval_dim_128_cosine_recall@10": 0.8226744186046512,
242
+ "eval_dim_128_cosine_recall@3": 0.6700581395348837,
243
+ "eval_dim_128_cosine_recall@5": 0.7369186046511628,
244
+ "eval_dim_256_cosine_accuracy@1": 0.5552325581395349,
245
+ "eval_dim_256_cosine_accuracy@10": 0.8604651162790697,
246
+ "eval_dim_256_cosine_accuracy@3": 0.7354651162790697,
247
+ "eval_dim_256_cosine_accuracy@5": 0.7950581395348837,
248
+ "eval_dim_256_cosine_map@100": 0.6631022921972395,
249
+ "eval_dim_256_cosine_mrr@10": 0.6584688768918417,
250
+ "eval_dim_256_cosine_ndcg@10": 0.7073914263638542,
251
+ "eval_dim_256_cosine_precision@1": 0.5552325581395349,
252
+ "eval_dim_256_cosine_precision@10": 0.08604651162790695,
253
+ "eval_dim_256_cosine_precision@3": 0.24515503875968994,
254
+ "eval_dim_256_cosine_precision@5": 0.15901162790697673,
255
+ "eval_dim_256_cosine_recall@1": 0.5552325581395349,
256
+ "eval_dim_256_cosine_recall@10": 0.8604651162790697,
257
+ "eval_dim_256_cosine_recall@3": 0.7354651162790697,
258
+ "eval_dim_256_cosine_recall@5": 0.7950581395348837,
259
+ "eval_dim_512_cosine_accuracy@1": 0.5770348837209303,
260
+ "eval_dim_512_cosine_accuracy@10": 0.8822674418604651,
261
+ "eval_dim_512_cosine_accuracy@3": 0.7616279069767442,
262
+ "eval_dim_512_cosine_accuracy@5": 0.8197674418604651,
263
+ "eval_dim_512_cosine_map@100": 0.685728403908298,
264
+ "eval_dim_512_cosine_mrr@10": 0.6816127953119231,
265
+ "eval_dim_512_cosine_ndcg@10": 0.7303694393079266,
266
+ "eval_dim_512_cosine_precision@1": 0.5770348837209303,
267
+ "eval_dim_512_cosine_precision@10": 0.0882267441860465,
268
+ "eval_dim_512_cosine_precision@3": 0.25387596899224807,
269
+ "eval_dim_512_cosine_precision@5": 0.163953488372093,
270
+ "eval_dim_512_cosine_recall@1": 0.5770348837209303,
271
+ "eval_dim_512_cosine_recall@10": 0.8822674418604651,
272
+ "eval_dim_512_cosine_recall@3": 0.7616279069767442,
273
+ "eval_dim_512_cosine_recall@5": 0.8197674418604651,
274
+ "eval_dim_64_cosine_accuracy@1": 0.39680232558139533,
275
+ "eval_dim_64_cosine_accuracy@10": 0.7209302325581395,
276
+ "eval_dim_64_cosine_accuracy@3": 0.5523255813953488,
277
+ "eval_dim_64_cosine_accuracy@5": 0.626453488372093,
278
+ "eval_dim_64_cosine_map@100": 0.5014802548028123,
279
+ "eval_dim_64_cosine_mrr@10": 0.4936317598744924,
280
+ "eval_dim_64_cosine_ndcg@10": 0.5476937013127858,
281
+ "eval_dim_64_cosine_precision@1": 0.39680232558139533,
282
+ "eval_dim_64_cosine_precision@10": 0.07209302325581395,
283
+ "eval_dim_64_cosine_precision@3": 0.18410852713178294,
284
+ "eval_dim_64_cosine_precision@5": 0.12529069767441858,
285
+ "eval_dim_64_cosine_recall@1": 0.39680232558139533,
286
+ "eval_dim_64_cosine_recall@10": 0.7209302325581395,
287
+ "eval_dim_64_cosine_recall@3": 0.5523255813953488,
288
+ "eval_dim_64_cosine_recall@5": 0.626453488372093,
289
+ "eval_dim_768_cosine_accuracy@1": 0.5784883720930233,
290
+ "eval_dim_768_cosine_accuracy@10": 0.8880813953488372,
291
+ "eval_dim_768_cosine_accuracy@3": 0.7601744186046512,
292
+ "eval_dim_768_cosine_accuracy@5": 0.8197674418604651,
293
+ "eval_dim_768_cosine_map@100": 0.6874162730700494,
294
+ "eval_dim_768_cosine_mrr@10": 0.6835271317829454,
295
+ "eval_dim_768_cosine_ndcg@10": 0.733110755438693,
296
+ "eval_dim_768_cosine_precision@1": 0.5784883720930233,
297
+ "eval_dim_768_cosine_precision@10": 0.08880813953488371,
298
+ "eval_dim_768_cosine_precision@3": 0.253391472868217,
299
+ "eval_dim_768_cosine_precision@5": 0.163953488372093,
300
+ "eval_dim_768_cosine_recall@1": 0.5784883720930233,
301
+ "eval_dim_768_cosine_recall@10": 0.8880813953488372,
302
+ "eval_dim_768_cosine_recall@3": 0.7601744186046512,
303
+ "eval_dim_768_cosine_recall@5": 0.8197674418604651,
304
+ "eval_runtime": 661.4637,
305
+ "eval_samples_per_second": 0.0,
306
+ "eval_sequential_score": 0.5476937013127858,
307
+ "eval_steps_per_second": 0.0,
308
+ "step": 98
309
+ },
310
+ {
311
+ "epoch": 2.041343669250646,
312
+ "grad_norm": 62.07976150512695,
313
+ "learning_rate": 5.212368706427913e-06,
314
+ "loss": 4.7382,
315
+ "step": 100
316
+ },
317
+ {
318
+ "epoch": 2.144702842377261,
319
+ "grad_norm": 60.8904914855957,
320
+ "learning_rate": 4.181410844420473e-06,
321
+ "loss": 4.1516,
322
+ "step": 105
323
+ },
324
+ {
325
+ "epoch": 2.248062015503876,
326
+ "grad_norm": 62.77790832519531,
327
+ "learning_rate": 3.236620056190972e-06,
328
+ "loss": 4.3517,
329
+ "step": 110
330
+ },
331
+ {
332
+ "epoch": 2.351421188630491,
333
+ "grad_norm": 56.17869567871094,
334
+ "learning_rate": 2.3919876814572197e-06,
335
+ "loss": 3.7044,
336
+ "step": 115
337
+ },
338
+ {
339
+ "epoch": 2.454780361757106,
340
+ "grad_norm": 68.86133575439453,
341
+ "learning_rate": 1.660021821101222e-06,
342
+ "loss": 4.1593,
343
+ "step": 120
344
+ },
345
+ {
346
+ "epoch": 2.558139534883721,
347
+ "grad_norm": 57.50954055786133,
348
+ "learning_rate": 1.051562105591082e-06,
349
+ "loss": 4.8081,
350
+ "step": 125
351
+ },
352
+ {
353
+ "epoch": 2.661498708010336,
354
+ "grad_norm": 65.29470825195312,
355
+ "learning_rate": 5.756191716628556e-07,
356
+ "loss": 3.908,
357
+ "step": 130
358
+ },
359
+ {
360
+ "epoch": 2.764857881136951,
361
+ "grad_norm": 69.67886352539062,
362
+ "learning_rate": 2.392412244407294e-07,
363
+ "loss": 3.7684,
364
+ "step": 135
365
+ },
366
+ {
367
+ "epoch": 2.8682170542635657,
368
+ "grad_norm": 47.42465591430664,
369
+ "learning_rate": 4.740966106764222e-08,
370
+ "loss": 3.8927,
371
+ "step": 140
372
+ },
373
+ {
374
+ "epoch": 2.950904392764858,
375
+ "eval_dim_128_cosine_accuracy@1": 0.49709302325581395,
376
+ "eval_dim_128_cosine_accuracy@10": 0.8241279069767442,
377
+ "eval_dim_128_cosine_accuracy@3": 0.6758720930232558,
378
+ "eval_dim_128_cosine_accuracy@5": 0.7354651162790697,
379
+ "eval_dim_128_cosine_map@100": 0.6090388181529673,
380
+ "eval_dim_128_cosine_mrr@10": 0.6037779162052417,
381
+ "eval_dim_128_cosine_ndcg@10": 0.6567813216281579,
382
+ "eval_dim_128_cosine_precision@1": 0.49709302325581395,
383
+ "eval_dim_128_cosine_precision@10": 0.08241279069767442,
384
+ "eval_dim_128_cosine_precision@3": 0.22529069767441862,
385
+ "eval_dim_128_cosine_precision@5": 0.14709302325581394,
386
+ "eval_dim_128_cosine_recall@1": 0.49709302325581395,
387
+ "eval_dim_128_cosine_recall@10": 0.8241279069767442,
388
+ "eval_dim_128_cosine_recall@3": 0.6758720930232558,
389
+ "eval_dim_128_cosine_recall@5": 0.7354651162790697,
390
+ "eval_dim_256_cosine_accuracy@1": 0.5552325581395349,
391
+ "eval_dim_256_cosine_accuracy@10": 0.8619186046511628,
392
+ "eval_dim_256_cosine_accuracy@3": 0.7281976744186046,
393
+ "eval_dim_256_cosine_accuracy@5": 0.7921511627906976,
394
+ "eval_dim_256_cosine_map@100": 0.6630890497309057,
395
+ "eval_dim_256_cosine_mrr@10": 0.6585646225544481,
396
+ "eval_dim_256_cosine_ndcg@10": 0.7077790398550751,
397
+ "eval_dim_256_cosine_precision@1": 0.5552325581395349,
398
+ "eval_dim_256_cosine_precision@10": 0.08619186046511627,
399
+ "eval_dim_256_cosine_precision@3": 0.24273255813953487,
400
+ "eval_dim_256_cosine_precision@5": 0.15843023255813954,
401
+ "eval_dim_256_cosine_recall@1": 0.5552325581395349,
402
+ "eval_dim_256_cosine_recall@10": 0.8619186046511628,
403
+ "eval_dim_256_cosine_recall@3": 0.7281976744186046,
404
+ "eval_dim_256_cosine_recall@5": 0.7921511627906976,
405
+ "eval_dim_512_cosine_accuracy@1": 0.5741279069767442,
406
+ "eval_dim_512_cosine_accuracy@10": 0.875,
407
+ "eval_dim_512_cosine_accuracy@3": 0.7630813953488372,
408
+ "eval_dim_512_cosine_accuracy@5": 0.8212209302325582,
409
+ "eval_dim_512_cosine_map@100": 0.6827936993080407,
410
+ "eval_dim_512_cosine_mrr@10": 0.6782132475083055,
411
+ "eval_dim_512_cosine_ndcg@10": 0.726227401269234,
412
+ "eval_dim_512_cosine_precision@1": 0.5741279069767442,
413
+ "eval_dim_512_cosine_precision@10": 0.0875,
414
+ "eval_dim_512_cosine_precision@3": 0.2543604651162791,
415
+ "eval_dim_512_cosine_precision@5": 0.16424418604651161,
416
+ "eval_dim_512_cosine_recall@1": 0.5741279069767442,
417
+ "eval_dim_512_cosine_recall@10": 0.875,
418
+ "eval_dim_512_cosine_recall@3": 0.7630813953488372,
419
+ "eval_dim_512_cosine_recall@5": 0.8212209302325582,
420
+ "eval_dim_64_cosine_accuracy@1": 0.39680232558139533,
421
+ "eval_dim_64_cosine_accuracy@10": 0.7252906976744186,
422
+ "eval_dim_64_cosine_accuracy@3": 0.5581395348837209,
423
+ "eval_dim_64_cosine_accuracy@5": 0.622093023255814,
424
+ "eval_dim_64_cosine_map@100": 0.5050183064129367,
425
+ "eval_dim_64_cosine_mrr@10": 0.497020348837209,
426
+ "eval_dim_64_cosine_ndcg@10": 0.5513541983050395,
427
+ "eval_dim_64_cosine_precision@1": 0.39680232558139533,
428
+ "eval_dim_64_cosine_precision@10": 0.07252906976744186,
429
+ "eval_dim_64_cosine_precision@3": 0.18604651162790695,
430
+ "eval_dim_64_cosine_precision@5": 0.12441860465116278,
431
+ "eval_dim_64_cosine_recall@1": 0.39680232558139533,
432
+ "eval_dim_64_cosine_recall@10": 0.7252906976744186,
433
+ "eval_dim_64_cosine_recall@3": 0.5581395348837209,
434
+ "eval_dim_64_cosine_recall@5": 0.622093023255814,
435
+ "eval_dim_768_cosine_accuracy@1": 0.5741279069767442,
436
+ "eval_dim_768_cosine_accuracy@10": 0.8851744186046512,
437
+ "eval_dim_768_cosine_accuracy@3": 0.7616279069767442,
438
+ "eval_dim_768_cosine_accuracy@5": 0.8197674418604651,
439
+ "eval_dim_768_cosine_map@100": 0.6852483059452662,
440
+ "eval_dim_768_cosine_mrr@10": 0.6812459625322997,
441
+ "eval_dim_768_cosine_ndcg@10": 0.7308126785084815,
442
+ "eval_dim_768_cosine_precision@1": 0.5741279069767442,
443
+ "eval_dim_768_cosine_precision@10": 0.0885174418604651,
444
+ "eval_dim_768_cosine_precision@3": 0.25387596899224807,
445
+ "eval_dim_768_cosine_precision@5": 0.163953488372093,
446
+ "eval_dim_768_cosine_recall@1": 0.5741279069767442,
447
+ "eval_dim_768_cosine_recall@10": 0.8851744186046512,
448
+ "eval_dim_768_cosine_recall@3": 0.7616279069767442,
449
+ "eval_dim_768_cosine_recall@5": 0.8197674418604651,
450
+ "eval_runtime": 661.6139,
451
+ "eval_samples_per_second": 0.0,
452
+ "eval_sequential_score": 0.5513541983050395,
453
+ "eval_steps_per_second": 0.0,
454
+ "step": 144
455
+ }
456
+ ],
457
+ "logging_steps": 5,
458
+ "max_steps": 144,
459
+ "num_input_tokens_seen": 0,
460
+ "num_train_epochs": 3,
461
+ "save_steps": 500,
462
+ "stateful_callbacks": {
463
+ "TrainerControl": {
464
+ "args": {
465
+ "should_epoch_stop": false,
466
+ "should_evaluate": false,
467
+ "should_log": false,
468
+ "should_save": true,
469
+ "should_training_stop": true
470
+ },
471
+ "attributes": {}
472
+ }
473
+ },
474
+ "total_flos": 0.0,
475
+ "train_batch_size": 16,
476
+ "trial_name": null,
477
+ "trial_params": null
478
+ }
checkpoint-144/training_args.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:798de09171a4a7f8ef78d9702b3e30b782fa362715033f3af56b90923c847a70
3
+ size 5624
checkpoint-98/1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "word_embedding_dimension": 768,
3
+ "pooling_mode_cls_token": false,
4
+ "pooling_mode_mean_tokens": true,
5
+ "pooling_mode_max_tokens": false,
6
+ "pooling_mode_mean_sqrt_len_tokens": false,
7
+ "pooling_mode_weightedmean_tokens": false,
8
+ "pooling_mode_lasttoken": false,
9
+ "include_prompt": true
10
+ }
checkpoint-98/README.md ADDED
@@ -0,0 +1,763 @@
1
+ ---
2
+ language:
3
+ - en
4
+ license: apache-2.0
5
+ tags:
6
+ - sentence-transformers
7
+ - sentence-similarity
8
+ - feature-extraction
9
+ - generated_from_trainer
10
+ - dataset_size:6190
11
+ - loss:MatryoshkaLoss
12
+ - loss:MultipleNegativesRankingLoss
13
+ base_model: nomic-ai/modernbert-embed-base
14
+ widget:
15
+ - source_sentence: What is the duration of the period mentioned in the text?
16
+ sentences:
17
+ - . The only exception to the requirement that the plaintiff must be a lending institution
18
+ in order to invoke the provisions of the Act is contained in Section 25, in terms
19
+ of which a person who inter alia knowingly draws a cheque which is subsequently
20
+ dishonoured by the bank for want of funds is guilty of an offence under the Act,
21
+ and proceedings can be instituted against such person in the Magistrate’s
22
+ - '? The 1st question of law is formulated on the basis that , the 1st Defendant
23
+ is the licensee of the 2nd Defendant and therefore, the 1st Defendant cannot claim
24
+ prescriptive title to the subject matter'
25
+ - .50,000/ - (that is , a period of 36 months) but such “Facility” is subject to
26
+ review on 30 /09/2000”, (that is, a period of about only 5 months from the date
27
+ of P4)
28
+ - source_sentence: What is the purpose of the disposition of the property by Lanka
29
+ Tractors Limited as mentioned in the text?
30
+ sentences:
31
+ - . (3) is whether the said disposition of the property by Lanka Tractors Limited
32
+ was done with the sole object of defrauding its creditors. Section 348 of the
33
+ Companies Act which describes about Fraudulent reference would be relevant in
34
+ this regard
35
+ - . In the arbitration process, the Government is not involved; the court system
36
+ is not involved (except as provided for in the Act); the parties do not have to
37
+ rely on any Government institution for resolution of their dispute. Process of
38
+ conducting the arbitration, venue, time, mode of adducing evidence are all decided
39
+ by agreement of parties
40
+ - . This is broadly similar to the provision in the summary procedure on liquid
41
+ claims. The amendment in clause 8 of the Bill, repeals the definition of the term
42
+ ‘debt’ in section 30. The substituted definition excludes the words referred to
43
+ above which limit its applicability to money owed under a promise or agreement
44
+ which is in writing
45
+ - source_sentence: What is one of the topics covered in the training program?
46
+ sentences:
47
+ - . The resulting position is that the court would not have any written evidence
48
+ of the commitment on the part of the debtor when it issues decree nisi in the
49
+ first instance
50
+ - '? Before this Court, there is no dispute on the manner in which the appellant
51
+ obtained the title of the land in question'
52
+ - . Detail reporting procedures to government of Sri Lanka’s contact points. - 4
53
+ Weeks Phase 3 Training of Port Facility Security Officers SATHSINDU/BAGNOLD undertakes
54
+ to design a training program and conducted aid program for up to ten persons.
55
+ • Understanding the reasons for the ISPS code • ISPS Code content and requirements.
56
+ • Understanding the ISPS Code
57
+ - source_sentence: What type of action was taken by the Divisional Secretary?
58
+ sentences:
59
+ - .2020 was also sent by the Divisional Secretary of Thamankaduwa imposing similar
60
+ restrictions as by the Polonnaruwa Pradeshiya Sabha
61
+ - . When Seylan Bank published the resolution of its board of directors which exercised
62
+ its powers of Parate Execution in the newspaper on 10th March 2006-, HNB had made
63
+ the application dated 21st March [SC Appeal No. 85A /2009 ] Page 6 of 25 2006
64
+ to the District Court of Colombo in terms of Sections 260, 261, 348, 359 and 352
65
+ of the Companies Act No
66
+ - . Having regard to the above -mentioned stipulated circumstances , I consider
67
+ the facts put forward for the appellant , seeking a reduction of sentence. The
68
+ offence was committed in 2004. The appellant had been in remand custody for more
69
+ than three years and the appellant did not have any previous convictions
70
+ - source_sentence: What is described in Section 25 of the Arbitration Act?
71
+ sentences:
72
+ - . But where a matter is within the plenary jurisdiction of the Court if no objection
73
+ is taken, the Court will then have jurisdiction to proceed on with the matter
74
+ and make a valid order.” 14 31. Further , in the case of Don Tilakaratne v
75
+ - '. (3) The provision of subsections (1) and (2) shall apply only to the extent
76
+ agreed to by the parties. (4) The arbitral tribunal shall decide according to
77
+ considerations of general justice and fairness or trade usages only if the parties
78
+ have expressly authorised it to do so. Section 25 of the Arbitration Act describes
79
+ the form and content of the arbitral award as follows: 25'
80
+ - '. 9 and 10 based on the objection taken to them by the Counsel for HNB, despite
81
+ the fact that they did not arise from the pleadings, and were altogether inconsistent
82
+ with them, answered the afore-stated question of law (in respect of which this
83
+ Court had granted Leave to Appeal in that case) in the affirmative and in favour
84
+ of HNB, and stated as follows: “In conclusion, it needs to be emphasised'
85
+ pipeline_tag: sentence-similarity
86
+ library_name: sentence-transformers
87
+ metrics:
88
+ - cosine_accuracy@1
89
+ - cosine_accuracy@3
90
+ - cosine_accuracy@5
91
+ - cosine_accuracy@10
92
+ - cosine_precision@1
93
+ - cosine_precision@3
94
+ - cosine_precision@5
95
+ - cosine_precision@10
96
+ - cosine_recall@1
97
+ - cosine_recall@3
98
+ - cosine_recall@5
99
+ - cosine_recall@10
100
+ - cosine_ndcg@10
101
+ - cosine_mrr@10
102
+ - cosine_map@100
103
+ model-index:
104
+ - name: Fine-tuned with [QuicKB](https://github.com/ALucek/QuicKB)
105
+ results:
106
+ - task:
107
+ type: information-retrieval
108
+ name: Information Retrieval
109
+ dataset:
110
+ name: dim 768
111
+ type: dim_768
112
+ metrics:
113
+ - type: cosine_accuracy@1
114
+ value: 0.5784883720930233
115
+ name: Cosine Accuracy@1
116
+ - type: cosine_accuracy@3
117
+ value: 0.7601744186046512
118
+ name: Cosine Accuracy@3
119
+ - type: cosine_accuracy@5
120
+ value: 0.8197674418604651
121
+ name: Cosine Accuracy@5
122
+ - type: cosine_accuracy@10
123
+ value: 0.8880813953488372
124
+ name: Cosine Accuracy@10
125
+ - type: cosine_precision@1
126
+ value: 0.5784883720930233
127
+ name: Cosine Precision@1
128
+ - type: cosine_precision@3
129
+ value: 0.253391472868217
130
+ name: Cosine Precision@3
131
+ - type: cosine_precision@5
132
+ value: 0.163953488372093
133
+ name: Cosine Precision@5
134
+ - type: cosine_precision@10
135
+ value: 0.08880813953488371
136
+ name: Cosine Precision@10
137
+ - type: cosine_recall@1
138
+ value: 0.5784883720930233
139
+ name: Cosine Recall@1
140
+ - type: cosine_recall@3
141
+ value: 0.7601744186046512
142
+ name: Cosine Recall@3
143
+ - type: cosine_recall@5
144
+ value: 0.8197674418604651
145
+ name: Cosine Recall@5
146
+ - type: cosine_recall@10
147
+ value: 0.8880813953488372
148
+ name: Cosine Recall@10
149
+ - type: cosine_ndcg@10
150
+ value: 0.733110755438693
151
+ name: Cosine Ndcg@10
152
+ - type: cosine_mrr@10
153
+ value: 0.6835271317829454
154
+ name: Cosine Mrr@10
155
+ - type: cosine_map@100
156
+ value: 0.6874162730700494
157
+ name: Cosine Map@100
158
+ - task:
159
+ type: information-retrieval
160
+ name: Information Retrieval
161
+ dataset:
162
+ name: dim 512
163
+ type: dim_512
164
+ metrics:
165
+ - type: cosine_accuracy@1
166
+ value: 0.5770348837209303
167
+ name: Cosine Accuracy@1
168
+ - type: cosine_accuracy@3
169
+ value: 0.7616279069767442
170
+ name: Cosine Accuracy@3
171
+ - type: cosine_accuracy@5
172
+ value: 0.8197674418604651
173
+ name: Cosine Accuracy@5
174
+ - type: cosine_accuracy@10
175
+ value: 0.8822674418604651
176
+ name: Cosine Accuracy@10
177
+ - type: cosine_precision@1
178
+ value: 0.5770348837209303
179
+ name: Cosine Precision@1
180
+ - type: cosine_precision@3
181
+ value: 0.25387596899224807
182
+ name: Cosine Precision@3
183
+ - type: cosine_precision@5
184
+ value: 0.163953488372093
185
+ name: Cosine Precision@5
186
+ - type: cosine_precision@10
187
+ value: 0.0882267441860465
188
+ name: Cosine Precision@10
189
+ - type: cosine_recall@1
190
+ value: 0.5770348837209303
191
+ name: Cosine Recall@1
192
+ - type: cosine_recall@3
193
+ value: 0.7616279069767442
194
+ name: Cosine Recall@3
195
+ - type: cosine_recall@5
196
+ value: 0.8197674418604651
197
+ name: Cosine Recall@5
198
+ - type: cosine_recall@10
199
+ value: 0.8822674418604651
200
+ name: Cosine Recall@10
201
+ - type: cosine_ndcg@10
202
+ value: 0.7303694393079266
203
+ name: Cosine Ndcg@10
204
+ - type: cosine_mrr@10
205
+ value: 0.6816127953119231
206
+ name: Cosine Mrr@10
207
+ - type: cosine_map@100
208
+ value: 0.685728403908298
209
+ name: Cosine Map@100
210
+ - task:
211
+ type: information-retrieval
212
+ name: Information Retrieval
213
+ dataset:
214
+ name: dim 256
215
+ type: dim_256
216
+ metrics:
217
+ - type: cosine_accuracy@1
218
+ value: 0.5552325581395349
219
+ name: Cosine Accuracy@1
220
+ - type: cosine_accuracy@3
221
+ value: 0.7354651162790697
222
+ name: Cosine Accuracy@3
223
+ - type: cosine_accuracy@5
224
+ value: 0.7950581395348837
225
+ name: Cosine Accuracy@5
226
+ - type: cosine_accuracy@10
227
+ value: 0.8604651162790697
228
+ name: Cosine Accuracy@10
229
+ - type: cosine_precision@1
230
+ value: 0.5552325581395349
231
+ name: Cosine Precision@1
232
+ - type: cosine_precision@3
233
+ value: 0.24515503875968994
234
+ name: Cosine Precision@3
235
+ - type: cosine_precision@5
236
+ value: 0.15901162790697673
237
+ name: Cosine Precision@5
238
+ - type: cosine_precision@10
239
+ value: 0.08604651162790695
240
+ name: Cosine Precision@10
241
+ - type: cosine_recall@1
242
+ value: 0.5552325581395349
243
+ name: Cosine Recall@1
244
+ - type: cosine_recall@3
245
+ value: 0.7354651162790697
246
+ name: Cosine Recall@3
247
+ - type: cosine_recall@5
248
+ value: 0.7950581395348837
249
+ name: Cosine Recall@5
250
+ - type: cosine_recall@10
251
+ value: 0.8604651162790697
252
+ name: Cosine Recall@10
253
+ - type: cosine_ndcg@10
254
+ value: 0.7073914263638542
255
+ name: Cosine Ndcg@10
256
+ - type: cosine_mrr@10
257
+ value: 0.6584688768918417
258
+ name: Cosine Mrr@10
259
+ - type: cosine_map@100
260
+ value: 0.6631022921972395
261
+ name: Cosine Map@100
262
+ - task:
263
+ type: information-retrieval
264
+ name: Information Retrieval
265
+ dataset:
266
+ name: dim 128
267
+ type: dim_128
268
+ metrics:
269
+ - type: cosine_accuracy@1
270
+ value: 0.498546511627907
271
+ name: Cosine Accuracy@1
272
+ - type: cosine_accuracy@3
273
+ value: 0.6700581395348837
274
+ name: Cosine Accuracy@3
275
+ - type: cosine_accuracy@5
276
+ value: 0.7369186046511628
277
+ name: Cosine Accuracy@5
278
+ - type: cosine_accuracy@10
279
+ value: 0.8226744186046512
280
+ name: Cosine Accuracy@10
281
+ - type: cosine_precision@1
282
+ value: 0.498546511627907
283
+ name: Cosine Precision@1
284
+ - type: cosine_precision@3
285
+ value: 0.22335271317829455
286
+ name: Cosine Precision@3
287
+ - type: cosine_precision@5
288
+ value: 0.14738372093023255
289
+ name: Cosine Precision@5
290
+ - type: cosine_precision@10
291
+ value: 0.08226744186046511
292
+ name: Cosine Precision@10
293
+ - type: cosine_recall@1
294
+ value: 0.498546511627907
295
+ name: Cosine Recall@1
296
+ - type: cosine_recall@3
297
+ value: 0.6700581395348837
298
+ name: Cosine Recall@3
299
+ - type: cosine_recall@5
300
+ value: 0.7369186046511628
301
+ name: Cosine Recall@5
302
+ - type: cosine_recall@10
303
+ value: 0.8226744186046512
304
+ name: Cosine Recall@10
305
+ - type: cosine_ndcg@10
306
+ value: 0.6569451636973174
307
+ name: Cosine Ndcg@10
308
+ - type: cosine_mrr@10
309
+ value: 0.6045300387596899
310
+ name: Cosine Mrr@10
311
+ - type: cosine_map@100
312
+ value: 0.6099407824679366
313
+ name: Cosine Map@100
314
+ - task:
315
+ type: information-retrieval
316
+ name: Information Retrieval
317
+ dataset:
318
+ name: dim 64
319
+ type: dim_64
320
+ metrics:
321
+ - type: cosine_accuracy@1
322
+ value: 0.39680232558139533
323
+ name: Cosine Accuracy@1
324
+ - type: cosine_accuracy@3
325
+ value: 0.5523255813953488
326
+ name: Cosine Accuracy@3
327
+ - type: cosine_accuracy@5
328
+ value: 0.626453488372093
329
+ name: Cosine Accuracy@5
330
+ - type: cosine_accuracy@10
331
+ value: 0.7209302325581395
332
+ name: Cosine Accuracy@10
333
+ - type: cosine_precision@1
334
+ value: 0.39680232558139533
335
+ name: Cosine Precision@1
336
+ - type: cosine_precision@3
337
+ value: 0.18410852713178294
338
+ name: Cosine Precision@3
339
+ - type: cosine_precision@5
340
+ value: 0.12529069767441858
341
+ name: Cosine Precision@5
342
+ - type: cosine_precision@10
343
+ value: 0.07209302325581395
344
+ name: Cosine Precision@10
345
+ - type: cosine_recall@1
346
+ value: 0.39680232558139533
347
+ name: Cosine Recall@1
348
+ - type: cosine_recall@3
349
+ value: 0.5523255813953488
350
+ name: Cosine Recall@3
351
+ - type: cosine_recall@5
352
+ value: 0.626453488372093
353
+ name: Cosine Recall@5
354
+ - type: cosine_recall@10
355
+ value: 0.7209302325581395
356
+ name: Cosine Recall@10
357
+ - type: cosine_ndcg@10
358
+ value: 0.5476937013127858
359
+ name: Cosine Ndcg@10
360
+ - type: cosine_mrr@10
361
+ value: 0.4936317598744924
362
+ name: Cosine Mrr@10
363
+ - type: cosine_map@100
364
+ value: 0.5014802548028123
365
+ name: Cosine Map@100
366
+ ---
367
+
368
+ # Fine-tuned with [QuicKB](https://github.com/ALucek/QuicKB)
369
+
370
+ This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
371
+
372
+ ## Model Details
373
+
374
+ ### Model Description
375
+ - **Model Type:** Sentence Transformer
376
+ - **Base model:** [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) <!-- at revision d556a88e332558790b210f7bdbe87da2fa94a8d8 -->
377
+ - **Maximum Sequence Length:** 512 tokens
378
+ - **Output Dimensionality:** 768 dimensions
379
+ - **Similarity Function:** Cosine Similarity
380
+ <!-- - **Training Dataset:** Unknown -->
381
+ - **Language:** en
382
+ - **License:** apache-2.0
383
+
384
+ ### Model Sources
385
+
386
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
387
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
388
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
389
+
390
+ ### Full Model Architecture
391
+
392
+ ```
393
+ SentenceTransformer(
394
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: ModernBertModel
395
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
396
+ (2): Normalize()
397
+ )
398
+ ```
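The three modules listed above (Transformer, mean Pooling, Normalize) can be illustrated with a small pure-Python sketch. It is only a toy version of the pooling and normalization steps, assuming the transformer has already produced one vector per token; the helper names `mean_pool` and `l2_normalize` are hypothetical and are not part of the library.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1, so padding is ignored."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vec):
                total[i] += x
    return [x / max(count, 1) for x in total]

def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

# Toy input: three token vectors of width 4 (the real model uses width 768).
# The last token is padding and must not influence the result.
tokens = [[1.0, 0.0, 2.0, 0.0], [3.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

sentence_vec = l2_normalize(mean_pool(tokens, mask))
print(sentence_vec)
```

Because the final module normalizes to unit length, the cosine similarity between two outputs reduces to a plain dot product.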
399
+
400
+ ## Usage
401
+
402
+ ### Direct Usage (Sentence Transformers)
403
+
404
+ First install the Sentence Transformers library:
405
+
406
+ ```bash
407
+ pip install -U sentence-transformers
408
+ ```
409
+
410
+ Then you can load this model and run inference.
411
+ ```python
412
+ from sentence_transformers import SentenceTransformer
413
+
414
+ # Download from the 🤗 Hub
415
+ model = SentenceTransformer("sentence_transformers_model_id")  # placeholder: replace with this model's Hub repo id
416
+ # Run inference
417
+ sentences = [
418
+ 'What is described in Section 25 of the Arbitration Act?',
419
+ '. (3) The provision of subsections (1) and (2) shall apply only to the extent agreed to by the parties. (4) The arbitral tribunal shall decide according to considerations of general justice and fairness or trade usages only if the parties have expressly authorised it to do so. Section 25 of the Arbitration Act describes the form and content of the arbitral award as follows: 25',
420
+ '. 9 and 10 based on the objection taken to them by the Counsel for HNB, despite the fact that they did not arise from the pleadings, and were altogether inconsistent with them, answered the afore-stated question of law (in respect of which this Court had granted Leave to Appeal in that case) in the affirmative and in favour of HNB, and stated as follows: “In conclusion, it needs to be emphasised',
421
+ ]
422
+ embeddings = model.encode(sentences)
423
+ print(embeddings.shape)
424
+ # (3, 768)
425
+
426
+ # Get the similarity scores for the embeddings
427
+ similarities = model.similarity(embeddings, embeddings)
428
+ print(similarities.shape)
429
+ # torch.Size([3, 3])
430
+ ```
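Since the model was trained with a Matryoshka objective and evaluated at 768, 512, 256, 128 and 64 dimensions, embeddings can in principle be shortened by keeping only their leading components. The sketch below shows that idea on toy 4-dimensional vectors in pure Python: truncate, re-normalize, then compare. The function names are hypothetical; in practice the `truncate_dim` argument of `SentenceTransformer` is the usual way to request shorter embeddings.

```python
import math

def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))

# Toy unit-length "embeddings"; real ones from this model have 768 components.
full_a = [0.6, 0.0, 0.8, 0.0]
full_b = [0.6, 0.8, 0.0, 0.0]

for dim in (4, 2):
    a = truncate_and_normalize(full_a, dim)
    b = truncate_and_normalize(full_b, dim)
    print(dim, round(cosine(a, b), 4))
```

The similarity changes with the cut-off, which is the trade-off the evaluation table quantifies: smaller vectors are cheaper to store and search, at some cost in retrieval quality.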
431
+
432
+ <!--
433
+ ### Direct Usage (Transformers)
434
+
435
+ <details><summary>Click to see the direct usage in Transformers</summary>
436
+
437
+ </details>
438
+ -->
439
+
440
+ <!--
441
+ ### Downstream Usage (Sentence Transformers)
442
+
443
+ You can finetune this model on your own dataset.
444
+
445
+ <details><summary>Click to expand</summary>
446
+
447
+ </details>
448
+ -->
449
+
450
+ <!--
451
+ ### Out-of-Scope Use
452
+
453
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
454
+ -->
455
+
456
+ ## Evaluation
457
+
458
+ ### Metrics
459
+
460
+ #### Information Retrieval
461
+
462
+ * Datasets: `dim_768`, `dim_512`, `dim_256`, `dim_128` and `dim_64`
463
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
464
+
465
+ | Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
466
+ |:--------------------|:-----------|:-----------|:-----------|:-----------|:-----------|
467
+ | cosine_accuracy@1 | 0.5785 | 0.5770 | 0.5552 | 0.4985 | 0.3968 |
468
+ | cosine_accuracy@3 | 0.7602 | 0.7616 | 0.7355 | 0.6701 | 0.5523 |
469
+ | cosine_accuracy@5 | 0.8198 | 0.8198 | 0.7951 | 0.7369 | 0.6265 |
470
+ | cosine_accuracy@10 | 0.8881 | 0.8823 | 0.8605 | 0.8227 | 0.7209 |
471
+ | cosine_precision@1 | 0.5785 | 0.5770 | 0.5552 | 0.4985 | 0.3968 |
472
+ | cosine_precision@3 | 0.2534 | 0.2539 | 0.2452 | 0.2234 | 0.1841 |
473
+ | cosine_precision@5 | 0.1640 | 0.1640 | 0.1590 | 0.1474 | 0.1253 |
474
+ | cosine_precision@10 | 0.0888 | 0.0882 | 0.0860 | 0.0823 | 0.0721 |
475
+ | cosine_recall@1 | 0.5785 | 0.5770 | 0.5552 | 0.4985 | 0.3968 |
476
+ | cosine_recall@3 | 0.7602 | 0.7616 | 0.7355 | 0.6701 | 0.5523 |
477
+ | cosine_recall@5 | 0.8198 | 0.8198 | 0.7951 | 0.7369 | 0.6265 |
478
+ | cosine_recall@10 | 0.8881 | 0.8823 | 0.8605 | 0.8227 | 0.7209 |
479
+ | **cosine_ndcg@10** | **0.7331** | **0.7304** | **0.7074** | **0.6569** | **0.5477** |
480
+ | cosine_mrr@10 | 0.6835 | 0.6816 | 0.6585 | 0.6045 | 0.4936 |
481
+ | cosine_map@100 | 0.6874 | 0.6857 | 0.6631 | 0.6099 | 0.5015 |
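In the table above, accuracy@k and recall@k are identical, and precision@k equals recall@k divided by k (for example 0.7602 / 3 ≈ 0.2534 and 0.8881 / 10 ≈ 0.0888). That pattern is what one would expect if every query has exactly one relevant passage; the card does not state this, so treat it as an inference. Under that assumption, the metrics can be sketched as follows, with `metrics_at_k` being a hypothetical helper rather than the evaluator's actual code.

```python
def metrics_at_k(ranks, k):
    """Compute retrieval metrics when each query has exactly one relevant passage.

    ranks: 1-based rank of the relevant passage for each query,
           or None if it was not retrieved at all.
    """
    n = len(ranks)
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return {
        "accuracy": hits / n,         # fraction of queries with a hit in the top k
        "recall": hits / n,           # same value, since there is one relevant passage
        "precision": hits / (n * k),  # at most 1 of the k retrieved items is relevant
        "mrr": sum(1.0 / r for r in ranks if r is not None and r <= k) / n,
    }

# Toy example: four queries, relevant passage found at ranks 1, 2, never, and 4.
ranks = [1, 2, None, 4]
m = metrics_at_k(ranks, k=3)
print(m)
```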
482
+
483
+ <!--
484
+ ## Bias, Risks and Limitations
485
+
486
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
487
+ -->
488
+
489
+ <!--
490
+ ### Recommendations
491
+
492
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
493
+ -->
494
+
495
+ ## Training Details
496
+
497
+ ### Training Dataset
498
+
499
+ #### Unnamed Dataset
500
+
501
+ * Size: 6,190 training samples
502
+ * Columns: <code>anchor</code> and <code>positive</code>
503
+ * Approximate statistics based on the first 1000 samples:
504
+ | | anchor | positive |
505
+ |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
506
+ | type | string | string |
507
+ | details | <ul><li>min: 7 tokens</li><li>mean: 15.11 tokens</li><li>max: 32 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 69.53 tokens</li><li>max: 214 tokens</li></ul> |
508
+ * Samples:
509
+ | anchor | positive |
510
+ |:---------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
511
+ | <code>How must the District Court exercise its discretion?</code> | <code>imposition of ‘ a’ term; (5) It is not mandatory to impose security, as evinced by the use of the conjunction “or”; (6) In imposing terms, the District Court must be mindful of the objectives of the Act, and its discretion must be exercised judicially</code> |
512
+ | <code>What is the source of the observation made by Christian Appu?</code> | <code>. Christian Appu , (1895) 1 NLR 288 observed that , “possession is "disturbed" either by an action intended to remove the possessor from the land, or by acts which prevent the possessor from enjoying the free and full use of 12 the land of which he is in the course of acquiring the dominion, and which convert his continuous user into a disconnected and divided user ”</code> |
513
+ | <code>What must the defendant do regarding the plaintiff's claim?</code> | <code>. The Court of Appeal in Ramanayake v Sampath Bank Ltd and Others [(1993) 1 Sri LR 145 at page 153] has held that, “The defendant has to deal with the plaintiff’s claim on its merits; it is not competent for the defendant to merely set out technical objections. It is also incumbent on the defendant to reveal his defence, if he has any</code> |
514
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
515
+ ```json
516
+ {
517
+ "loss": "MultipleNegativesRankingLoss",
518
+ "matryoshka_dims": [
519
+ 768,
520
+ 512,
521
+ 256,
522
+ 128,
523
+ 64
524
+ ],
525
+ "matryoshka_weights": [
526
+ 1,
527
+ 1,
528
+ 1,
529
+ 1,
530
+ 1
531
+ ],
532
+ "n_dims_per_step": -1
533
+ }
534
+ ```
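The configuration above wraps `MultipleNegativesRankingLoss` in `MatryoshkaLoss` with five dimensions and equal weights. A rough pure-Python sketch of the idea: for each dimension, truncate and re-normalize the embeddings, compute an in-batch softmax cross-entropy in which positive `i` is the target for anchor `i`, and add up the weighted results. The similarity scale of 20 and all function names here are assumptions for illustration, not the library's code.

```python
import math

def normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch ranking loss: every other positive in the batch acts as a negative."""
    losses = []
    for i, a in enumerate(anchors):
        logits = [scale * sum(x * y for x, y in zip(a, p)) for p in positives]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
        losses.append(log_sum - logits[i])
    return sum(losses) / len(losses)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss over truncated, re-normalized embeddings."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        a = [normalize(v[:dim]) for v in anchors]
        p = [normalize(v[:dim]) for v in positives]
        total += weight * mnr_loss(a, p)
    return total

# Toy batch of two pairs with 4-dimensional embeddings, trained at dims 4 and 2.
anchors = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
aligned = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
swapped = [aligned[1], aligned[0]]

good = matryoshka_loss(anchors, aligned, dims=(4, 2), weights=(1, 1))
bad = matryoshka_loss(anchors, swapped, dims=(4, 2), weights=(1, 1))
print(good, bad)
```

Because every truncation contributes to the total, the leading components of the embedding are pushed to carry most of the ranking signal.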
535
+
536
+ ### Training Hyperparameters
537
+ #### Non-Default Hyperparameters
538
+
539
+ - `eval_strategy`: epoch
540
+ - `per_device_train_batch_size`: 16
541
+ - `gradient_accumulation_steps`: 8
542
+ - `learning_rate`: 2e-05
543
+ - `lr_scheduler_type`: cosine
544
+ - `warmup_ratio`: 0.1
545
+ - `tf32`: True
546
+ - `load_best_model_at_end`: True
547
+ - `optim`: adamw_torch_fused
548
+ - `batch_sampler`: no_duplicates
549
+
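The combination of `learning_rate: 2e-05`, `lr_scheduler_type: cosine` and `warmup_ratio: 0.1` describes a linear warmup followed by a cosine decay. The sketch below reproduces that general shape. The total of 144 optimizer steps is an assumption inferred from the `checkpoint-144` directory in this commit, and `lr_at` is a hypothetical helper rather than the trainer's own scheduler.

```python
import math

def lr_at(step, base_lr=2e-5, total_steps=144, warmup_ratio=0.1):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Print the rate at the start, the end of warmup, mid-training, and the final step.
for step in (0, 15, 80, 144):
    print(step, lr_at(step))
```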
550
+ #### All Hyperparameters
551
+ <details><summary>Click to expand</summary>
552
+
553
+ - `overwrite_output_dir`: False
554
+ - `do_predict`: False
555
+ - `eval_strategy`: epoch
556
+ - `prediction_loss_only`: True
557
+ - `per_device_train_batch_size`: 16
558
+ - `per_device_eval_batch_size`: 8
559
+ - `per_gpu_train_batch_size`: None
560
+ - `per_gpu_eval_batch_size`: None
561
+ - `gradient_accumulation_steps`: 8
562
+ - `eval_accumulation_steps`: None
563
+ - `torch_empty_cache_steps`: None
564
+ - `learning_rate`: 2e-05
565
+ - `weight_decay`: 0.0
566
+ - `adam_beta1`: 0.9
567
+ - `adam_beta2`: 0.999
568
+ - `adam_epsilon`: 1e-08
569
+ - `max_grad_norm`: 1.0
570
+ - `num_train_epochs`: 3
571
+ - `max_steps`: -1
572
+ - `lr_scheduler_type`: cosine
573
+ - `lr_scheduler_kwargs`: {}
574
+ - `warmup_ratio`: 0.1
575
+ - `warmup_steps`: 0
576
+ - `log_level`: passive
577
+ - `log_level_replica`: warning
578
+ - `log_on_each_node`: True
579
+ - `logging_nan_inf_filter`: True
580
+ - `save_safetensors`: True
581
+ - `save_on_each_node`: False
582
+ - `save_only_model`: False
583
+ - `restore_callback_states_from_checkpoint`: False
584
+ - `no_cuda`: False
585
+ - `use_cpu`: False
586
+ - `use_mps_device`: False
587
+ - `seed`: 42
588
+ - `data_seed`: None
589
+ - `jit_mode_eval`: False
590
+ - `use_ipex`: False
591
+ - `bf16`: False
592
+ - `fp16`: False
593
+ - `fp16_opt_level`: O1
594
+ - `half_precision_backend`: auto
595
+ - `bf16_full_eval`: False
596
+ - `fp16_full_eval`: False
597
+ - `tf32`: True
598
+ - `local_rank`: 0
599
+ - `ddp_backend`: None
600
+ - `tpu_num_cores`: None
601
+ - `tpu_metrics_debug`: False
602
+ - `debug`: []
603
+ - `dataloader_drop_last`: False
604
+ - `dataloader_num_workers`: 0
605
+ - `dataloader_prefetch_factor`: None
606
+ - `past_index`: -1
607
+ - `disable_tqdm`: False
608
+ - `remove_unused_columns`: True
609
+ - `label_names`: None
610
+ - `load_best_model_at_end`: True
611
+ - `ignore_data_skip`: False
612
+ - `fsdp`: []
613
+ - `fsdp_min_num_params`: 0
614
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
615
+ - `fsdp_transformer_layer_cls_to_wrap`: None
616
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
617
+ - `deepspeed`: None
618
+ - `label_smoothing_factor`: 0.0
619
+ - `optim`: adamw_torch_fused
620
+ - `optim_args`: None
621
+ - `adafactor`: False
622
+ - `group_by_length`: False
623
+ - `length_column_name`: length
624
+ - `ddp_find_unused_parameters`: None
625
+ - `ddp_bucket_cap_mb`: None
626
+ - `ddp_broadcast_buffers`: False
627
+ - `dataloader_pin_memory`: True
628
+ - `dataloader_persistent_workers`: False
629
+ - `skip_memory_metrics`: True
630
+ - `use_legacy_prediction_loop`: False
631
+ - `push_to_hub`: False
632
+ - `resume_from_checkpoint`: None
633
+ - `hub_model_id`: None
634
+ - `hub_strategy`: every_save
635
+ - `hub_private_repo`: None
636
+ - `hub_always_push`: False
637
+ - `gradient_checkpointing`: False
638
+ - `gradient_checkpointing_kwargs`: None
639
+ - `include_inputs_for_metrics`: False
640
+ - `include_for_metrics`: []
641
+ - `eval_do_concat_batches`: True
642
+ - `fp16_backend`: auto
643
+ - `push_to_hub_model_id`: None
644
+ - `push_to_hub_organization`: None
645
+ - `mp_parameters`:
646
+ - `auto_find_batch_size`: False
647
+ - `full_determinism`: False
648
+ - `torchdynamo`: None
649
+ - `ray_scope`: last
650
+ - `ddp_timeout`: 1800
651
+ - `torch_compile`: False
652
+ - `torch_compile_backend`: None
653
+ - `torch_compile_mode`: None
654
+ - `dispatch_batches`: None
655
+ - `split_batches`: None
656
+ - `include_tokens_per_second`: False
657
+ - `include_num_input_tokens_seen`: False
658
+ - `neftune_noise_alpha`: None
659
+ - `optim_target_modules`: None
660
+ - `batch_eval_metrics`: False
661
+ - `eval_on_start`: False
662
+ - `use_liger_kernel`: False
663
+ - `eval_use_gather_object`: False
664
+ - `average_tokens_across_devices`: False
665
+ - `prompts`: None
666
+ - `batch_sampler`: no_duplicates
667
+ - `multi_dataset_batch_sampler`: proportional
668
+
669
+ </details>
670
+
671
+ ### Training Logs
672
+ | Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
673
+ |:------:|:----:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
674
+ | 0.1034 | 5 | 29.8712 | - | - | - | - | - |
675
+ | 0.2067 | 10 | 26.1323 | - | - | - | - | - |
676
+ | 0.3101 | 15 | 17.8585 | - | - | - | - | - |
677
+ | 0.4134 | 20 | 14.0232 | - | - | - | - | - |
678
+ | 0.5168 | 25 | 11.6897 | - | - | - | - | - |
679
+ | 0.6202 | 30 | 10.8431 | - | - | - | - | - |
680
+ | 0.7235 | 35 | 9.264 | - | - | - | - | - |
681
+ | 0.8269 | 40 | 11.2186 | - | - | - | - | - |
682
+ | 0.9302 | 45 | 9.9143 | - | - | - | - | - |
683
+ | 1.0 | 49 | - | 0.7134 | 0.7110 | 0.6902 | 0.6341 | 0.5282 |
684
+ | 1.0207 | 50 | 7.2581 | - | - | - | - | - |
685
+ | 1.1240 | 55 | 6.066 | - | - | - | - | - |
686
+ | 1.2274 | 60 | 6.3626 | - | - | - | - | - |
687
+ | 1.3307 | 65 | 6.8135 | - | - | - | - | - |
688
+ | 1.4341 | 70 | 5.5556 | - | - | - | - | - |
689
+ | 1.5375 | 75 | 6.0144 | - | - | - | - | - |
690
+ | 1.6408 | 80 | 6.1965 | - | - | - | - | - |
691
+ | 1.7442 | 85 | 5.596 | - | - | - | - | - |
692
+ | 1.8475 | 90 | 6.631 | - | - | - | - | - |
693
+ | 1.9509 | 95 | 6.3319 | - | - | - | - | - |
694
+ | 2.0 | 98 | - | 0.7331 | 0.7304 | 0.7074 | 0.6569 | 0.5477 |
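The Epoch column in the log can be cross-checked against the reported dataset size and batch settings. The sketch below assumes training on a single device, which the card does not state: 6,190 samples at 16 per batch give 387 batches per epoch, and each optimizer step consumes 8 of them through gradient accumulation.

```python
import math

train_samples = 6190   # from "Size: 6,190 training samples"
per_device_batch = 16  # per_device_train_batch_size
grad_accum = 8         # gradient_accumulation_steps

# dataloader_drop_last is False, so the final partial batch is kept.
batches_per_epoch = math.ceil(train_samples / per_device_batch)
effective_batch = per_device_batch * grad_accum

def epoch_fraction(step_in_epoch):
    """Fraction of an epoch completed after a number of optimizer steps."""
    return step_in_epoch * grad_accum / batches_per_epoch

print(batches_per_epoch, effective_batch)
for step in (5, 10, 45):
    print(step, round(epoch_fraction(step), 4))
```

Under these assumptions the computed fractions match the logged values for steps 5, 10 and 45 (0.1034, 0.2067 and 0.9302), and the effective batch size is 128 pairs per optimizer step.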
695
+
696
+
697
+ ### Framework Versions
698
+ - Python: 3.13.3
699
+ - Sentence Transformers: 3.4.0
700
+ - Transformers: 4.48.1
701
+ - PyTorch: 2.6.0+cu126
702
+ - Accelerate: 1.3.0
703
+ - Datasets: 3.2.0
704
+ - Tokenizers: 0.21.1
705
+
706
+ ## Citation
707
+
708
+ ### BibTeX
709
+
710
+ #### Sentence Transformers
711
+ ```bibtex
712
+ @inproceedings{reimers-2019-sentence-bert,
713
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
714
+ author = "Reimers, Nils and Gurevych, Iryna",
715
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
716
+ month = "11",
717
+ year = "2019",
718
+ publisher = "Association for Computational Linguistics",
719
+ url = "https://arxiv.org/abs/1908.10084",
720
+ }
721
+ ```
722
+
723
+ #### MatryoshkaLoss
724
+ ```bibtex
725
+ @misc{kusupati2024matryoshka,
726
+ title={Matryoshka Representation Learning},
727
+ author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
728
+ year={2024},
729
+ eprint={2205.13147},
730
+ archivePrefix={arXiv},
731
+ primaryClass={cs.LG}
732
+ }
733
+ ```
734
+
735
+ #### MultipleNegativesRankingLoss
736
+ ```bibtex
737
+ @misc{henderson2017efficient,
738
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
739
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
740
+ year={2017},
741
+ eprint={1705.00652},
742
+ archivePrefix={arXiv},
743
+ primaryClass={cs.CL}
744
+ }
745
+ ```
746
+
747
+ <!--
748
+ ## Glossary
749
+
750
+ *Clearly define terms in order to be accessible across audiences.*
751
+ -->
752
+
753
+ <!--
754
+ ## Model Card Authors
755
+
756
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
757
+ -->
758
+
759
+ <!--
760
+ ## Model Card Contact
761
+
762
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
763
+ -->
checkpoint-98/config.json ADDED
@@ -0,0 +1,47 @@
1
+ {
2
+ "_name_or_path": "nomic-ai/modernbert-embed-base",
3
+ "architectures": [
4
+ "ModernBertModel"
5
+ ],
6
+ "attention_bias": false,
7
+ "attention_dropout": 0.0,
8
+ "bos_token_id": 50281,
9
+ "classifier_activation": "gelu",
10
+ "classifier_bias": false,
11
+ "classifier_dropout": 0.0,
12
+ "classifier_pooling": "mean",
13
+ "cls_token_id": 50281,
14
+ "decoder_bias": true,
15
+ "deterministic_flash_attn": false,
16
+ "embedding_dropout": 0.0,
17
+ "eos_token_id": 50282,
18
+ "global_attn_every_n_layers": 3,
19
+ "global_rope_theta": 160000.0,
20
+ "gradient_checkpointing": false,
21
+ "hidden_activation": "gelu",
22
+ "hidden_size": 768,
23
+ "initializer_cutoff_factor": 2.0,
24
+ "initializer_range": 0.02,
25
+ "intermediate_size": 1152,
26
+ "layer_norm_eps": 1e-05,
27
+ "local_attention": 128,
28
+ "local_rope_theta": 10000.0,
29
+ "max_position_embeddings": 8192,
30
+ "mlp_bias": false,
31
+ "mlp_dropout": 0.0,
32
+ "model_type": "modernbert",
33
+ "norm_bias": false,
34
+ "norm_eps": 1e-05,
35
+ "num_attention_heads": 12,
36
+ "num_hidden_layers": 22,
37
+ "pad_token_id": 50283,
38
+ "position_embedding_type": "absolute",
39
+ "reference_compile": false,
40
+ "repad_logits_with_grad": false,
41
+ "sep_token_id": 50282,
42
+ "sparse_pred_ignore_index": -100,
43
+ "sparse_prediction": false,
44
+ "torch_dtype": "float32",
45
+ "transformers_version": "4.48.1",
46
+ "vocab_size": 50368
47
+ }
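Two fields in this config, `global_attn_every_n_layers: 3` and `local_attention: 128`, describe how ModernBERT alternates between full attention and a sliding window. The sketch below lists which of the 22 layers would be global under the assumption that a layer is global when its index is divisible by `global_attn_every_n_layers`. That rule reflects my reading of the Transformers implementation and should be verified against the library source; `attention_kind` is a hypothetical helper.

```python
# Values copied from the config above.
config = {
    "num_hidden_layers": 22,
    "global_attn_every_n_layers": 3,
    "local_attention": 128,
}

def attention_kind(layer_id, every_n):
    """Assumed rule: layers whose index is a multiple of every_n attend globally."""
    return "global" if layer_id % every_n == 0 else "local"

layout = [
    attention_kind(i, config["global_attn_every_n_layers"])
    for i in range(config["num_hidden_layers"])
]
global_layers = [i for i, kind in enumerate(layout) if kind == "global"]
local_layers = [i for i, kind in enumerate(layout) if kind == "local"]

print("global:", global_layers)
print("local window (tokens):", config["local_attention"])
```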
checkpoint-98/config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "3.4.0",
4
+ "transformers": "4.48.1",
5
+ "pytorch": "2.6.0+cu126"
6
+ },
7
+ "prompts": {},
8
+ "default_prompt_name": null,
9
+ "similarity_fn_name": "cosine"
10
+ }
checkpoint-98/modules.json ADDED
@@ -0,0 +1,20 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Normalize",
18
+ "type": "sentence_transformers.models.Normalize"
19
+ }
20
+ ]
checkpoint-98/optimizer.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9399f30e389d2f7bcc072fe0d3340d13ad3d3e3d5d0c4ee5a28debd474834ae6
3
+ size 1192228922
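The three lines above are a Git LFS pointer, not the optimizer state itself: they record the spec version, the SHA-256 digest of the real file, and its size in bytes. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its version, hash and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }

# Pointer contents copied from the optimizer.pt entry above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:9399f30e389d2f7bcc072fe0d3340d13ad3d3e3d5d0c4ee5a28debd474834ae6
size 1192228922
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], round(info["size_bytes"] / 1e9, 2), "GB")
```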
checkpoint-98/rng_state.pth ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7a49b5e032e7616d6731043e2e3391e1aede7da9f3756ee069faddb514dbe84b
3
+ size 14244
checkpoint-98/scheduler.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:69f649a632e5aad6e06eb989cd5eadbffaf28fd582c8cdd02c88f011eec11126
3
+ size 1064
checkpoint-98/sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 512,
3
+ "do_lower_case": false
4
+ }
checkpoint-98/special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": true,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
checkpoint-98/tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
checkpoint-98/tokenizer_config.json ADDED
@@ -0,0 +1,945 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "|||IP_ADDRESS|||",
5
+ "lstrip": false,
6
+ "normalized": true,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": false
10
+ },
11
+ "1": {
12
+ "content": "<|padding|>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "50254": {
20
+ "content": " ",
21
+ "lstrip": false,
22
+ "normalized": true,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": false
26
+ },
27
+ "50255": {
28
+ "content": " ",
29
+ "lstrip": false,
30
+ "normalized": true,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": false
34
+ },
35
+ "50256": {
36
+ "content": " ",
37
+ "lstrip": false,
38
+ "normalized": true,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": false
42
+ },
43
+ "50257": {
44
+ "content": " ",
45
+ "lstrip": false,
46
+ "normalized": true,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": false
50
+ },
51
+ "50258": {
52
+ "content": " ",
53
+ "lstrip": false,
54
+ "normalized": true,
55
+ "rstrip": false,
56
+ "single_word": false,
57
+ "special": false
58
+ },
59
+ "50259": {
60
+ "content": " ",
61
+ "lstrip": false,
62
+ "normalized": true,
63
+ "rstrip": false,
64
+ "single_word": false,
65
+ "special": false
66
+ },
67
+ "50260": {
68
+ "content": " ",
69
+ "lstrip": false,
70
+ "normalized": true,
71
+ "rstrip": false,
72
+ "single_word": false,
73
+ "special": false
74
+ },
75
+ "50261": {
76
+ "content": " ",
77
+ "lstrip": false,
78
+ "normalized": true,
79
+ "rstrip": false,
80
+ "single_word": false,
81
+ "special": false
82
+ },
83
+ "50262": {
84
+ "content": " ",
85
+ "lstrip": false,
86
+ "normalized": true,
87
+ "rstrip": false,
88
+ "single_word": false,
89
+ "special": false
90
+ },
91
+ "50263": {
92
+ "content": " ",
93
+ "lstrip": false,
94
+ "normalized": true,
95
+ "rstrip": false,
96
+ "single_word": false,
97
+ "special": false
98
+ },
99
+ "50264": {
100
+ "content": " ",
101
+ "lstrip": false,
102
+ "normalized": true,
103
+ "rstrip": false,
104
+ "single_word": false,
105
+ "special": false
106
+ },
107
+ "50265": {
108
+ "content": " ",
109
+ "lstrip": false,
110
+ "normalized": true,
111
+ "rstrip": false,
112
+ "single_word": false,
113
+ "special": false
114
+ },
115
+ "50266": {
116
+ "content": " ",
117
+ "lstrip": false,
118
+ "normalized": true,
119
+ "rstrip": false,
120
+ "single_word": false,
121
+ "special": false
122
+ },
123
+ "50267": {
124
+ "content": " ",
125
+ "lstrip": false,
126
+ "normalized": true,
127
+ "rstrip": false,
128
+ "single_word": false,
129
+ "special": false
130
+ },
131
+ "50268": {
132
+ "content": " ",
133
+ "lstrip": false,
134
+ "normalized": true,
135
+ "rstrip": false,
136
+ "single_word": false,
137
+ "special": false
138
+ },
139
+ "50269": {
140
+ "content": " ",
141
+ "lstrip": false,
142
+ "normalized": true,
143
+ "rstrip": false,
144
+ "single_word": false,
145
+ "special": false
146
+ },
147
+ "50270": {
148
+ "content": " ",
149
+ "lstrip": false,
150
+ "normalized": true,
151
+ "rstrip": false,
152
+ "single_word": false,
153
+ "special": false
154
+ },
155
+ "50271": {
156
+ "content": " ",
157
+ "lstrip": false,
158
+ "normalized": true,
159
+ "rstrip": false,
160
+ "single_word": false,
161
+ "special": false
162
+ },
163
+ "50272": {
164
+ "content": " ",
165
+ "lstrip": false,
166
+ "normalized": true,
167
+ "rstrip": false,
168
+ "single_word": false,
169
+ "special": false
170
+ },
171
+ "50273": {
172
+ "content": " ",
173
+ "lstrip": false,
174
+ "normalized": true,
175
+ "rstrip": false,
176
+ "single_word": false,
177
+ "special": false
178
+ },
179
+ "50274": {
180
+ "content": " ",
181
+ "lstrip": false,
182
+ "normalized": true,
183
+ "rstrip": false,
184
+ "single_word": false,
185
+ "special": false
186
+ },
187
+ "50275": {
188
+ "content": " ",
189
+ "lstrip": false,
190
+ "normalized": true,
191
+ "rstrip": false,
192
+ "single_word": false,
193
+ "special": false
194
+ },
195
+ "50276": {
196
+ "content": " ",
197
+ "lstrip": false,
198
+ "normalized": true,
199
+ "rstrip": false,
200
+ "single_word": false,
201
+ "special": false
202
+ },
203
+ "50277": {
204
+ "content": "|||EMAIL_ADDRESS|||",
205
+ "lstrip": false,
206
+ "normalized": true,
207
+ "rstrip": false,
208
+ "single_word": false,
209
+ "special": false
210
+ },
211
+ "50278": {
212
+ "content": "|||PHONE_NUMBER|||",
213
+ "lstrip": false,
214
+ "normalized": true,
215
+ "rstrip": false,
216
+ "single_word": false,
217
+ "special": false
218
+ },
219
+ "50279": {
220
+ "content": "<|endoftext|>",
221
+ "lstrip": false,
222
+ "normalized": false,
223
+ "rstrip": false,
224
+ "single_word": false,
225
+ "special": true
226
+ },
227
+ "50280": {
228
+ "content": "[UNK]",
229
+ "lstrip": false,
230
+ "normalized": false,
231
+ "rstrip": false,
232
+ "single_word": false,
233
+ "special": true
234
+ },
235
+ "50281": {
236
+ "content": "[CLS]",
237
+ "lstrip": false,
238
+ "normalized": false,
239
+ "rstrip": false,
240
+ "single_word": false,
241
+ "special": true
242
+ },
243
+ "50282": {
244
+ "content": "[SEP]",
245
+ "lstrip": false,
246
+ "normalized": false,
247
+ "rstrip": false,
248
+ "single_word": false,
249
+ "special": true
250
+ },
251
+ "50283": {
252
+ "content": "[PAD]",
253
+ "lstrip": false,
254
+ "normalized": false,
255
+ "rstrip": false,
256
+ "single_word": false,
257
+ "special": true
258
+ },
259
+ "50284": {
260
+ "content": "[MASK]",
261
+ "lstrip": true,
262
+ "normalized": false,
263
+ "rstrip": false,
264
+ "single_word": false,
265
+ "special": true
266
+ },
267
+ "50285": {
268
+ "content": "[unused0]",
269
+ "lstrip": false,
270
+ "normalized": true,
271
+ "rstrip": false,
272
+ "single_word": false,
273
+ "special": false
274
+ },
275
+ "50286": {
276
+ "content": "[unused1]",
277
+ "lstrip": false,
278
+ "normalized": true,
279
+ "rstrip": false,
280
+ "single_word": false,
281
+ "special": false
282
+ },
283
+ "50287": {
284
+ "content": "[unused2]",
285
+ "lstrip": false,
286
+ "normalized": true,
287
+ "rstrip": false,
288
+ "single_word": false,
289
+ "special": false
290
+ },
291
+ "50288": {
292
+ "content": "[unused3]",
293
+ "lstrip": false,
294
+ "normalized": true,
295
+ "rstrip": false,
296
+ "single_word": false,
297
+ "special": false
298
+ },
299
+ "50289": {
300
+ "content": "[unused4]",
301
+ "lstrip": false,
302
+ "normalized": true,
303
+ "rstrip": false,
304
+ "single_word": false,
305
+ "special": false
306
+ },
307
+ "50290": {
308
+ "content": "[unused5]",
309
+ "lstrip": false,
310
+ "normalized": true,
311
+ "rstrip": false,
312
+ "single_word": false,
313
+ "special": false
314
+ },
315
+ "50291": {
316
+ "content": "[unused6]",
317
+ "lstrip": false,
318
+ "normalized": true,
319
+ "rstrip": false,
320
+ "single_word": false,
321
+ "special": false
322
+ },
323
+ "50292": {
324
+ "content": "[unused7]",
325
+ "lstrip": false,
326
+ "normalized": true,
327
+ "rstrip": false,
328
+ "single_word": false,
329
+ "special": false
330
+ },
331
+ "50293": {
332
+ "content": "[unused8]",
333
+ "lstrip": false,
334
+ "normalized": true,
335
+ "rstrip": false,
336
+ "single_word": false,
337
+ "special": false
338
+ },
339
+ "50294": {
340
+ "content": "[unused9]",
341
+ "lstrip": false,
342
+ "normalized": true,
343
+ "rstrip": false,
344
+ "single_word": false,
345
+ "special": false
346
+ },
347
+ "50295": {
348
+ "content": "[unused10]",
349
+ "lstrip": false,
350
+ "normalized": true,
351
+ "rstrip": false,
352
+ "single_word": false,
353
+ "special": false
354
+ },
355
+ "50296": {
356
+ "content": "[unused11]",
357
+ "lstrip": false,
358
+ "normalized": true,
359
+ "rstrip": false,
360
+ "single_word": false,
361
+ "special": false
362
+ },
363
+ "50297": {
364
+ "content": "[unused12]",
365
+ "lstrip": false,
366
+ "normalized": true,
367
+ "rstrip": false,
368
+ "single_word": false,
369
+ "special": false
370
+ },
371
+ "50298": {
372
+ "content": "[unused13]",
373
+ "lstrip": false,
374
+ "normalized": true,
375
+ "rstrip": false,
376
+ "single_word": false,
377
+ "special": false
378
+ },
379
+ "50299": {
380
+ "content": "[unused14]",
381
+ "lstrip": false,
382
+ "normalized": true,
383
+ "rstrip": false,
384
+ "single_word": false,
385
+ "special": false
386
+ },
387
+ "50300": {
388
+ "content": "[unused15]",
389
+ "lstrip": false,
390
+ "normalized": true,
391
+ "rstrip": false,
392
+ "single_word": false,
393
+ "special": false
394
+ },
395
+ "50301": {
396
+ "content": "[unused16]",
397
+ "lstrip": false,
398
+ "normalized": true,
399
+ "rstrip": false,
400
+ "single_word": false,
401
+ "special": false
402
+ },
403
+ "50302": {
404
+ "content": "[unused17]",
405
+ "lstrip": false,
406
+ "normalized": true,
407
+ "rstrip": false,
408
+ "single_word": false,
409
+ "special": false
410
+ },
411
+ "50303": {
412
+ "content": "[unused18]",
413
+ "lstrip": false,
414
+ "normalized": true,
415
+ "rstrip": false,
416
+ "single_word": false,
417
+ "special": false
418
+ },
419
+ "50304": {
420
+ "content": "[unused19]",
421
+ "lstrip": false,
422
+ "normalized": true,
423
+ "rstrip": false,
424
+ "single_word": false,
425
+ "special": false
426
+ },
427
+ "50305": {
428
+ "content": "[unused20]",
429
+ "lstrip": false,
430
+ "normalized": true,
431
+ "rstrip": false,
432
+ "single_word": false,
433
+ "special": false
434
+ },
435
+ "50306": {
436
+ "content": "[unused21]",
437
+ "lstrip": false,
438
+ "normalized": true,
439
+ "rstrip": false,
440
+ "single_word": false,
441
+ "special": false
442
+ },
443
+ "50307": {
444
+ "content": "[unused22]",
445
+ "lstrip": false,
446
+ "normalized": true,
447
+ "rstrip": false,
448
+ "single_word": false,
449
+ "special": false
450
+ },
451
+ "50308": {
452
+ "content": "[unused23]",
453
+ "lstrip": false,
454
+ "normalized": true,
455
+ "rstrip": false,
456
+ "single_word": false,
457
+ "special": false
458
+ },
459
+ "50309": {
460
+ "content": "[unused24]",
461
+ "lstrip": false,
462
+ "normalized": true,
463
+ "rstrip": false,
464
+ "single_word": false,
465
+ "special": false
466
+ },
467
+ "50310": {
468
+ "content": "[unused25]",
469
+ "lstrip": false,
470
+ "normalized": true,
471
+ "rstrip": false,
472
+ "single_word": false,
473
+ "special": false
474
+ },
475
+ "50311": {
476
+ "content": "[unused26]",
477
+ "lstrip": false,
478
+ "normalized": true,
479
+ "rstrip": false,
480
+ "single_word": false,
481
+ "special": false
482
+ },
483
+ "50312": {
484
+ "content": "[unused27]",
485
+ "lstrip": false,
486
+ "normalized": true,
487
+ "rstrip": false,
488
+ "single_word": false,
489
+ "special": false
490
+ },
491
+ "50313": {
492
+ "content": "[unused28]",
493
+ "lstrip": false,
494
+ "normalized": true,
495
+ "rstrip": false,
496
+ "single_word": false,
497
+ "special": false
498
+ },
499
+ "50314": {
500
+ "content": "[unused29]",
501
+ "lstrip": false,
502
+ "normalized": true,
503
+ "rstrip": false,
504
+ "single_word": false,
505
+ "special": false
506
+ },
507
+ "50315": {
508
+ "content": "[unused30]",
509
+ "lstrip": false,
510
+ "normalized": true,
511
+ "rstrip": false,
512
+ "single_word": false,
513
+ "special": false
514
+ },
515
+ "50316": {
516
+ "content": "[unused31]",
517
+ "lstrip": false,
518
+ "normalized": true,
519
+ "rstrip": false,
520
+ "single_word": false,
521
+ "special": false
522
+ },
523
+ "50317": {
524
+ "content": "[unused32]",
525
+ "lstrip": false,
526
+ "normalized": true,
527
+ "rstrip": false,
528
+ "single_word": false,
529
+ "special": false
530
+ },
531
+ "50318": {
532
+ "content": "[unused33]",
533
+ "lstrip": false,
534
+ "normalized": true,
535
+ "rstrip": false,
536
+ "single_word": false,
537
+ "special": false
538
+ },
539
+ "50319": {
540
+ "content": "[unused34]",
541
+ "lstrip": false,
542
+ "normalized": true,
543
+ "rstrip": false,
544
+ "single_word": false,
545
+ "special": false
546
+ },
547
+ "50320": {
548
+ "content": "[unused35]",
549
+ "lstrip": false,
550
+ "normalized": true,
551
+ "rstrip": false,
552
+ "single_word": false,
553
+ "special": false
554
+ },
555
+ "50321": {
556
+ "content": "[unused36]",
557
+ "lstrip": false,
558
+ "normalized": true,
559
+ "rstrip": false,
560
+ "single_word": false,
561
+ "special": false
562
+ },
563
+ "50322": {
564
+ "content": "[unused37]",
565
+ "lstrip": false,
566
+ "normalized": true,
567
+ "rstrip": false,
568
+ "single_word": false,
569
+ "special": false
570
+ },
571
+ "50323": {
572
+ "content": "[unused38]",
573
+ "lstrip": false,
574
+ "normalized": true,
575
+ "rstrip": false,
576
+ "single_word": false,
577
+ "special": false
578
+ },
579
+ "50324": {
580
+ "content": "[unused39]",
581
+ "lstrip": false,
582
+ "normalized": true,
583
+ "rstrip": false,
584
+ "single_word": false,
585
+ "special": false
586
+ },
587
+ "50325": {
588
+ "content": "[unused40]",
589
+ "lstrip": false,
590
+ "normalized": true,
591
+ "rstrip": false,
592
+ "single_word": false,
593
+ "special": false
594
+ },
595
+ "50326": {
596
+ "content": "[unused41]",
597
+ "lstrip": false,
598
+ "normalized": true,
599
+ "rstrip": false,
600
+ "single_word": false,
601
+ "special": false
602
+ },
603
+ "50327": {
604
+ "content": "[unused42]",
605
+ "lstrip": false,
606
+ "normalized": true,
607
+ "rstrip": false,
608
+ "single_word": false,
609
+ "special": false
610
+ },
611
+ "50328": {
612
+ "content": "[unused43]",
613
+ "lstrip": false,
614
+ "normalized": true,
615
+ "rstrip": false,
616
+ "single_word": false,
617
+ "special": false
618
+ },
619
+ "50329": {
620
+ "content": "[unused44]",
621
+ "lstrip": false,
622
+ "normalized": true,
623
+ "rstrip": false,
624
+ "single_word": false,
625
+ "special": false
626
+ },
627
+ "50330": {
628
+ "content": "[unused45]",
629
+ "lstrip": false,
630
+ "normalized": true,
631
+ "rstrip": false,
632
+ "single_word": false,
633
+ "special": false
634
+ },
635
+ "50331": {
636
+ "content": "[unused46]",
637
+ "lstrip": false,
638
+ "normalized": true,
639
+ "rstrip": false,
640
+ "single_word": false,
641
+ "special": false
642
+ },
643
+ "50332": {
644
+ "content": "[unused47]",
645
+ "lstrip": false,
646
+ "normalized": true,
647
+ "rstrip": false,
648
+ "single_word": false,
649
+ "special": false
650
+ },
651
+ "50333": {
652
+ "content": "[unused48]",
653
+ "lstrip": false,
654
+ "normalized": true,
655
+ "rstrip": false,
656
+ "single_word": false,
657
+ "special": false
658
+ },
659
+ "50334": {
660
+ "content": "[unused49]",
661
+ "lstrip": false,
662
+ "normalized": true,
663
+ "rstrip": false,
664
+ "single_word": false,
665
+ "special": false
666
+ },
667
+ "50335": {
668
+ "content": "[unused50]",
669
+ "lstrip": false,
670
+ "normalized": true,
671
+ "rstrip": false,
672
+ "single_word": false,
673
+ "special": false
674
+ },
675
+ "50336": {
676
+ "content": "[unused51]",
677
+ "lstrip": false,
678
+ "normalized": true,
679
+ "rstrip": false,
680
+ "single_word": false,
681
+ "special": false
682
+ },
683
+ "50337": {
684
+ "content": "[unused52]",
685
+ "lstrip": false,
686
+ "normalized": true,
687
+ "rstrip": false,
688
+ "single_word": false,
689
+ "special": false
690
+ },
691
+ "50338": {
692
+ "content": "[unused53]",
693
+ "lstrip": false,
694
+ "normalized": true,
695
+ "rstrip": false,
696
+ "single_word": false,
697
+ "special": false
698
+ },
699
+ "50339": {
700
+ "content": "[unused54]",
701
+ "lstrip": false,
702
+ "normalized": true,
703
+ "rstrip": false,
704
+ "single_word": false,
705
+ "special": false
706
+ },
707
+ "50340": {
708
+ "content": "[unused55]",
709
+ "lstrip": false,
710
+ "normalized": true,
711
+ "rstrip": false,
712
+ "single_word": false,
713
+ "special": false
714
+ },
715
+ "50341": {
716
+ "content": "[unused56]",
717
+ "lstrip": false,
718
+ "normalized": true,
719
+ "rstrip": false,
720
+ "single_word": false,
721
+ "special": false
722
+ },
723
+ "50342": {
724
+ "content": "[unused57]",
725
+ "lstrip": false,
726
+ "normalized": true,
727
+ "rstrip": false,
728
+ "single_word": false,
729
+ "special": false
730
+ },
731
+ "50343": {
732
+ "content": "[unused58]",
733
+ "lstrip": false,
734
+ "normalized": true,
735
+ "rstrip": false,
736
+ "single_word": false,
737
+ "special": false
738
+ },
739
+ "50344": {
740
+ "content": "[unused59]",
741
+ "lstrip": false,
742
+ "normalized": true,
743
+ "rstrip": false,
744
+ "single_word": false,
745
+ "special": false
746
+ },
747
+ "50345": {
748
+ "content": "[unused60]",
749
+ "lstrip": false,
750
+ "normalized": true,
751
+ "rstrip": false,
752
+ "single_word": false,
753
+ "special": false
754
+ },
755
+ "50346": {
756
+ "content": "[unused61]",
757
+ "lstrip": false,
758
+ "normalized": true,
759
+ "rstrip": false,
760
+ "single_word": false,
761
+ "special": false
762
+ },
763
+ "50347": {
764
+ "content": "[unused62]",
765
+ "lstrip": false,
766
+ "normalized": true,
767
+ "rstrip": false,
768
+ "single_word": false,
769
+ "special": false
770
+ },
771
+ "50348": {
772
+ "content": "[unused63]",
773
+ "lstrip": false,
774
+ "normalized": true,
775
+ "rstrip": false,
776
+ "single_word": false,
777
+ "special": false
778
+ },
779
+ "50349": {
780
+ "content": "[unused64]",
781
+ "lstrip": false,
782
+ "normalized": true,
783
+ "rstrip": false,
784
+ "single_word": false,
785
+ "special": false
786
+ },
787
+ "50350": {
788
+ "content": "[unused65]",
789
+ "lstrip": false,
790
+ "normalized": true,
791
+ "rstrip": false,
792
+ "single_word": false,
793
+ "special": false
794
+ },
795
+ "50351": {
796
+ "content": "[unused66]",
797
+ "lstrip": false,
798
+ "normalized": true,
799
+ "rstrip": false,
800
+ "single_word": false,
801
+ "special": false
802
+ },
803
+ "50352": {
804
+ "content": "[unused67]",
805
+ "lstrip": false,
806
+ "normalized": true,
807
+ "rstrip": false,
808
+ "single_word": false,
809
+ "special": false
810
+ },
811
+ "50353": {
812
+ "content": "[unused68]",
813
+ "lstrip": false,
814
+ "normalized": true,
815
+ "rstrip": false,
816
+ "single_word": false,
817
+ "special": false
818
+ },
819
+ "50354": {
820
+ "content": "[unused69]",
821
+ "lstrip": false,
822
+ "normalized": true,
823
+ "rstrip": false,
824
+ "single_word": false,
825
+ "special": false
826
+ },
827
+ "50355": {
828
+ "content": "[unused70]",
829
+ "lstrip": false,
830
+ "normalized": true,
831
+ "rstrip": false,
832
+ "single_word": false,
833
+ "special": false
834
+ },
835
+ "50356": {
836
+ "content": "[unused71]",
837
+ "lstrip": false,
838
+ "normalized": true,
839
+ "rstrip": false,
840
+ "single_word": false,
841
+ "special": false
842
+ },
843
+ "50357": {
844
+ "content": "[unused72]",
845
+ "lstrip": false,
846
+ "normalized": true,
847
+ "rstrip": false,
848
+ "single_word": false,
849
+ "special": false
850
+ },
851
+ "50358": {
852
+ "content": "[unused73]",
853
+ "lstrip": false,
854
+ "normalized": true,
855
+ "rstrip": false,
856
+ "single_word": false,
857
+ "special": false
858
+ },
859
+ "50359": {
860
+ "content": "[unused74]",
861
+ "lstrip": false,
862
+ "normalized": true,
863
+ "rstrip": false,
864
+ "single_word": false,
865
+ "special": false
866
+ },
867
+ "50360": {
868
+ "content": "[unused75]",
869
+ "lstrip": false,
870
+ "normalized": true,
871
+ "rstrip": false,
872
+ "single_word": false,
873
+ "special": false
874
+ },
875
+ "50361": {
876
+ "content": "[unused76]",
877
+ "lstrip": false,
878
+ "normalized": true,
879
+ "rstrip": false,
880
+ "single_word": false,
881
+ "special": false
882
+ },
883
+ "50362": {
884
+ "content": "[unused77]",
885
+ "lstrip": false,
886
+ "normalized": true,
887
+ "rstrip": false,
888
+ "single_word": false,
889
+ "special": false
890
+ },
891
+ "50363": {
892
+ "content": "[unused78]",
893
+ "lstrip": false,
894
+ "normalized": true,
895
+ "rstrip": false,
896
+ "single_word": false,
897
+ "special": false
898
+ },
899
+ "50364": {
900
+ "content": "[unused79]",
901
+ "lstrip": false,
902
+ "normalized": true,
903
+ "rstrip": false,
904
+ "single_word": false,
905
+ "special": false
906
+ },
907
+ "50365": {
908
+ "content": "[unused80]",
909
+ "lstrip": false,
910
+ "normalized": true,
911
+ "rstrip": false,
912
+ "single_word": false,
913
+ "special": false
914
+ },
915
+ "50366": {
916
+ "content": "[unused81]",
917
+ "lstrip": false,
918
+ "normalized": true,
919
+ "rstrip": false,
920
+ "single_word": false,
921
+ "special": false
922
+ },
923
+ "50367": {
924
+ "content": "[unused82]",
925
+ "lstrip": false,
926
+ "normalized": true,
927
+ "rstrip": false,
928
+ "single_word": false,
929
+ "special": false
930
+ }
931
+ },
932
+ "clean_up_tokenization_spaces": true,
933
+ "cls_token": "[CLS]",
934
+ "extra_special_tokens": {},
935
+ "mask_token": "[MASK]",
936
+ "model_input_names": [
937
+ "input_ids",
938
+ "attention_mask"
939
+ ],
940
+ "model_max_length": 8192,
941
+ "pad_token": "[PAD]",
942
+ "sep_token": "[SEP]",
943
+ "tokenizer_class": "PreTrainedTokenizerFast",
944
+ "unk_token": "[UNK]"
945
+ }
checkpoint-98/trainer_state.json ADDED
@@ -0,0 +1,332 @@
1
+ {
2
+ "best_metric": 0.6569451636973174,
3
+ "best_model_checkpoint": "./output/modernbert_quickb\\checkpoint-98",
4
+ "epoch": 2.0,
5
+ "eval_steps": 500,
6
+ "global_step": 98,
7
+ "is_hyper_param_search": false,
8
+ "is_local_process_zero": true,
9
+ "is_world_process_zero": true,
10
+ "log_history": [
11
+ {
12
+ "epoch": 0.10335917312661498,
13
+ "grad_norm": 319.4144592285156,
14
+ "learning_rate": 6.666666666666667e-06,
15
+ "loss": 29.8712,
16
+ "step": 5
17
+ },
18
+ {
19
+ "epoch": 0.20671834625322996,
20
+ "grad_norm": 169.99363708496094,
21
+ "learning_rate": 1.3333333333333333e-05,
22
+ "loss": 26.1323,
23
+ "step": 10
24
+ },
25
+ {
26
+ "epoch": 0.31007751937984496,
27
+ "grad_norm": 103.89845275878906,
28
+ "learning_rate": 2e-05,
29
+ "loss": 17.8585,
30
+ "step": 15
31
+ },
32
+ {
33
+ "epoch": 0.4134366925064599,
34
+ "grad_norm": 86.83181762695312,
35
+ "learning_rate": 1.9925955354920265e-05,
36
+ "loss": 14.0232,
37
+ "step": 20
38
+ },
39
+ {
40
+ "epoch": 0.5167958656330749,
41
+ "grad_norm": 102.58989715576172,
42
+ "learning_rate": 1.9704917941574053e-05,
43
+ "loss": 11.6897,
44
+ "step": 25
45
+ },
46
+ {
47
+ "epoch": 0.6201550387596899,
48
+ "grad_norm": 80.32222747802734,
49
+ "learning_rate": 1.9340161087325483e-05,
50
+ "loss": 10.8431,
51
+ "step": 30
52
+ },
53
+ {
54
+ "epoch": 0.7235142118863049,
55
+ "grad_norm": 155.523681640625,
56
+ "learning_rate": 1.8837086450537195e-05,
57
+ "loss": 9.264,
58
+ "step": 35
59
+ },
60
+ {
61
+ "epoch": 0.8268733850129198,
62
+ "grad_norm": 110.1160888671875,
63
+ "learning_rate": 1.820314402779511e-05,
64
+ "loss": 11.2186,
65
+ "step": 40
66
+ },
67
+ {
68
+ "epoch": 0.9302325581395349,
69
+ "grad_norm": 105.31327819824219,
70
+ "learning_rate": 1.7447721827437822e-05,
71
+ "loss": 9.9143,
72
+ "step": 45
73
+ },
74
+ {
75
+ "epoch": 1.0,
76
+ "eval_dim_128_cosine_accuracy@1": 0.47674418604651164,
77
+ "eval_dim_128_cosine_accuracy@10": 0.7979651162790697,
78
+ "eval_dim_128_cosine_accuracy@3": 0.6584302325581395,
79
+ "eval_dim_128_cosine_accuracy@5": 0.7151162790697675,
80
+ "eval_dim_128_cosine_map@100": 0.5882659880635713,
81
+ "eval_dim_128_cosine_mrr@10": 0.5821745801033589,
82
+ "eval_dim_128_cosine_ndcg@10": 0.6341456264396685,
83
+ "eval_dim_128_cosine_precision@1": 0.47674418604651164,
84
+ "eval_dim_128_cosine_precision@10": 0.07979651162790696,
85
+ "eval_dim_128_cosine_precision@3": 0.2194767441860465,
86
+ "eval_dim_128_cosine_precision@5": 0.14302325581395348,
87
+ "eval_dim_128_cosine_recall@1": 0.47674418604651164,
88
+ "eval_dim_128_cosine_recall@10": 0.7979651162790697,
89
+ "eval_dim_128_cosine_recall@3": 0.6584302325581395,
90
+ "eval_dim_128_cosine_recall@5": 0.7151162790697675,
91
+ "eval_dim_256_cosine_accuracy@1": 0.5392441860465116,
92
+ "eval_dim_256_cosine_accuracy@10": 0.8473837209302325,
93
+ "eval_dim_256_cosine_accuracy@3": 0.7136627906976745,
94
+ "eval_dim_256_cosine_accuracy@5": 0.7645348837209303,
95
+ "eval_dim_256_cosine_map@100": 0.6454694833433305,
96
+ "eval_dim_256_cosine_mrr@10": 0.640224137135474,
97
+ "eval_dim_256_cosine_ndcg@10": 0.6901657805517794,
98
+ "eval_dim_256_cosine_precision@1": 0.5392441860465116,
99
+ "eval_dim_256_cosine_precision@10": 0.08473837209302325,
100
+ "eval_dim_256_cosine_precision@3": 0.23788759689922478,
101
+ "eval_dim_256_cosine_precision@5": 0.15290697674418602,
102
+ "eval_dim_256_cosine_recall@1": 0.5392441860465116,
103
+ "eval_dim_256_cosine_recall@10": 0.8473837209302325,
104
+ "eval_dim_256_cosine_recall@3": 0.7136627906976745,
105
+ "eval_dim_256_cosine_recall@5": 0.7645348837209303,
106
+ "eval_dim_512_cosine_accuracy@1": 0.5450581395348837,
107
+ "eval_dim_512_cosine_accuracy@10": 0.8808139534883721,
108
+ "eval_dim_512_cosine_accuracy@3": 0.7412790697674418,
109
+ "eval_dim_512_cosine_accuracy@5": 0.7994186046511628,
110
+ "eval_dim_512_cosine_map@100": 0.6607688904855182,
111
+ "eval_dim_512_cosine_mrr@10": 0.6569952011812478,
112
+ "eval_dim_512_cosine_ndcg@10": 0.7109774036246727,
113
+ "eval_dim_512_cosine_precision@1": 0.5450581395348837,
114
+ "eval_dim_512_cosine_precision@10": 0.0880813953488372,
115
+ "eval_dim_512_cosine_precision@3": 0.24709302325581395,
116
+ "eval_dim_512_cosine_precision@5": 0.15988372093023254,
117
+ "eval_dim_512_cosine_recall@1": 0.5450581395348837,
118
+ "eval_dim_512_cosine_recall@10": 0.8808139534883721,
119
+ "eval_dim_512_cosine_recall@3": 0.7412790697674418,
120
+ "eval_dim_512_cosine_recall@5": 0.7994186046511628,
121
+ "eval_dim_64_cosine_accuracy@1": 0.37790697674418605,
122
+ "eval_dim_64_cosine_accuracy@10": 0.6976744186046512,
123
+ "eval_dim_64_cosine_accuracy@3": 0.5421511627906976,
124
+ "eval_dim_64_cosine_accuracy@5": 0.5915697674418605,
125
+ "eval_dim_64_cosine_map@100": 0.48337713890549217,
126
+ "eval_dim_64_cosine_mrr@10": 0.4751395810262089,
127
+ "eval_dim_64_cosine_ndcg@10": 0.5281507062500577,
128
+ "eval_dim_64_cosine_precision@1": 0.37790697674418605,
129
+ "eval_dim_64_cosine_precision@10": 0.06976744186046512,
130
+ "eval_dim_64_cosine_precision@3": 0.18071705426356585,
131
+ "eval_dim_64_cosine_precision@5": 0.11831395348837208,
132
+ "eval_dim_64_cosine_recall@1": 0.37790697674418605,
133
+ "eval_dim_64_cosine_recall@10": 0.6976744186046512,
134
+ "eval_dim_64_cosine_recall@3": 0.5421511627906976,
135
+ "eval_dim_64_cosine_recall@5": 0.5915697674418605,
136
+ "eval_dim_768_cosine_accuracy@1": 0.5508720930232558,
137
+ "eval_dim_768_cosine_accuracy@10": 0.876453488372093,
138
+ "eval_dim_768_cosine_accuracy@3": 0.7427325581395349,
139
+ "eval_dim_768_cosine_accuracy@5": 0.8037790697674418,
140
+ "eval_dim_768_cosine_map@100": 0.6655205749696947,
141
+ "eval_dim_768_cosine_mrr@10": 0.6611186092654114,
142
+ "eval_dim_768_cosine_ndcg@10": 0.7133959457013295,
143
+ "eval_dim_768_cosine_precision@1": 0.5508720930232558,
144
+ "eval_dim_768_cosine_precision@10": 0.0876453488372093,
145
+ "eval_dim_768_cosine_precision@3": 0.24757751937984498,
146
+ "eval_dim_768_cosine_precision@5": 0.16075581395348837,
147
+ "eval_dim_768_cosine_recall@1": 0.5508720930232558,
148
+ "eval_dim_768_cosine_recall@10": 0.876453488372093,
149
+ "eval_dim_768_cosine_recall@3": 0.7427325581395349,
150
+ "eval_dim_768_cosine_recall@5": 0.8037790697674418,
151
+ "eval_runtime": 502.5762,
152
+ "eval_samples_per_second": 0.0,
153
+ "eval_sequential_score": 0.5281507062500577,
154
+ "eval_steps_per_second": 0.0,
155
+ "step": 49
156
+ },
157
+ {
158
+ "epoch": 1.020671834625323,
159
+ "grad_norm": 110.23486328125,
160
+ "learning_rate": 1.658200684320748e-05,
161
+ "loss": 7.2581,
162
+ "step": 50
163
+ },
164
+ {
165
+ "epoch": 1.124031007751938,
166
+ "grad_norm": 66.97112274169922,
167
+ "learning_rate": 1.5618819386853607e-05,
168
+ "loss": 6.066,
169
+ "step": 55
170
+ },
171
+ {
172
+ "epoch": 1.227390180878553,
173
+ "grad_norm": 76.00322723388672,
174
+ "learning_rate": 1.4572423233046386e-05,
175
+ "loss": 6.3626,
176
+ "step": 60
177
+ },
178
+ {
179
+ "epoch": 1.330749354005168,
180
+ "grad_norm": 62.68793487548828,
181
+ "learning_rate": 1.3458314388150115e-05,
182
+ "loss": 6.8135,
183
+ "step": 65
184
+ },
185
+ {
186
+ "epoch": 1.4341085271317828,
187
+ "grad_norm": 69.59709930419922,
188
+ "learning_rate": 1.2292991610964902e-05,
189
+ "loss": 5.5556,
190
+ "step": 70
191
+ },
192
+ {
193
+ "epoch": 1.5374677002583979,
194
+ "grad_norm": 143.95458984375,
195
+ "learning_rate": 1.1093712083778748e-05,
196
+ "loss": 6.0144,
197
+ "step": 75
198
+ },
199
+ {
200
+ "epoch": 1.6408268733850129,
201
+ "grad_norm": 71.6358413696289,
202
+ "learning_rate": 9.878235851980027e-06,
203
+ "loss": 6.1965,
204
+ "step": 80
205
+ },
206
+ {
207
+ "epoch": 1.744186046511628,
208
+ "grad_norm": 77.96407318115234,
209
+ "learning_rate": 8.664562816806022e-06,
210
+ "loss": 5.596,
211
+ "step": 85
212
+ },
213
+ {
214
+ "epoch": 1.847545219638243,
215
+ "grad_norm": 86.09881591796875,
216
+ "learning_rate": 7.470666176083193e-06,
217
+ "loss": 6.631,
218
+ "step": 90
219
+ },
220
+ {
221
+ "epoch": 1.950904392764858,
222
+ "grad_norm": 80.9687271118164,
223
+ "learning_rate": 6.314226260416383e-06,
224
+ "loss": 6.3319,
225
+ "step": 95
226
+ },
227
+ {
228
+ "epoch": 2.0,
229
+ "eval_dim_128_cosine_accuracy@1": 0.498546511627907,
230
+ "eval_dim_128_cosine_accuracy@10": 0.8226744186046512,
231
+ "eval_dim_128_cosine_accuracy@3": 0.6700581395348837,
232
+ "eval_dim_128_cosine_accuracy@5": 0.7369186046511628,
233
+ "eval_dim_128_cosine_map@100": 0.6099407824679366,
234
+ "eval_dim_128_cosine_mrr@10": 0.6045300387596899,
235
+ "eval_dim_128_cosine_ndcg@10": 0.6569451636973174,
236
+ "eval_dim_128_cosine_precision@1": 0.498546511627907,
237
+ "eval_dim_128_cosine_precision@10": 0.08226744186046511,
238
+ "eval_dim_128_cosine_precision@3": 0.22335271317829455,
239
+ "eval_dim_128_cosine_precision@5": 0.14738372093023255,
240
+ "eval_dim_128_cosine_recall@1": 0.498546511627907,
241
+ "eval_dim_128_cosine_recall@10": 0.8226744186046512,
242
+ "eval_dim_128_cosine_recall@3": 0.6700581395348837,
243
+ "eval_dim_128_cosine_recall@5": 0.7369186046511628,
244
+ "eval_dim_256_cosine_accuracy@1": 0.5552325581395349,
245
+ "eval_dim_256_cosine_accuracy@10": 0.8604651162790697,
246
+ "eval_dim_256_cosine_accuracy@3": 0.7354651162790697,
247
+ "eval_dim_256_cosine_accuracy@5": 0.7950581395348837,
248
+ "eval_dim_256_cosine_map@100": 0.6631022921972395,
249
+ "eval_dim_256_cosine_mrr@10": 0.6584688768918417,
250
+ "eval_dim_256_cosine_ndcg@10": 0.7073914263638542,
251
+ "eval_dim_256_cosine_precision@1": 0.5552325581395349,
252
+ "eval_dim_256_cosine_precision@10": 0.08604651162790695,
253
+ "eval_dim_256_cosine_precision@3": 0.24515503875968994,
254
+ "eval_dim_256_cosine_precision@5": 0.15901162790697673,
255
+ "eval_dim_256_cosine_recall@1": 0.5552325581395349,
256
+ "eval_dim_256_cosine_recall@10": 0.8604651162790697,
257
+ "eval_dim_256_cosine_recall@3": 0.7354651162790697,
258
+ "eval_dim_256_cosine_recall@5": 0.7950581395348837,
259
+ "eval_dim_512_cosine_accuracy@1": 0.5770348837209303,
260
+ "eval_dim_512_cosine_accuracy@10": 0.8822674418604651,
261
+ "eval_dim_512_cosine_accuracy@3": 0.7616279069767442,
262
+ "eval_dim_512_cosine_accuracy@5": 0.8197674418604651,
263
+ "eval_dim_512_cosine_map@100": 0.685728403908298,
264
+ "eval_dim_512_cosine_mrr@10": 0.6816127953119231,
265
+ "eval_dim_512_cosine_ndcg@10": 0.7303694393079266,
266
+ "eval_dim_512_cosine_precision@1": 0.5770348837209303,
267
+ "eval_dim_512_cosine_precision@10": 0.0882267441860465,
268
+ "eval_dim_512_cosine_precision@3": 0.25387596899224807,
269
+ "eval_dim_512_cosine_precision@5": 0.163953488372093,
270
+ "eval_dim_512_cosine_recall@1": 0.5770348837209303,
271
+ "eval_dim_512_cosine_recall@10": 0.8822674418604651,
272
+ "eval_dim_512_cosine_recall@3": 0.7616279069767442,
273
+ "eval_dim_512_cosine_recall@5": 0.8197674418604651,
274
+ "eval_dim_64_cosine_accuracy@1": 0.39680232558139533,
275
+ "eval_dim_64_cosine_accuracy@10": 0.7209302325581395,
276
+ "eval_dim_64_cosine_accuracy@3": 0.5523255813953488,
277
+ "eval_dim_64_cosine_accuracy@5": 0.626453488372093,
278
+ "eval_dim_64_cosine_map@100": 0.5014802548028123,
279
+ "eval_dim_64_cosine_mrr@10": 0.4936317598744924,
280
+ "eval_dim_64_cosine_ndcg@10": 0.5476937013127858,
281
+ "eval_dim_64_cosine_precision@1": 0.39680232558139533,
282
+ "eval_dim_64_cosine_precision@10": 0.07209302325581395,
283
+ "eval_dim_64_cosine_precision@3": 0.18410852713178294,
284
+ "eval_dim_64_cosine_precision@5": 0.12529069767441858,
285
+ "eval_dim_64_cosine_recall@1": 0.39680232558139533,
286
+ "eval_dim_64_cosine_recall@10": 0.7209302325581395,
287
+ "eval_dim_64_cosine_recall@3": 0.5523255813953488,
288
+ "eval_dim_64_cosine_recall@5": 0.626453488372093,
289
+ "eval_dim_768_cosine_accuracy@1": 0.5784883720930233,
290
+ "eval_dim_768_cosine_accuracy@10": 0.8880813953488372,
291
+ "eval_dim_768_cosine_accuracy@3": 0.7601744186046512,
292
+ "eval_dim_768_cosine_accuracy@5": 0.8197674418604651,
293
+ "eval_dim_768_cosine_map@100": 0.6874162730700494,
294
+ "eval_dim_768_cosine_mrr@10": 0.6835271317829454,
295
+ "eval_dim_768_cosine_ndcg@10": 0.733110755438693,
296
+ "eval_dim_768_cosine_precision@1": 0.5784883720930233,
297
+ "eval_dim_768_cosine_precision@10": 0.08880813953488371,
298
+ "eval_dim_768_cosine_precision@3": 0.253391472868217,
299
+ "eval_dim_768_cosine_precision@5": 0.163953488372093,
300
+ "eval_dim_768_cosine_recall@1": 0.5784883720930233,
301
+ "eval_dim_768_cosine_recall@10": 0.8880813953488372,
302
+ "eval_dim_768_cosine_recall@3": 0.7601744186046512,
303
+ "eval_dim_768_cosine_recall@5": 0.8197674418604651,
304
+ "eval_runtime": 661.4637,
305
+ "eval_samples_per_second": 0.0,
306
+ "eval_sequential_score": 0.5476937013127858,
307
+ "eval_steps_per_second": 0.0,
308
+ "step": 98
309
+ }
310
+ ],
311
+ "logging_steps": 5,
312
+ "max_steps": 144,
313
+ "num_input_tokens_seen": 0,
314
+ "num_train_epochs": 3,
315
+ "save_steps": 500,
316
+ "stateful_callbacks": {
317
+ "TrainerControl": {
318
+ "args": {
319
+ "should_epoch_stop": false,
320
+ "should_evaluate": false,
321
+ "should_log": false,
322
+ "should_save": true,
323
+ "should_training_stop": false
324
+ },
325
+ "attributes": {}
326
+ }
327
+ },
328
+ "total_flos": 0.0,
329
+ "train_batch_size": 16,
330
+ "trial_name": null,
331
+ "trial_params": null
332
+ }
checkpoint-98/training_args.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:798de09171a4a7f8ef78d9702b3e30b782fa362715033f3af56b90923c847a70
3
+ size 5624
config.json ADDED
@@ -0,0 +1,47 @@
1
+ {
2
+ "_name_or_path": "nomic-ai/modernbert-embed-base",
3
+ "architectures": [
4
+ "ModernBertModel"
5
+ ],
6
+ "attention_bias": false,
7
+ "attention_dropout": 0.0,
8
+ "bos_token_id": 50281,
9
+ "classifier_activation": "gelu",
10
+ "classifier_bias": false,
11
+ "classifier_dropout": 0.0,
12
+ "classifier_pooling": "mean",
13
+ "cls_token_id": 50281,
14
+ "decoder_bias": true,
15
+ "deterministic_flash_attn": false,
16
+ "embedding_dropout": 0.0,
17
+ "eos_token_id": 50282,
18
+ "global_attn_every_n_layers": 3,
19
+ "global_rope_theta": 160000.0,
20
+ "gradient_checkpointing": false,
21
+ "hidden_activation": "gelu",
22
+ "hidden_size": 768,
23
+ "initializer_cutoff_factor": 2.0,
24
+ "initializer_range": 0.02,
25
+ "intermediate_size": 1152,
26
+ "layer_norm_eps": 1e-05,
27
+ "local_attention": 128,
28
+ "local_rope_theta": 10000.0,
29
+ "max_position_embeddings": 8192,
30
+ "mlp_bias": false,
31
+ "mlp_dropout": 0.0,
32
+ "model_type": "modernbert",
33
+ "norm_bias": false,
34
+ "norm_eps": 1e-05,
35
+ "num_attention_heads": 12,
36
+ "num_hidden_layers": 22,
37
+ "pad_token_id": 50283,
38
+ "position_embedding_type": "absolute",
39
+ "reference_compile": false,
40
+ "repad_logits_with_grad": false,
41
+ "sep_token_id": 50282,
42
+ "sparse_pred_ignore_index": -100,
43
+ "sparse_prediction": false,
44
+ "torch_dtype": "float32",
45
+ "transformers_version": "4.48.1",
46
+ "vocab_size": 50368
47
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.4.0",
+     "transformers": "4.48.1",
+     "pytorch": "2.6.0+cu126"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
eval/Information-Retrieval_evaluation_dim_128_results.csv ADDED
@@ -0,0 +1,4 @@
+ epoch,steps,cosine-Accuracy@1,cosine-Accuracy@3,cosine-Accuracy@5,cosine-Accuracy@10,cosine-Precision@1,cosine-Recall@1,cosine-Precision@3,cosine-Recall@3,cosine-Precision@5,cosine-Recall@5,cosine-Precision@10,cosine-Recall@10,cosine-MRR@10,cosine-NDCG@10,cosine-MAP@100
+ 1.0,49,0.47674418604651164,0.6584302325581395,0.7151162790697675,0.7979651162790697,0.47674418604651164,0.47674418604651164,0.2194767441860465,0.6584302325581395,0.14302325581395348,0.7151162790697675,0.07979651162790696,0.7979651162790697,0.5821745801033589,0.6341456264396685,0.5882659880635713
+ 2.0,98,0.498546511627907,0.6700581395348837,0.7369186046511628,0.8226744186046512,0.498546511627907,0.498546511627907,0.22335271317829455,0.6700581395348837,0.14738372093023255,0.7369186046511628,0.08226744186046511,0.8226744186046512,0.6045300387596899,0.6569451636973174,0.6099407824679366
+ 2.950904392764858,144,0.49709302325581395,0.6758720930232558,0.7354651162790697,0.8241279069767442,0.49709302325581395,0.49709302325581395,0.22529069767441862,0.6758720930232558,0.14709302325581394,0.7354651162790697,0.08241279069767442,0.8241279069767442,0.6037779162052417,0.6567813216281579,0.6090388181529673
eval/Information-Retrieval_evaluation_dim_256_results.csv ADDED
@@ -0,0 +1,4 @@
+ epoch,steps,cosine-Accuracy@1,cosine-Accuracy@3,cosine-Accuracy@5,cosine-Accuracy@10,cosine-Precision@1,cosine-Recall@1,cosine-Precision@3,cosine-Recall@3,cosine-Precision@5,cosine-Recall@5,cosine-Precision@10,cosine-Recall@10,cosine-MRR@10,cosine-NDCG@10,cosine-MAP@100
+ 1.0,49,0.5392441860465116,0.7136627906976745,0.7645348837209303,0.8473837209302325,0.5392441860465116,0.5392441860465116,0.23788759689922478,0.7136627906976745,0.15290697674418602,0.7645348837209303,0.08473837209302325,0.8473837209302325,0.640224137135474,0.6901657805517794,0.6454694833433305
+ 2.0,98,0.5552325581395349,0.7354651162790697,0.7950581395348837,0.8604651162790697,0.5552325581395349,0.5552325581395349,0.24515503875968994,0.7354651162790697,0.15901162790697673,0.7950581395348837,0.08604651162790695,0.8604651162790697,0.6584688768918417,0.7073914263638542,0.6631022921972395
+ 2.950904392764858,144,0.5552325581395349,0.7281976744186046,0.7921511627906976,0.8619186046511628,0.5552325581395349,0.5552325581395349,0.24273255813953487,0.7281976744186046,0.15843023255813954,0.7921511627906976,0.08619186046511627,0.8619186046511628,0.6585646225544481,0.7077790398550751,0.6630890497309057
eval/Information-Retrieval_evaluation_dim_512_results.csv ADDED
@@ -0,0 +1,4 @@
+ epoch,steps,cosine-Accuracy@1,cosine-Accuracy@3,cosine-Accuracy@5,cosine-Accuracy@10,cosine-Precision@1,cosine-Recall@1,cosine-Precision@3,cosine-Recall@3,cosine-Precision@5,cosine-Recall@5,cosine-Precision@10,cosine-Recall@10,cosine-MRR@10,cosine-NDCG@10,cosine-MAP@100
+ 1.0,49,0.5450581395348837,0.7412790697674418,0.7994186046511628,0.8808139534883721,0.5450581395348837,0.5450581395348837,0.24709302325581395,0.7412790697674418,0.15988372093023254,0.7994186046511628,0.0880813953488372,0.8808139534883721,0.6569952011812478,0.7109774036246727,0.6607688904855182
+ 2.0,98,0.5770348837209303,0.7616279069767442,0.8197674418604651,0.8822674418604651,0.5770348837209303,0.5770348837209303,0.25387596899224807,0.7616279069767442,0.163953488372093,0.8197674418604651,0.0882267441860465,0.8822674418604651,0.6816127953119231,0.7303694393079266,0.685728403908298
+ 2.950904392764858,144,0.5741279069767442,0.7630813953488372,0.8212209302325582,0.875,0.5741279069767442,0.5741279069767442,0.2543604651162791,0.7630813953488372,0.16424418604651161,0.8212209302325582,0.0875,0.875,0.6782132475083055,0.726227401269234,0.6827936993080407
eval/Information-Retrieval_evaluation_dim_64_results.csv ADDED
@@ -0,0 +1,4 @@
+ epoch,steps,cosine-Accuracy@1,cosine-Accuracy@3,cosine-Accuracy@5,cosine-Accuracy@10,cosine-Precision@1,cosine-Recall@1,cosine-Precision@3,cosine-Recall@3,cosine-Precision@5,cosine-Recall@5,cosine-Precision@10,cosine-Recall@10,cosine-MRR@10,cosine-NDCG@10,cosine-MAP@100
+ 1.0,49,0.37790697674418605,0.5421511627906976,0.5915697674418605,0.6976744186046512,0.37790697674418605,0.37790697674418605,0.18071705426356585,0.5421511627906976,0.11831395348837208,0.5915697674418605,0.06976744186046512,0.6976744186046512,0.4751395810262089,0.5281507062500577,0.48337713890549217
+ 2.0,98,0.39680232558139533,0.5523255813953488,0.626453488372093,0.7209302325581395,0.39680232558139533,0.39680232558139533,0.18410852713178294,0.5523255813953488,0.12529069767441858,0.626453488372093,0.07209302325581395,0.7209302325581395,0.4936317598744924,0.5476937013127858,0.5014802548028123
+ 2.950904392764858,144,0.39680232558139533,0.5581395348837209,0.622093023255814,0.7252906976744186,0.39680232558139533,0.39680232558139533,0.18604651162790695,0.5581395348837209,0.12441860465116278,0.622093023255814,0.07252906976744186,0.7252906976744186,0.497020348837209,0.5513541983050395,0.5050183064129367
eval/Information-Retrieval_evaluation_dim_768_results.csv ADDED
@@ -0,0 +1,4 @@
+ epoch,steps,cosine-Accuracy@1,cosine-Accuracy@3,cosine-Accuracy@5,cosine-Accuracy@10,cosine-Precision@1,cosine-Recall@1,cosine-Precision@3,cosine-Recall@3,cosine-Precision@5,cosine-Recall@5,cosine-Precision@10,cosine-Recall@10,cosine-MRR@10,cosine-NDCG@10,cosine-MAP@100
+ 1.0,49,0.5508720930232558,0.7427325581395349,0.8037790697674418,0.876453488372093,0.5508720930232558,0.5508720930232558,0.24757751937984498,0.7427325581395349,0.16075581395348837,0.8037790697674418,0.0876453488372093,0.876453488372093,0.6611186092654114,0.7133959457013295,0.6655205749696947
+ 2.0,98,0.5784883720930233,0.7601744186046512,0.8197674418604651,0.8880813953488372,0.5784883720930233,0.5784883720930233,0.253391472868217,0.7601744186046512,0.163953488372093,0.8197674418604651,0.08880813953488371,0.8880813953488372,0.6835271317829454,0.733110755438693,0.6874162730700494
+ 2.950904392764858,144,0.5741279069767442,0.7616279069767442,0.8197674418604651,0.8851744186046512,0.5741279069767442,0.5741279069767442,0.25387596899224807,0.7616279069767442,0.163953488372093,0.8197674418604651,0.0885174418604651,0.8851744186046512,0.6812459625322997,0.7308126785084815,0.6852483059452662
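In these result files Accuracy@k and Recall@k carry identical values, and Precision@k appears to equal Recall@k divided by k. That pattern is what one would expect when each query has exactly one relevant document; this is an assumption, since the evaluation dataset itself is not part of this diff. A minimal sketch that checks the relationship on the step-98 row of the 768-dimension file (the `row` dictionary and the helper name are illustrative, not part of the repository):

```python
import math

# Selected columns of the step-98 row from
# eval/Information-Retrieval_evaluation_dim_768_results.csv.
row = {
    "cosine-Recall@1": 0.5784883720930233,
    "cosine-Precision@1": 0.5784883720930233,
    "cosine-Recall@3": 0.7601744186046512,
    "cosine-Precision@3": 0.253391472868217,
    "cosine-Recall@5": 0.8197674418604651,
    "cosine-Precision@5": 0.163953488372093,
    "cosine-Recall@10": 0.8880813953488372,
    "cosine-Precision@10": 0.08880813953488371,
}


def precision_matches_recall(metrics, ks=(1, 3, 5, 10), tol=1e-9):
    """Check Precision@k == Recall@k / k for every cutoff k.

    This only holds when each query has a single relevant document
    (an assumption about the evaluation data, not a confirmed fact).
    """
    return all(
        math.isclose(
            metrics[f"cosine-Precision@{k}"],
            metrics[f"cosine-Recall@{k}"] / k,
            abs_tol=tol,
        )
        for k in ks
    )


consistent = precision_matches_recall(row)
print("Precision@k == Recall@k / k:", consistent)
```

If the check fails on other data, the likely explanation is that some queries have more than one relevant document, in which case the three metric families carry independent information.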
metrics_comparison.txt ADDED
@@ -0,0 +1,82 @@
+ Model Performance Metrics Comparison
+ ====================================
+
+ Baseline Performance
+ --------------------
+ Metric       │  768  │  512  │  256  │  128  │  64
+ ─────────────┼───────┼───────┼───────┼───────┼──────
+ ndcg@10      │ 0.548 │ 0.537 │ 0.513 │ 0.469 │ 0.379
+ mrr@10       │ 0.503 │ 0.490 │ 0.469 │ 0.423 │ 0.337
+ map@100      │ 0.512 │ 0.499 │ 0.478 │ 0.431 │ 0.347
+ accuracy@1   │ 0.416 │ 0.403 │ 0.384 │ 0.336 │ 0.262
+ accuracy@3   │ 0.561 │ 0.551 │ 0.517 │ 0.471 │ 0.384
+ accuracy@5   │ 0.618 │ 0.608 │ 0.576 │ 0.532 │ 0.432
+ accuracy@10  │ 0.690 │ 0.686 │ 0.654 │ 0.616 │ 0.515
+ precision@1  │ 0.416 │ 0.403 │ 0.384 │ 0.336 │ 0.262
+ precision@3  │ 0.187 │ 0.184 │ 0.172 │ 0.157 │ 0.128
+ precision@5  │ 0.124 │ 0.122 │ 0.115 │ 0.106 │ 0.086
+ precision@10 │ 0.069 │ 0.069 │ 0.065 │ 0.062 │ 0.051
+ recall@1     │ 0.416 │ 0.403 │ 0.384 │ 0.336 │ 0.262
+ recall@3     │ 0.561 │ 0.551 │ 0.517 │ 0.471 │ 0.384
+ recall@5     │ 0.618 │ 0.608 │ 0.576 │ 0.532 │ 0.432
+ recall@10    │ 0.690 │ 0.686 │ 0.654 │ 0.616 │ 0.515
+
+ Fine-Tuned Performance
+ ----------------------
+ Metric       │  768  │  512  │  256  │  128  │  64
+ ─────────────┼───────┼───────┼───────┼───────┼──────
+ ndcg@10      │ 0.732 │ 0.730 │ 0.707 │ 0.657 │ 0.548
+ mrr@10       │ 0.682 │ 0.681 │ 0.658 │ 0.605 │ 0.494
+ map@100      │ 0.686 │ 0.685 │ 0.663 │ 0.610 │ 0.502
+ accuracy@1   │ 0.576 │ 0.577 │ 0.555 │ 0.499 │ 0.397
+ accuracy@3   │ 0.760 │ 0.760 │ 0.735 │ 0.673 │ 0.552
+ accuracy@5   │ 0.820 │ 0.820 │ 0.795 │ 0.737 │ 0.626
+ accuracy@10  │ 0.887 │ 0.882 │ 0.860 │ 0.823 │ 0.721
+ precision@1  │ 0.576 │ 0.577 │ 0.555 │ 0.499 │ 0.397
+ precision@3  │ 0.253 │ 0.253 │ 0.245 │ 0.224 │ 0.184
+ precision@5  │ 0.164 │ 0.164 │ 0.159 │ 0.147 │ 0.125
+ precision@10 │ 0.089 │ 0.088 │ 0.086 │ 0.082 │ 0.072
+ recall@1     │ 0.576 │ 0.577 │ 0.555 │ 0.499 │ 0.397
+ recall@3     │ 0.760 │ 0.760 │ 0.735 │ 0.673 │ 0.552
+ recall@5     │ 0.820 │ 0.820 │ 0.795 │ 0.737 │ 0.626
+ recall@10    │ 0.887 │ 0.882 │ 0.860 │ 0.823 │ 0.721
+
+ Absolute Changes (Δ)
+ --------------------
+ Metric       │  768   │  512   │  256   │  128   │  64
+ ─────────────┼────────┼────────┼────────┼────────┼───────
+ ndcg@10      │ +0.184 │ +0.193 │ +0.194 │ +0.188 │ +0.168
+ mrr@10       │ +0.179 │ +0.191 │ +0.190 │ +0.182 │ +0.156
+ map@100      │ +0.174 │ +0.187 │ +0.185 │ +0.179 │ +0.155
+ accuracy@1   │ +0.160 │ +0.174 │ +0.172 │ +0.163 │ +0.135
+ accuracy@3   │ +0.199 │ +0.209 │ +0.218 │ +0.202 │ +0.169
+ accuracy@5   │ +0.202 │ +0.212 │ +0.219 │ +0.205 │ +0.195
+ accuracy@10  │ +0.196 │ +0.196 │ +0.206 │ +0.206 │ +0.206
+ precision@1  │ +0.160 │ +0.174 │ +0.172 │ +0.163 │ +0.135
+ precision@3  │ +0.066 │ +0.070 │ +0.073 │ +0.067 │ +0.056
+ precision@5  │ +0.040 │ +0.042 │ +0.044 │ +0.041 │ +0.039
+ precision@10 │ +0.020 │ +0.020 │ +0.021 │ +0.021 │ +0.021
+ recall@1     │ +0.160 │ +0.174 │ +0.172 │ +0.163 │ +0.135
+ recall@3     │ +0.199 │ +0.209 │ +0.218 │ +0.202 │ +0.169
+ recall@5     │ +0.202 │ +0.212 │ +0.219 │ +0.205 │ +0.195
+ recall@10    │ +0.196 │ +0.196 │ +0.206 │ +0.206 │ +0.206
+
+ Percentage Changes
+ ------------------
+ Metric       │  768   │  512   │  256   │  128   │  64
+ ─────────────┼────────┼────────┼────────┼────────┼───────
+ ndcg@10      │ +33.5% │ +35.9% │ +37.9% │ +40.2% │ +44.4%
+ mrr@10       │ +35.5% │ +39.0% │ +40.5% │ +43.1% │ +46.3%
+ map@100      │ +34.0% │ +37.5% │ +38.8% │ +41.5% │ +44.6%
+ accuracy@1   │ +38.5% │ +43.3% │ +44.7% │ +48.5% │ +51.7%
+ accuracy@3   │ +35.5% │ +38.0% │ +42.1% │ +42.9% │ +43.9%
+ accuracy@5   │ +32.7% │ +34.9% │ +38.1% │ +38.5% │ +45.1%
+ accuracy@10  │ +28.4% │ +28.6% │ +31.6% │ +33.5% │ +40.1%
+ precision@1  │ +38.5% │ +43.3% │ +44.7% │ +48.5% │ +51.7%
+ precision@3  │ +35.5% │ +38.0% │ +42.1% │ +42.9% │ +43.9%
+ precision@5  │ +32.7% │ +34.9% │ +38.1% │ +38.5% │ +45.1%
+ precision@10 │ +28.4% │ +28.6% │ +31.6% │ +33.5% │ +40.1%
+ recall@1     │ +38.5% │ +43.3% │ +44.7% │ +48.5% │ +51.7%
+ recall@3     │ +35.5% │ +38.0% │ +42.1% │ +42.9% │ +43.9%
+ recall@5     │ +32.7% │ +34.9% │ +38.1% │ +38.5% │ +45.1%
+ recall@10    │ +28.4% │ +28.6% │ +31.6% │ +33.5% │ +40.1%
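The absolute and percentage changes in metrics_comparison.txt follow from the baseline and fine-tuned tables as `fine_tuned - baseline` and `100 * (fine_tuned - baseline) / baseline`. The sketch below recomputes them for ndcg@10; the dictionaries and the `changes` helper are illustrative names, not code from the repository. Because the inputs are the three-decimal values printed in the file, the recomputed figures can differ from the file's in the last digit, presumably because the file was generated from unrounded scores.

```python
# ndcg@10 per embedding dimension, copied from metrics_comparison.txt
# (the file prints values rounded to three decimals).
baseline = {768: 0.548, 512: 0.537, 256: 0.513, 128: 0.469, 64: 0.379}
fine_tuned = {768: 0.732, 512: 0.730, 256: 0.707, 128: 0.657, 64: 0.548}


def changes(before, after):
    """Return {dim: (absolute change, percentage change)} for each dimension."""
    result = {}
    for dim, old in before.items():
        delta = after[dim] - old
        result[dim] = (delta, 100.0 * delta / old)
    return result


ndcg_changes = changes(baseline, fine_tuned)
for dim, (delta, pct) in ndcg_changes.items():
    print(f"dim {dim:>3}: {delta:+.3f} ({pct:+.1f}%)")
```

The relative gain grows as the dimension shrinks even though the absolute gain stays roughly flat, because the baseline in the denominator is smaller at low dimensions.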
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:82a95c0ece1c727d6626a1c60751b2639dd6d1e394fbec8c54720bc393d18192
+ size 596070136
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
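modules.json ends the pipeline with a Normalize step, and the evaluation files report cosine metrics at 768, 512, 256, 128 and 64 dimensions. One plausible reading, stated here as an assumption rather than something this diff confirms, is that the smaller sizes are obtained by keeping the leading components of the 768-dimensional embedding and normalizing again. A minimal pure-Python sketch of that idea, with toy vectors and hypothetical helper names:

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


# Toy 4-dimensional "embeddings"; real ones would have 768 components.
query = [0.6, 0.8, 0.0, 0.0]
doc = [0.6, 0.0, 0.8, 0.0]

full_score = cosine(query, doc)  # both inputs are already unit length
q2 = truncate_and_normalize(query, 2)
d2 = truncate_and_normalize(doc, 2)
score = cosine(q2, d2)
print(f"full: {full_score:.2f}, truncated to 2 dims: {score:.2f}")
```

Re-normalizing after truncation matters because a truncated unit vector is generally shorter than unit length, so a raw dot product would no longer be a cosine similarity.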
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,945 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "|||IP_ADDRESS|||",
5
+ "lstrip": false,
6
+ "normalized": true,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": false
10
+ },
11
+ "1": {
12
+ "content": "<|padding|>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "50254": {
20
+ "content": " ",
21
+ "lstrip": false,
22
+ "normalized": true,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": false
26
+ },
27
+ "50255": {
28
+ "content": " ",
29
+ "lstrip": false,
30
+ "normalized": true,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": false
34
+ },
35
+ "50256": {
36
+ "content": " ",
37
+ "lstrip": false,
38
+ "normalized": true,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": false
42
+ },
43
+ "50257": {
44
+ "content": " ",
45
+ "lstrip": false,
46
+ "normalized": true,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": false
50
+ },
51
+ "50258": {
52
+ "content": " ",
53
+ "lstrip": false,
54
+ "normalized": true,
55
+ "rstrip": false,
56
+ "single_word": false,
57
+ "special": false
58
+ },
59
+ "50259": {
60
+ "content": " ",
61
+ "lstrip": false,
62
+ "normalized": true,
63
+ "rstrip": false,
64
+ "single_word": false,
65
+ "special": false
66
+ },
67
+ "50260": {
68
+ "content": " ",
69
+ "lstrip": false,
70
+ "normalized": true,
71
+ "rstrip": false,
72
+ "single_word": false,
73
+ "special": false
74
+ },
75
+ "50261": {
76
+ "content": " ",
77
+ "lstrip": false,
78
+ "normalized": true,
79
+ "rstrip": false,
80
+ "single_word": false,
81
+ "special": false
82
+ },
83
+ "50262": {
84
+ "content": " ",
85
+ "lstrip": false,
86
+ "normalized": true,
87
+ "rstrip": false,
88
+ "single_word": false,
89
+ "special": false
90
+ },
91
+ "50263": {
92
+ "content": " ",
93
+ "lstrip": false,
94
+ "normalized": true,
95
+ "rstrip": false,
96
+ "single_word": false,
97
+ "special": false
98
+ },
99
+ "50264": {
100
+ "content": " ",
101
+ "lstrip": false,
102
+ "normalized": true,
103
+ "rstrip": false,
104
+ "single_word": false,
105
+ "special": false
106
+ },
107
+ "50265": {
108
+ "content": " ",
109
+ "lstrip": false,
110
+ "normalized": true,
111
+ "rstrip": false,
112
+ "single_word": false,
113
+ "special": false
114
+ },
115
+ "50266": {
116
+ "content": " ",
117
+ "lstrip": false,
118
+ "normalized": true,
119
+ "rstrip": false,
120
+ "single_word": false,
121
+ "special": false
122
+ },
123
+ "50267": {
124
+ "content": " ",
125
+ "lstrip": false,
126
+ "normalized": true,
127
+ "rstrip": false,
128
+ "single_word": false,
129
+ "special": false
130
+ },
131
+ "50268": {
132
+ "content": " ",
133
+ "lstrip": false,
134
+ "normalized": true,
135
+ "rstrip": false,
136
+ "single_word": false,
137
+ "special": false
138
+ },
139
+ "50269": {
140
+ "content": " ",
141
+ "lstrip": false,
142
+ "normalized": true,
143
+ "rstrip": false,
144
+ "single_word": false,
145
+ "special": false
146
+ },
147
+ "50270": {
148
+ "content": " ",
149
+ "lstrip": false,
150
+ "normalized": true,
151
+ "rstrip": false,
152
+ "single_word": false,
153
+ "special": false
154
+ },
155
+ "50271": {
156
+ "content": " ",
157
+ "lstrip": false,
158
+ "normalized": true,
159
+ "rstrip": false,
160
+ "single_word": false,
161
+ "special": false
162
+ },
163
+ "50272": {
164
+ "content": " ",
165
+ "lstrip": false,
166
+ "normalized": true,
167
+ "rstrip": false,
168
+ "single_word": false,
169
+ "special": false
170
+ },
171
+ "50273": {
172
+ "content": " ",
173
+ "lstrip": false,
174
+ "normalized": true,
175
+ "rstrip": false,
176
+ "single_word": false,
177
+ "special": false
178
+ },
179
+ "50274": {
180
+ "content": " ",
181
+ "lstrip": false,
182
+ "normalized": true,
183
+ "rstrip": false,
184
+ "single_word": false,
185
+ "special": false
186
+ },
187
+ "50275": {
188
+ "content": " ",
189
+ "lstrip": false,
190
+ "normalized": true,
191
+ "rstrip": false,
192
+ "single_word": false,
193
+ "special": false
194
+ },
195
+ "50276": {
196
+ "content": " ",
197
+ "lstrip": false,
198
+ "normalized": true,
199
+ "rstrip": false,
200
+ "single_word": false,
201
+ "special": false
202
+ },
203
+ "50277": {
204
+ "content": "|||EMAIL_ADDRESS|||",
205
+ "lstrip": false,
206
+ "normalized": true,
207
+ "rstrip": false,
208
+ "single_word": false,
209
+ "special": false
210
+ },
211
+ "50278": {
212
+ "content": "|||PHONE_NUMBER|||",
213
+ "lstrip": false,
214
+ "normalized": true,
215
+ "rstrip": false,
216
+ "single_word": false,
217
+ "special": false
218
+ },
219
+ "50279": {
220
+ "content": "<|endoftext|>",
221
+ "lstrip": false,
222
+ "normalized": false,
223
+ "rstrip": false,
224
+ "single_word": false,
225
+ "special": true
226
+ },
227
+ "50280": {
228
+ "content": "[UNK]",
229
+ "lstrip": false,
230
+ "normalized": false,
231
+ "rstrip": false,
232
+ "single_word": false,
233
+ "special": true
234
+ },
235
+ "50281": {
236
+ "content": "[CLS]",
237
+ "lstrip": false,
238
+ "normalized": false,
239
+ "rstrip": false,
240
+ "single_word": false,
241
+ "special": true
242
+ },
243
+ "50282": {
244
+ "content": "[SEP]",
245
+ "lstrip": false,
246
+ "normalized": false,
247
+ "rstrip": false,
248
+ "single_word": false,
249
+ "special": true
250
+ },
251
+ "50283": {
252
+ "content": "[PAD]",
253
+ "lstrip": false,
254
+ "normalized": false,
255
+ "rstrip": false,
256
+ "single_word": false,
257
+ "special": true
258
+ },
259
+ "50284": {
260
+ "content": "[MASK]",
261
+ "lstrip": true,
262
+ "normalized": false,
263
+ "rstrip": false,
264
+ "single_word": false,
265
+ "special": true
266
+ },
267
+ "50285": {
268
+ "content": "[unused0]",
269
+ "lstrip": false,
270
+ "normalized": true,
271
+ "rstrip": false,
272
+ "single_word": false,
273
+ "special": false
274
+ },
275
+ "50286": {
276
+ "content": "[unused1]",
277
+ "lstrip": false,
278
+ "normalized": true,
279
+ "rstrip": false,
280
+ "single_word": false,
281
+ "special": false
282
+ },
283
+ "50287": {
284
+ "content": "[unused2]",
285
+ "lstrip": false,
286
+ "normalized": true,
287
+ "rstrip": false,
288
+ "single_word": false,
289
+ "special": false
290
+ },
291
+ "50288": {
292
+ "content": "[unused3]",
293
+ "lstrip": false,
294
+ "normalized": true,
295
+ "rstrip": false,
296
+ "single_word": false,
297
+ "special": false
298
+ },
299
+ "50289": {
300
+ "content": "[unused4]",
301
+ "lstrip": false,
302
+ "normalized": true,
303
+ "rstrip": false,
304
+ "single_word": false,
305
+ "special": false
306
+ },
307
+ "50290": {
308
+ "content": "[unused5]",
309
+ "lstrip": false,
310
+ "normalized": true,
311
+ "rstrip": false,
312
+ "single_word": false,
313
+ "special": false
314
+ },
315
+ "50291": {
316
+ "content": "[unused6]",
317
+ "lstrip": false,
318
+ "normalized": true,
319
+ "rstrip": false,
320
+ "single_word": false,
321
+ "special": false
322
+ },
323
+ "50292": {
324
+ "content": "[unused7]",
325
+ "lstrip": false,
326
+ "normalized": true,
327
+ "rstrip": false,
328
+ "single_word": false,
329
+ "special": false
330
+ },
331
+ "50293": {
332
+ "content": "[unused8]",
333
+ "lstrip": false,
334
+ "normalized": true,
335
+ "rstrip": false,
336
+ "single_word": false,
337
+ "special": false
338
+ },
339
+ "50294": {
340
+ "content": "[unused9]",
341
+ "lstrip": false,
342
+ "normalized": true,
343
+ "rstrip": false,
344
+ "single_word": false,
345
+ "special": false
346
+ },
347
+ "50295": {
348
+ "content": "[unused10]",
349
+ "lstrip": false,
350
+ "normalized": true,
351
+ "rstrip": false,
352
+ "single_word": false,
353
+ "special": false
354
+ },
355
+ "50296": {
356
+ "content": "[unused11]",
357
+ "lstrip": false,
358
+ "normalized": true,
359
+ "rstrip": false,
360
+ "single_word": false,
361
+ "special": false
362
+ },
363
+ "50297": {
364
+ "content": "[unused12]",
365
+ "lstrip": false,
366
+ "normalized": true,
367
+ "rstrip": false,
368
+ "single_word": false,
369
+ "special": false
370
+ },
371
+ "50298": {
372
+ "content": "[unused13]",
373
+ "lstrip": false,
374
+ "normalized": true,
375
+ "rstrip": false,
376
+ "single_word": false,
377
+ "special": false
378
+ },
379
+ "50299": {
380
+ "content": "[unused14]",
381
+ "lstrip": false,
382
+ "normalized": true,
383
+ "rstrip": false,
384
+ "single_word": false,
385
+ "special": false
386
+ },
387
+ "50300": {
388
+ "content": "[unused15]",
389
+ "lstrip": false,
390
+ "normalized": true,
391
+ "rstrip": false,
392
+ "single_word": false,
393
+ "special": false
394
+ },
395
+ "50301": {
396
+ "content": "[unused16]",
397
+ "lstrip": false,
398
+ "normalized": true,
399
+ "rstrip": false,
400
+ "single_word": false,
401
+ "special": false
402
+ },
403
+ "50302": {
404
+ "content": "[unused17]",
405
+ "lstrip": false,
406
+ "normalized": true,
407
+ "rstrip": false,
408
+ "single_word": false,
409
+ "special": false
410
+ },
411
+ "50303": {
412
+ "content": "[unused18]",
413
+ "lstrip": false,
414
+ "normalized": true,
415
+ "rstrip": false,
416
+ "single_word": false,
417
+ "special": false
418
+ },
419
+ "50304": {
420
+ "content": "[unused19]",
421
+ "lstrip": false,
422
+ "normalized": true,
423
+ "rstrip": false,
424
+ "single_word": false,
425
+ "special": false
426
+ },
427
+ "50305": {
428
+ "content": "[unused20]",
429
+ "lstrip": false,
430
+ "normalized": true,
431
+ "rstrip": false,
432
+ "single_word": false,
433
+ "special": false
434
+ },
435
+ "50306": {
436
+ "content": "[unused21]",
437
+ "lstrip": false,
438
+ "normalized": true,
439
+ "rstrip": false,
440
+ "single_word": false,
441
+ "special": false
442
+ },
443
+ "50307": {
444
+ "content": "[unused22]",
445
+ "lstrip": false,
446
+ "normalized": true,
447
+ "rstrip": false,
448
+ "single_word": false,
449
+ "special": false
450
+ },
451
+ "50308": {
452
+ "content": "[unused23]",
453
+ "lstrip": false,
454
+ "normalized": true,
455
+ "rstrip": false,
456
+ "single_word": false,
457
+ "special": false
458
+ },
459
+ "50309": {
460
+ "content": "[unused24]",
461
+ "lstrip": false,
462
+ "normalized": true,
463
+ "rstrip": false,
464
+ "single_word": false,
465
+ "special": false
466
+ },
467
+ "50310": {
468
+ "content": "[unused25]",
469
+ "lstrip": false,
470
+ "normalized": true,
471
+ "rstrip": false,
472
+ "single_word": false,
473
+ "special": false
474
+ },
475
+ "50311": {
476
+ "content": "[unused26]",
477
+ "lstrip": false,
478
+ "normalized": true,
479
+ "rstrip": false,
480
+ "single_word": false,
481
+ "special": false
482
+ },
483
+ "50312": {
484
+ "content": "[unused27]",
485
+ "lstrip": false,
486
+ "normalized": true,
487
+ "rstrip": false,
488
+ "single_word": false,
489
+ "special": false
490
+ },
491
+ "50313": {
492
+ "content": "[unused28]",
493
+ "lstrip": false,
494
+ "normalized": true,
495
+ "rstrip": false,
496
+ "single_word": false,
497
+ "special": false
498
+ },
499
+ "50314": {
500
+ "content": "[unused29]",
501
+ "lstrip": false,
502
+ "normalized": true,
503
+ "rstrip": false,
504
+ "single_word": false,
505
+ "special": false
506
+ },
507
+ "50315": {
508
+ "content": "[unused30]",
509
+ "lstrip": false,
510
+ "normalized": true,
511
+ "rstrip": false,
512
+ "single_word": false,
513
+ "special": false
514
+ },
515
+ "50316": {
516
+ "content": "[unused31]",
517
+ "lstrip": false,
518
+ "normalized": true,
519
+ "rstrip": false,
520
+ "single_word": false,
521
+ "special": false
522
+ },
523
+ "50317": {
524
+ "content": "[unused32]",
525
+ "lstrip": false,
526
+ "normalized": true,
527
+ "rstrip": false,
528
+ "single_word": false,
529
+ "special": false
530
+ },
531
+ "50318": {
532
+ "content": "[unused33]",
533
+ "lstrip": false,
534
+ "normalized": true,
535
+ "rstrip": false,
536
+ "single_word": false,
537
+ "special": false
538
+ },
539
+ "50319": {
540
+ "content": "[unused34]",
541
+ "lstrip": false,
542
+ "normalized": true,
543
+ "rstrip": false,
544
+ "single_word": false,
545
+ "special": false
546
+ },
547
+ "50320": {
548
+ "content": "[unused35]",
549
+ "lstrip": false,
550
+ "normalized": true,
551
+ "rstrip": false,
552
+ "single_word": false,
553
+ "special": false
554
+ },
555
+ "50321": {
556
+ "content": "[unused36]",
557
+ "lstrip": false,
558
+ "normalized": true,
559
+ "rstrip": false,
560
+ "single_word": false,
561
+ "special": false
562
+ },
563
+ "50322": {
564
+ "content": "[unused37]",
565
+ "lstrip": false,
566
+ "normalized": true,
567
+ "rstrip": false,
568
+ "single_word": false,
569
+ "special": false
570
+ },
571
+ "50323": {
572
+ "content": "[unused38]",
573
+ "lstrip": false,
574
+ "normalized": true,
575
+ "rstrip": false,
576
+ "single_word": false,
577
+ "special": false
578
+ },
579
+ "50324": {
580
+ "content": "[unused39]",
581
+ "lstrip": false,
582
+ "normalized": true,
583
+ "rstrip": false,
584
+ "single_word": false,
585
+ "special": false
586
+ },
587
+ "50325": {
588
+ "content": "[unused40]",
589
+ "lstrip": false,
590
+ "normalized": true,
591
+ "rstrip": false,
592
+ "single_word": false,
593
+ "special": false
594
+ },
595
+ "50326": {
596
+ "content": "[unused41]",
597
+ "lstrip": false,
598
+ "normalized": true,
599
+ "rstrip": false,
600
+ "single_word": false,
601
+ "special": false
602
+ },
603
+ "50327": {
604
+ "content": "[unused42]",
605
+ "lstrip": false,
606
+ "normalized": true,
607
+ "rstrip": false,
608
+ "single_word": false,
609
+ "special": false
610
+ },
611
+ "50328": {
612
+ "content": "[unused43]",
613
+ "lstrip": false,
614
+ "normalized": true,
615
+ "rstrip": false,
616
+ "single_word": false,
617
+ "special": false
618
+ },
619
+ "50329": {
620
+ "content": "[unused44]",
621
+ "lstrip": false,
622
+ "normalized": true,
623
+ "rstrip": false,
624
+ "single_word": false,
625
+ "special": false
626
+ },
627
+ "50330": {
628
+ "content": "[unused45]",
629
+ "lstrip": false,
630
+ "normalized": true,
631
+ "rstrip": false,
632
+ "single_word": false,
633
+ "special": false
634
+ },
635
+ "50331": {
636
+ "content": "[unused46]",
637
+ "lstrip": false,
638
+ "normalized": true,
639
+ "rstrip": false,
640
+ "single_word": false,
641
+ "special": false
642
+ },
643
+ "50332": {
644
+ "content": "[unused47]",
645
+ "lstrip": false,
646
+ "normalized": true,
647
+ "rstrip": false,
648
+ "single_word": false,
649
+ "special": false
650
+ },
651
+ "50333": {
652
+ "content": "[unused48]",
653
+ "lstrip": false,
654
+ "normalized": true,
655
+ "rstrip": false,
656
+ "single_word": false,
657
+ "special": false
658
+ },
659
+ "50334": {
660
+ "content": "[unused49]",
661
+ "lstrip": false,
662
+ "normalized": true,
663
+ "rstrip": false,
664
+ "single_word": false,
665
+ "special": false
666
+ },
667
+ "50335": {
668
+ "content": "[unused50]",
669
+ "lstrip": false,
670
+ "normalized": true,
671
+ "rstrip": false,
672
+ "single_word": false,
673
+ "special": false
674
+ },
675
+ "50336": {
676
+ "content": "[unused51]",
677
+ "lstrip": false,
678
+ "normalized": true,
679
+ "rstrip": false,
680
+ "single_word": false,
681
+ "special": false
682
+ },
683
+ "50337": {
684
+ "content": "[unused52]",
685
+ "lstrip": false,
686
+ "normalized": true,
687
+ "rstrip": false,
688
+ "single_word": false,
689
+ "special": false
690
+ },
691
+ "50338": {
692
+ "content": "[unused53]",
693
+ "lstrip": false,
694
+ "normalized": true,
695
+ "rstrip": false,
696
+ "single_word": false,
697
+ "special": false
698
+ },
699
+ "50339": {
700
+ "content": "[unused54]",
701
+ "lstrip": false,
702
+ "normalized": true,
703
+ "rstrip": false,
704
+ "single_word": false,
705
+ "special": false
706
+ },
707
+ "50340": {
708
+ "content": "[unused55]",
709
+ "lstrip": false,
710
+ "normalized": true,
711
+ "rstrip": false,
712
+ "single_word": false,
713
+ "special": false
714
+ },
715
+ "50341": {
716
+ "content": "[unused56]",
717
+ "lstrip": false,
718
+ "normalized": true,
719
+ "rstrip": false,
720
+ "single_word": false,
721
+ "special": false
722
+ },
+ "50342": {
+ "content": "[unused57]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50343": {
+ "content": "[unused58]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50344": {
+ "content": "[unused59]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50345": {
+ "content": "[unused60]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50346": {
+ "content": "[unused61]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50347": {
+ "content": "[unused62]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50348": {
+ "content": "[unused63]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50349": {
+ "content": "[unused64]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50350": {
+ "content": "[unused65]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50351": {
+ "content": "[unused66]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50352": {
+ "content": "[unused67]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50353": {
+ "content": "[unused68]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50354": {
+ "content": "[unused69]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50355": {
+ "content": "[unused70]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50356": {
+ "content": "[unused71]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50357": {
+ "content": "[unused72]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50358": {
+ "content": "[unused73]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50359": {
+ "content": "[unused74]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50360": {
+ "content": "[unused75]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50361": {
+ "content": "[unused76]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50362": {
+ "content": "[unused77]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50363": {
+ "content": "[unused78]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50364": {
+ "content": "[unused79]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50365": {
+ "content": "[unused80]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50366": {
+ "content": "[unused81]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50367": {
+ "content": "[unused82]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ }
+ },
+ "clean_up_tokenization_spaces": true,
+ "cls_token": "[CLS]",
+ "extra_special_tokens": {},
+ "mask_token": "[MASK]",
+ "model_input_names": [
+ "input_ids",
+ "attention_mask"
+ ],
+ "model_max_length": 8192,
+ "pad_token": "[PAD]",
+ "sep_token": "[SEP]",
+ "tokenizer_class": "PreTrainedTokenizerFast",
+ "unk_token": "[UNK]"
+ }
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:798de09171a4a7f8ef78d9702b3e30b782fa362715033f3af56b90923c847a70
+ size 5624