Update README.md

README.md CHANGED

@@ -28,7 +28,7 @@ pipeline_tag: text-generation
 
 ### Model Sources
 
-- **Paper:**
+- **Paper:** Rethinking Low-Resource MT: The Surprising Effectiveness of Fine-Tuned Multilingual Models in the LLM Age (Scalvini et al., NoDaLiDa 2025)
 ---
 
 ## Uses
@@ -110,7 +110,7 @@ for sentence in sentences:
 
 # Generate the output
 outputs = model.generate(**inputs,
-                         max_new_tokens=
+                         max_new_tokens=500,
                          eos_token_id=tokenizer.eos_token_id, # Ensure EOS token is used
                          pad_token_id=tokenizer.pad_token_id, # Ensure padding token is used
                          use_cache=True,
@@ -144,8 +144,7 @@ for sentence in sentences:
 
 ### Training Data
 
-We used the Sprotin parallel corpus for **English–Faroese** translation: [barbaroo/Sprotin_parallel](https://huggingface.co/datasets/barbaroo/Sprotin_parallel).
-
+We used the Sprotin parallel corpus for **English–Faroese** translation: [barbaroo/Sprotin_parallel](https://huggingface.co/datasets/barbaroo/Sprotin_parallel).
 
 ### Training Procedure
 
@@ -182,8 +181,8 @@ Human evaluation was also performed (see paper)
 
 
 ## Citation []
 
-[COMING SOON]
+Barbara Scalvini, Iben Nyholm Debess, Annika Simonsen, and Hafsteinn Einarsson. 2025. Rethinking Low-Resource MT: The Surprising Effectiveness of Fine-Tuned Multilingual Models in the LLM Age. In Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), pages 609–621, Tallinn, Estonia. University of Tartu Library.
 
 ---
 ## Framework versions