barbaroo committed
Commit e3b5328 · verified · 1 Parent(s): 17ddc46

Update README.md

Files changed (1): README.md +4 -5
README.md CHANGED
@@ -28,7 +28,7 @@ pipeline_tag: text-generation
 
 ### Model Sources
 
-- **Paper:** [COMING SOON]
+- **Paper:** Rethinking Low-Resource MT: The Surprising Effectiveness of Fine-Tuned Multilingual Models in the LLM Age (Scalvini et al., NoDaLiDa 2025)
 ---
 
 ## Uses
@@ -110,7 +110,7 @@ for sentence in sentences:
 
 # Generate the output
 outputs = model.generate(**inputs,
-max_new_tokens=2000,
+max_new_tokens=500,
 eos_token_id=tokenizer.eos_token_id, # Ensure EOS token is used
 pad_token_id=tokenizer.pad_token_id, # Ensure padding token is used
 use_cache=True,
@@ -144,8 +144,7 @@ for sentence in sentences:
 
 ### Training Data
 
-We used the Sprotin parallel corpus for **English–Faroese** translation: [barbaroo/Sprotin_parallel](https://huggingface.co/datasets/barbaroo/Sprotin_parallel).
-
+We used the Sprotin parallel corpus for **English–Faroese** translation: [barbaroo/Sprotin_parallel](https://huggingface.co/datasets/barbaroo/Sprotin_parallel).
 
 ### Training Procedure
 
@@ -182,8 +181,8 @@ Human evaluation was also performed (see paper)
 
 
 ## Citation []
+Barbara Scalvini, Iben Nyholm Debess, Annika Simonsen, and Hafsteinn Einarsson. 2025. Rethinking Low-Resource MT: The Surprising Effectiveness of Fine-Tuned Multilingual Models in the LLM Age. In Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), pages 609–621, Tallinn, Estonia. University of Tartu Library.
 
-[COMING SOON]
 
 ---
 ## Framework versions