stefan-it committed
Commit dbb2408 · 1 Parent(s): 2960832

readme: include number of training epochs

Files changed (1):
1. README.md +2 -1
README.md CHANGED
@@ -22,12 +22,13 @@ Preliminary Historic Multilingual and Monolingual ByT5 Models. Following languag
 
 More details can be found in [our GitHub repository](https://github.com/stefan-it/hmByT5).
 
-
 # Pretraining
 
 We use the official JAX/FLAX example in Hugging Face Transformers to pretrain a ByT5 model on a single v3-8 TPU.
 Details about the training can be found [here](https://github.com/stefan-it/hmByT5/tree/main/hmbyt5-flax).
 
+The model was trained for 0.5 epochs.
+
 # Evaluation on Downstream Tasks (NER)
 
 We evaluated the hmByT5 model on downstream tasks:
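
For context (not part of the commit itself), below is a minimal sketch of how a ByT5-style checkpoint such as this one is typically loaded with Hugging Face Transformers. The model id `google/byt5-small` is a stand-in, since the exact hmByT5 checkpoint name is not given in this diff:

```python
# Minimal sketch, assuming any ByT5-compatible checkpoint id.
# "google/byt5-small" is a placeholder, not the hmByT5 checkpoint from this repo.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "google/byt5-small"  # substitute the actual hmByT5 checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

# ByT5 tokenizes raw UTF-8 bytes, so historic orthography needs no custom
# vocabulary: each byte maps directly to a fixed token id.
inputs = tokenizer("Preliminary Historic Multilingual ByT5", return_tensors="pt")
print(inputs["input_ids"].shape)  # sequence length ≈ byte count + 1 (EOS token)
```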