Peramanathan/cv-qa-model

text2text-generation

Generated from Trainer

text-generation-inference


Peramanathan committed 13 days ago

Commit 1491e7d · verified · 1 Parent(s): e95784a

End of training

Files changed (1)

README.md +12 -17

@@ -1,7 +1,7 @@
 ---
 library_name: transformers
 license: apache-2.0
-base_model: google/flan-t5-small
 tags:
 - generated_from_trainer
 model-index:
@@ -14,9 +14,9 @@ should probably proofread and complete it, then remove this comment. -->
 # cv-qa-model
-This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 6.8723
 ## Model description
@@ -35,28 +35,23 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 1e-06
-- train_batch_size: 4
-- eval_batch_size: 4
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs: 10
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 10.0922       | 1.0   | 37   | 7.0720          |
-| 9.1794        | 2.0   | 74   | 7.0215          |
-| 5.6882        | 3.0   | 111  | 7.0107          |
-| 8.585         | 4.0   | 148  | 6.9715          |
-| 6.9991        | 5.0   | 185  | 6.9407          |
-| 10.3619       | 6.0   | 222  | 6.9142          |
-| 7.5525        | 7.0   | 259  | 6.8963          |
-| 7.8674        | 8.0   | 296  | 6.8830          |
-| 9.2344        | 9.0   | 333  | 6.8747          |
-| 7.0138        | 10.0  | 370  | 6.8723          |
 ### Framework versions

 ---
 library_name: transformers
 license: apache-2.0
+base_model: google/flan-t5-base
 tags:
 - generated_from_trainer
 model-index:
 # cv-qa-model
+This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 3.3571
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 2
+- eval_batch_size: 2
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
+- num_epochs: 5
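
Reviewer note: the updated card combines `learning_rate: 2e-05` with `lr_scheduler_type: linear`. The following is a minimal sketch of what that combination implies for the learning rate at each epoch boundary. It is not the Trainer's own code: `linear_lr` is a hypothetical helper, the warmup length is assumed to be zero because the lines shown here do not list one, and the total of 365 steps is taken from the final row of the updated results table.

```python
def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


BASE_LR = 2e-05        # learning_rate from the updated card
TOTAL_STEPS = 365      # last step in the updated results table (5 epochs x 73 steps)
STEPS_PER_EPOCH = 73   # step count after epoch 1.0 in the updated table

# Assumption: no warmup, since none is listed in the hyperparameters shown.
for epoch, step in enumerate(range(0, TOTAL_STEPS + 1, STEPS_PER_EPOCH)):
    lr = linear_lr(step, TOTAL_STEPS, BASE_LR)
    print(f"after epoch {epoch}: step {step:3d} -> lr {lr:.2e}")
```

Under these assumptions the rate falls by one fifth of its starting value per epoch and reaches zero at step 365, so the last epoch trains at a much smaller rate than the first.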
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 5.8487        | 1.0   | 73   | 4.9315          |
+| 9.4588        | 2.0   | 146  | 4.2238          |
+| 6.4005        | 3.0   | 219  | 3.7199          |
+| 5.6368        | 4.0   | 292  | 3.4483          |
+| 3.9503        | 5.0   | 365  | 3.3571          |
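
Reviewer note: the step counts in the two tables can be cross-checked against the batch sizes. The sketch below is a hedged back-of-the-envelope calculation, not something stated in the card: it assumes a single device, no gradient accumulation, and that the last partial batch is kept, so that steps per epoch equals `ceil(num_examples / train_batch_size)`. The helper name `dataset_size_bounds` is hypothetical.

```python
def dataset_size_bounds(steps_per_epoch, batch_size):
    """Range of dataset sizes N for which ceil(N / batch_size) == steps_per_epoch."""
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


# Old run: 37 steps per epoch at train_batch_size 4.
old = dataset_size_bounds(37, 4)
# New run: 73 steps per epoch at train_batch_size 2.
new = dataset_size_bounds(73, 2)
# Sizes compatible with both runs, if the training set did not change.
overlap = (max(old[0], new[0]), min(old[1], new[1]))

# Final validation losses reported by the two versions of the card.
old_loss, new_loss = 6.8723, 3.3571
relative_drop = (old_loss - new_loss) / old_loss

print(f"old run implies {old[0]}..{old[1]} training examples")
print(f"new run implies {new[0]}..{new[1]} training examples")
print(f"consistent with both: {overlap[0]}..{overlap[1]}")
print(f"relative drop in final validation loss: {relative_drop:.1%}")
```

If those assumptions hold, both runs are consistent with the same small training set. The drop in validation loss should be read with care, since the commit changes the base model, learning rate, batch size, and epoch count all at once.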
 ### Framework versions