# flan-t5-base-gen-chat_base-5

This model is a fine-tuned version of google/flan-t5-base on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 1.2877
- Rouge 1: 13.4731
- Rouge 2: 4.141
- Rouge L: 12.5428
- Avg Len: 12.3875
- Bertscore Prec: 0.8648
- Bertscore Rec: 0.8591
- Bertscore F1: 0.8616
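As a sanity check on the reported BERTScore numbers: F1 is the harmonic mean of precision and recall. BERTScore averages F1 per example, so applying the formula to the corpus-level averages above lands close to, but not exactly on, the reported 0.8616:

```python
def harmonic_mean(p: float, r: float) -> float:
    """F1 formula: 2PR / (P + R)."""
    return 2 * p * r / (p + r)

prec, rec = 0.8648, 0.8591        # reported Bertscore Prec / Rec
approx_f1 = harmonic_mean(prec, rec)
print(round(approx_f1, 4))        # → 0.8619, vs. the reported 0.8616
```

The small gap is expected, since averaging per-example F1 scores is not the same as taking the harmonic mean of the averaged precision and recall.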
## Model description

More information needed.

## Intended uses & limitations

More information needed.

## Training and evaluation data

More information needed.
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5
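A minimal sketch of the learning-rate schedule these hyperparameters imply (linear scheduler, warmup ratio 0.1, peak lr 5e-05), in the style transformers uses. The total step count of roughly 11,290 is an estimate extrapolated from the last logged step below (11200 at epoch 4.9601), not a value stated on the card:

```python
def linear_lr(step: int, total_steps: int,
              base_lr: float = 5e-5, warmup_ratio: float = 0.1) -> float:
    """Linear warmup to base_lr, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

TOTAL_STEPS = 11290                      # estimated, see lead-in
print(linear_lr(1129, TOTAL_STEPS))      # peak lr at the end of warmup → 5e-05
```

With a warmup ratio of 0.1, the learning rate climbs for roughly the first 1,129 optimizer steps and then decays linearly to zero over the remaining ~10,161.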
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge 1 | Rouge 2 | Rouge L | Avg Len | Bertscore Prec | Bertscore Rec | Bertscore F1 |
|---|---|---|---|---|---|---|---|---|---|---|
3.602 | 0.1771 | 400 | 3.3043 | 6.1296 | 0.3906 | 5.6284 | 13.5263 | 0.8505 | 0.8478 | 0.8488 |
3.3799 | 0.3543 | 800 | 3.0822 | 6.6949 | 0.4835 | 6.216 | 15.4916 | 0.857 | 0.8507 | 0.8535 |
3.1711 | 0.5314 | 1200 | 2.8825 | 6.284 | 0.4519 | 5.9964 | 16.6215 | 0.8625 | 0.8513 | 0.8566 |
3.0298 | 0.7086 | 1600 | 2.7078 | 6.7319 | 0.5659 | 6.3339 | 16.9338 | 0.8602 | 0.8527 | 0.8562 |
2.8722 | 0.8857 | 2000 | 2.5615 | 7.9438 | 0.9859 | 7.4262 | 15.1393 | 0.8622 | 0.8538 | 0.8577 |
2.7398 | 1.0629 | 2400 | 2.4291 | 8.3717 | 0.9243 | 7.7474 | 14.9595 | 0.8601 | 0.854 | 0.8568 |
2.6196 | 1.2400 | 2800 | 2.3052 | 8.8458 | 1.117 | 8.2482 | 14.5368 | 0.8616 | 0.8544 | 0.8577 |
2.5076 | 1.4172 | 3200 | 2.1998 | 9.1369 | 1.2588 | 8.5322 | 14.2939 | 0.8626 | 0.855 | 0.8585 |
2.4567 | 1.5943 | 3600 | 2.1053 | 9.6663 | 1.4424 | 8.9702 | 12.928 | 0.8634 | 0.8547 | 0.8587 |
2.3665 | 1.7715 | 4000 | 2.0065 | 9.7718 | 1.4302 | 9.0438 | 13.2387 | 0.8622 | 0.8554 | 0.8585 |
2.3142 | 1.9486 | 4400 | 1.9322 | 10.2579 | 1.7969 | 9.5247 | 12.6887 | 0.8629 | 0.8558 | 0.859 |
2.1932 | 2.1258 | 4800 | 1.8453 | 10.4199 | 1.9269 | 9.6524 | 12.3864 | 0.8611 | 0.8553 | 0.8578 |
2.1476 | 2.3029 | 5200 | 1.7822 | 10.798 | 2.141 | 9.9692 | 12.5258 | 0.8617 | 0.8562 | 0.8586 |
2.1042 | 2.4801 | 5600 | 1.7094 | 10.9928 | 2.3716 | 10.1852 | 12.5231 | 0.8623 | 0.8564 | 0.859 |
2.0594 | 2.6572 | 6000 | 1.6480 | 11.5976 | 2.3487 | 10.6123 | 12.7434 | 0.8629 | 0.8573 | 0.8598 |
2.0017 | 2.8344 | 6400 | 1.5988 | 12.0383 | 2.689 | 11.044 | 12.582 | 0.8632 | 0.8577 | 0.8601 |
1.967 | 3.0115 | 6800 | 1.5445 | 11.8601 | 3.0155 | 10.9982 | 12.438 | 0.8636 | 0.8579 | 0.8604 |
1.9023 | 3.1887 | 7200 | 1.4983 | 12.1844 | 3.2931 | 11.3648 | 12.04 | 0.864 | 0.8575 | 0.8604 |
1.8604 | 3.3658 | 7600 | 1.4650 | 12.498 | 3.2698 | 11.5864 | 12.5668 | 0.8635 | 0.8586 | 0.8607 |
1.8345 | 3.5430 | 8000 | 1.4231 | 12.7205 | 3.6395 | 11.7643 | 12.4364 | 0.8644 | 0.8591 | 0.8614 |
1.8117 | 3.7201 | 8400 | 1.3965 | 12.6639 | 3.6483 | 11.824 | 12.4795 | 0.864 | 0.8585 | 0.8609 |
1.7999 | 3.8973 | 8800 | 1.3652 | 13.0487 | 3.7723 | 12.1147 | 12.4784 | 0.8643 | 0.8588 | 0.8612 |
1.7583 | 4.0744 | 9200 | 1.3443 | 13.2051 | 3.807 | 12.2615 | 12.4264 | 0.865 | 0.8591 | 0.8617 |
1.7608 | 4.2516 | 9600 | 1.3277 | 13.1401 | 3.7791 | 12.2312 | 12.4989 | 0.8644 | 0.859 | 0.8613 |
1.7111 | 4.4287 | 10000 | 1.3100 | 12.9527 | 3.7422 | 12.0599 | 12.4911 | 0.8641 | 0.8587 | 0.861 |
1.7137 | 4.6058 | 10400 | 1.2955 | 13.1295 | 4.0091 | 12.2326 | 12.5794 | 0.864 | 0.8588 | 0.8611 |
1.712 | 4.7830 | 10800 | 1.2899 | 13.511 | 4.1509 | 12.574 | 12.3586 | 0.865 | 0.8593 | 0.8618 |
1.6791 | 4.9601 | 11200 | 1.2877 | 13.4731 | 4.141 | 12.5428 | 12.3875 | 0.8648 | 0.8591 | 0.8616 |
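The validation loss in the table decreases at every evaluation step, i.e. training had not yet plateaued after 5 epochs. A quick check over the logged values (copied from the Validation Loss column above):

```python
val_loss = [3.3043, 3.0822, 2.8825, 2.7078, 2.5615, 2.4291, 2.3052, 2.1998,
            2.1053, 2.0065, 1.9322, 1.8453, 1.7822, 1.7094, 1.6480, 1.5988,
            1.5445, 1.4983, 1.4650, 1.4231, 1.3965, 1.3652, 1.3443, 1.3277,
            1.3100, 1.2955, 1.2899, 1.2877]

strictly_decreasing = all(a > b for a, b in zip(val_loss, val_loss[1:]))
print(strictly_decreasing)  # → True
```

The same monotone trend holds for Rouge L and Bertscore F1 in broad strokes, though those metrics fluctuate between adjacent evaluation steps.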
### Framework versions
- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.4.1
- Tokenizers 0.21.1
## Model tree for greatakela/flan-t5-base-gen-chat_base-5

- Base model: google/flan-t5-base