flan-t5-base-gen-chat_base-5

This model is a fine-tuned version of google/flan-t5-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2877
  • Rouge 1: 13.4731
  • Rouge 2: 4.141
  • Rouge L: 12.5428
  • Avg Len: 12.3875
  • Bertscore Prec: 0.8648
  • Bertscore Rec: 0.8591
  • Bertscore F1: 0.8616
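
As a quick orientation, below is a minimal sketch of loading this checkpoint for generation with the Transformers library. The repository id is taken from the model tree (greatakela/flan-t5-base-gen-chat_base-5); the prompt text and generation settings are illustrative assumptions only.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "greatakela/flan-t5-base-gen-chat_base-5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical chat-style input; the actual training prompt format is not documented here.
prompt = "Hello, how are you today?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```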

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
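
For reference, the following Seq2SeqTrainingArguments sketch mirrors the values listed above. The output directory, evaluation cadence (every 400 steps, inferred from the results table), and predict_with_generate flag are assumptions, not details taken from the original run.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-gen-chat_base-5",  # assumed output directory
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=5,
    eval_strategy="steps",      # assumption: evaluate every 400 steps, as in the results table
    eval_steps=400,
    predict_with_generate=True, # assumption: needed for ROUGE/BERTScore during evaluation
)
```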

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge 1 | Rouge 2 | Rouge L | Avg Len | Bertscore Prec | Bertscore Rec | Bertscore F1 |
|---|---|---|---|---|---|---|---|---|---|---|
| 3.602 | 0.1771 | 400 | 3.3043 | 6.1296 | 0.3906 | 5.6284 | 13.5263 | 0.8505 | 0.8478 | 0.8488 |
| 3.3799 | 0.3543 | 800 | 3.0822 | 6.6949 | 0.4835 | 6.216 | 15.4916 | 0.857 | 0.8507 | 0.8535 |
| 3.1711 | 0.5314 | 1200 | 2.8825 | 6.284 | 0.4519 | 5.9964 | 16.6215 | 0.8625 | 0.8513 | 0.8566 |
| 3.0298 | 0.7086 | 1600 | 2.7078 | 6.7319 | 0.5659 | 6.3339 | 16.9338 | 0.8602 | 0.8527 | 0.8562 |
| 2.8722 | 0.8857 | 2000 | 2.5615 | 7.9438 | 0.9859 | 7.4262 | 15.1393 | 0.8622 | 0.8538 | 0.8577 |
| 2.7398 | 1.0629 | 2400 | 2.4291 | 8.3717 | 0.9243 | 7.7474 | 14.9595 | 0.8601 | 0.854 | 0.8568 |
| 2.6196 | 1.2400 | 2800 | 2.3052 | 8.8458 | 1.117 | 8.2482 | 14.5368 | 0.8616 | 0.8544 | 0.8577 |
| 2.5076 | 1.4172 | 3200 | 2.1998 | 9.1369 | 1.2588 | 8.5322 | 14.2939 | 0.8626 | 0.855 | 0.8585 |
| 2.4567 | 1.5943 | 3600 | 2.1053 | 9.6663 | 1.4424 | 8.9702 | 12.928 | 0.8634 | 0.8547 | 0.8587 |
| 2.3665 | 1.7715 | 4000 | 2.0065 | 9.7718 | 1.4302 | 9.0438 | 13.2387 | 0.8622 | 0.8554 | 0.8585 |
| 2.3142 | 1.9486 | 4400 | 1.9322 | 10.2579 | 1.7969 | 9.5247 | 12.6887 | 0.8629 | 0.8558 | 0.859 |
| 2.1932 | 2.1258 | 4800 | 1.8453 | 10.4199 | 1.9269 | 9.6524 | 12.3864 | 0.8611 | 0.8553 | 0.8578 |
| 2.1476 | 2.3029 | 5200 | 1.7822 | 10.798 | 2.141 | 9.9692 | 12.5258 | 0.8617 | 0.8562 | 0.8586 |
| 2.1042 | 2.4801 | 5600 | 1.7094 | 10.9928 | 2.3716 | 10.1852 | 12.5231 | 0.8623 | 0.8564 | 0.859 |
| 2.0594 | 2.6572 | 6000 | 1.6480 | 11.5976 | 2.3487 | 10.6123 | 12.7434 | 0.8629 | 0.8573 | 0.8598 |
| 2.0017 | 2.8344 | 6400 | 1.5988 | 12.0383 | 2.689 | 11.044 | 12.582 | 0.8632 | 0.8577 | 0.8601 |
| 1.967 | 3.0115 | 6800 | 1.5445 | 11.8601 | 3.0155 | 10.9982 | 12.438 | 0.8636 | 0.8579 | 0.8604 |
| 1.9023 | 3.1887 | 7200 | 1.4983 | 12.1844 | 3.2931 | 11.3648 | 12.04 | 0.864 | 0.8575 | 0.8604 |
| 1.8604 | 3.3658 | 7600 | 1.4650 | 12.498 | 3.2698 | 11.5864 | 12.5668 | 0.8635 | 0.8586 | 0.8607 |
| 1.8345 | 3.5430 | 8000 | 1.4231 | 12.7205 | 3.6395 | 11.7643 | 12.4364 | 0.8644 | 0.8591 | 0.8614 |
| 1.8117 | 3.7201 | 8400 | 1.3965 | 12.6639 | 3.6483 | 11.824 | 12.4795 | 0.864 | 0.8585 | 0.8609 |
| 1.7999 | 3.8973 | 8800 | 1.3652 | 13.0487 | 3.7723 | 12.1147 | 12.4784 | 0.8643 | 0.8588 | 0.8612 |
| 1.7583 | 4.0744 | 9200 | 1.3443 | 13.2051 | 3.807 | 12.2615 | 12.4264 | 0.865 | 0.8591 | 0.8617 |
| 1.7608 | 4.2516 | 9600 | 1.3277 | 13.1401 | 3.7791 | 12.2312 | 12.4989 | 0.8644 | 0.859 | 0.8613 |
| 1.7111 | 4.4287 | 10000 | 1.3100 | 12.9527 | 3.7422 | 12.0599 | 12.4911 | 0.8641 | 0.8587 | 0.861 |
| 1.7137 | 4.6058 | 10400 | 1.2955 | 13.1295 | 4.0091 | 12.2326 | 12.5794 | 0.864 | 0.8588 | 0.8611 |
| 1.712 | 4.7830 | 10800 | 1.2899 | 13.511 | 4.1509 | 12.574 | 12.3586 | 0.865 | 0.8593 | 0.8618 |
| 1.6791 | 4.9601 | 11200 | 1.2877 | 13.4731 | 4.141 | 12.5428 | 12.3875 | 0.8648 | 0.8591 | 0.8616 |
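
Metrics of this kind can be computed with the evaluate library; below is a minimal sketch assuming hypothetical predictions and references (the evaluation data itself is not documented here). Note that the ROUGE values in this card appear to be on a 0-100 scale, while evaluate returns fractions between 0 and 1.

```python
import evaluate

# Hypothetical model outputs and references, for illustration only.
predictions = ["the model generated reply"]
references = ["the reference reply"]

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

rouge_scores = rouge.compute(predictions=predictions, references=references)
bert_scores = bertscore.compute(predictions=predictions, references=references, lang="en")

print(rouge_scores)  # rouge1, rouge2, rougeL, rougeLsum as 0-1 fractions
print(sum(bert_scores["f1"]) / len(bert_scores["f1"]))  # mean BERTScore F1
```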

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.4.1
  • Tokenizers 0.21.1