flan-t5-base-gen-12-small_dataset

This model is a fine-tuned version of google/flan-t5-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 3.1621
  • Rouge 1: 7.3814
  • Rouge 2: 0.6192
  • Rouge L: 6.8531
  • Avg Len: 13.0278
  • Bertscore Prec: 0.8612
  • Bertscore Rec: 0.8542
  • Bertscore F1: 0.8573
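
As a minimal usage sketch with the Transformers library (the repository id is taken from this card's model tree; the input prompt and `max_new_tokens` value are illustrative assumptions, not settings recorded on this card):

```python
# Hedged inference sketch. Assumes the checkpoint is published under the
# repository id shown on this card; the prompt and generation length are
# illustrative placeholders.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "greatakela/flan-t5-base-gen-12-small_dataset"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("summarize: The quick brown fox jumps over the lazy dog.",
                   return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```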

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 12
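
A sketch of how these values map onto the Trainer API (the output directory is a placeholder; `optim="adamw_torch"` relies on the Trainer's default AdamW betas and epsilon, which match the values listed above):

```python
# Hedged sketch: only the hyperparameters listed above come from this card;
# the output path is a placeholder.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-gen-12-small_dataset",  # placeholder
    learning_rate=5e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",         # AdamW, default betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=12,
)
```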

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge 1 | Rouge 2 | Rouge L | Avg Len | Bertscore Prec | Bertscore Rec | Bertscore F1 |
|---|---|---|---|---|---|---|---|---|---|---|
| 3.8134 | 0.6173 | 200 | 3.4410 | 6.2979 | 0.223 | 5.7832 | 13.5052 | 0.8507 | 0.8498 | 0.8498 |
| 3.5423 | 1.2346 | 400 | 3.3112 | 6.0189 | 0.3369 | 5.6265 | 14.6944 | 0.8611 | 0.8514 | 0.8558 |
| 3.3863 | 1.8519 | 600 | 3.2457 | 5.8478 | 0.312 | 5.5206 | 14.901 | 0.8649 | 0.8522 | 0.8581 |
| 3.2873 | 2.4691 | 800 | 3.2077 | 6.1468 | 0.4176 | 5.7813 | 14.4757 | 0.8643 | 0.8522 | 0.8578 |
| 3.2097 | 3.0864 | 1000 | 3.1873 | 6.8407 | 0.5555 | 6.391 | 13.6875 | 0.8591 | 0.8521 | 0.8553 |
| 3.1199 | 3.7037 | 1200 | 3.1723 | 6.6644 | 0.3774 | 6.2188 | 15.6545 | 0.8557 | 0.8511 | 0.8531 |
| 3.0885 | 4.3210 | 1400 | 3.1635 | 7.0627 | 0.5238 | 6.5367 | 14.4826 | 0.861 | 0.8527 | 0.8565 |
| 3.033 | 4.9383 | 1600 | 3.1565 | 7.0399 | 0.5467 | 6.4524 | 14.401 | 0.8596 | 0.8527 | 0.8558 |
| 2.9712 | 5.5556 | 1800 | 3.1555 | 7.1467 | 0.5327 | 6.4363 | 14.6406 | 0.8566 | 0.853 | 0.8545 |
| 2.9196 | 6.1728 | 2000 | 3.1563 | 7.1535 | 0.4741 | 6.6271 | 14.8073 | 0.8558 | 0.8531 | 0.8542 |
| 2.8896 | 6.7901 | 2200 | 3.1531 | 7.1215 | 0.5534 | 6.5025 | 14.408 | 0.8579 | 0.853 | 0.8551 |
| 2.8631 | 7.4074 | 2400 | 3.1547 | 7.4895 | 0.7019 | 6.8118 | 14.092 | 0.8581 | 0.8533 | 0.8554 |
| 2.8525 | 8.0247 | 2600 | 3.1532 | 7.1931 | 0.6333 | 6.6858 | 13.9201 | 0.8586 | 0.8528 | 0.8553 |
| 2.7951 | 8.6420 | 2800 | 3.1546 | 7.2016 | 0.7094 | 6.6671 | 13.4878 | 0.8599 | 0.8534 | 0.8563 |
| 2.7996 | 9.2593 | 3000 | 3.1568 | 7.225 | 0.6035 | 6.7029 | 13.724 | 0.8582 | 0.8532 | 0.8554 |
| 2.7721 | 9.8765 | 3200 | 3.1563 | 7.0646 | 0.6486 | 6.5622 | 13.125 | 0.8602 | 0.853 | 0.8562 |
| 2.759 | 10.4938 | 3400 | 3.1625 | 7.3836 | 0.7279 | 6.9035 | 12.6927 | 0.8613 | 0.8535 | 0.857 |
| 2.7459 | 11.1111 | 3600 | 3.1600 | 7.4314 | 0.6359 | 6.8986 | 13.1528 | 0.8605 | 0.8539 | 0.8569 |
| 2.7356 | 11.7284 | 3800 | 3.1621 | 7.3814 | 0.6192 | 6.8531 | 13.0278 | 0.8612 | 0.8542 | 0.8573 |
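
Metrics of the kind reported above can be computed with the `evaluate` library; the sketch below is a hedged illustration (the predictions and references are placeholders, and this card does not record the exact metric settings that were used):

```python
# Sketch of computing ROUGE and BERTScore with the `evaluate` library.
# predictions/references are placeholders, not data from this card.
import evaluate

predictions = ["a generated answer"]   # placeholder model outputs
references = ["the reference answer"]  # placeholder gold targets

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

rouge_scores = rouge.compute(predictions=predictions, references=references)
bert_scores = bertscore.compute(predictions=predictions,
                                references=references, lang="en")

print(rouge_scores["rouge1"], rouge_scores["rouge2"], rouge_scores["rougeL"])
print(sum(bert_scores["f1"]) / len(bert_scores["f1"]))  # mean BERTScore F1
```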

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.4.1
  • Tokenizers 0.21.1
Model size: 248M parameters (F32 tensors, Safetensors format)

Model tree: greatakela/flan-t5-base-gen-12-small_dataset, fine-tuned from google/flan-t5-base