flan-t5-base-gen-12-small_dataset

This model is a fine-tuned version of google/flan-t5-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 3.1621
  • Rouge 1: 7.3814
  • Rouge 2: 0.6192
  • Rouge L: 6.8531
  • Avg Len: 13.0278
  • Bertscore Prec: 0.8612
  • Bertscore Rec: 0.8542
  • Bertscore F1: 0.8573
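
As a minimal usage sketch with the Transformers library (the repository id is taken from this card's model tree; the input prompt and `max_new_tokens` value are illustrative assumptions, not settings recorded on this card):

```python
# Hedged inference sketch. Assumes the checkpoint is published under the
# repository id shown on this card; the prompt and generation length are
# illustrative placeholders.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "greatakela/flan-t5-base-gen-12-small_dataset"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("summarize: The quick brown fox jumps over the lazy dog.",
                   return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```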

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 12
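
A sketch of how these values map onto the Trainer API (the output directory is a placeholder; `optim="adamw_torch"` relies on the Trainer's default AdamW betas and epsilon, which match the values listed above):

```python
# Hedged sketch: only the hyperparameters listed above come from this card;
# the output path is a placeholder.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-gen-12-small_dataset",  # placeholder
    learning_rate=5e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",         # AdamW, default betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=12,
)
```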

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge 1 | Rouge 2 | Rouge L | Avg Len | Bertscore Prec | Bertscore Rec | Bertscore F1 |
|---|---|---|---|---|---|---|---|---|---|---|
| 3.8134 | 0.6173 | 200 | 3.4410 | 6.2979 | 0.223 | 5.7832 | 13.5052 | 0.8507 | 0.8498 | 0.8498 |
| 3.5423 | 1.2346 | 400 | 3.3112 | 6.0189 | 0.3369 | 5.6265 | 14.6944 | 0.8611 | 0.8514 | 0.8558 |
| 3.3863 | 1.8519 | 600 | 3.2457 | 5.8478 | 0.312 | 5.5206 | 14.901 | 0.8649 | 0.8522 | 0.8581 |
| 3.2873 | 2.4691 | 800 | 3.2077 | 6.1468 | 0.4176 | 5.7813 | 14.4757 | 0.8643 | 0.8522 | 0.8578 |
| 3.2097 | 3.0864 | 1000 | 3.1873 | 6.8407 | 0.5555 | 6.391 | 13.6875 | 0.8591 | 0.8521 | 0.8553 |
| 3.1199 | 3.7037 | 1200 | 3.1723 | 6.6644 | 0.3774 | 6.2188 | 15.6545 | 0.8557 | 0.8511 | 0.8531 |
| 3.0885 | 4.3210 | 1400 | 3.1635 | 7.0627 | 0.5238 | 6.5367 | 14.4826 | 0.861 | 0.8527 | 0.8565 |
| 3.033 | 4.9383 | 1600 | 3.1565 | 7.0399 | 0.5467 | 6.4524 | 14.401 | 0.8596 | 0.8527 | 0.8558 |
| 2.9712 | 5.5556 | 1800 | 3.1555 | 7.1467 | 0.5327 | 6.4363 | 14.6406 | 0.8566 | 0.853 | 0.8545 |
| 2.9196 | 6.1728 | 2000 | 3.1563 | 7.1535 | 0.4741 | 6.6271 | 14.8073 | 0.8558 | 0.8531 | 0.8542 |
| 2.8896 | 6.7901 | 2200 | 3.1531 | 7.1215 | 0.5534 | 6.5025 | 14.408 | 0.8579 | 0.853 | 0.8551 |
| 2.8631 | 7.4074 | 2400 | 3.1547 | 7.4895 | 0.7019 | 6.8118 | 14.092 | 0.8581 | 0.8533 | 0.8554 |
| 2.8525 | 8.0247 | 2600 | 3.1532 | 7.1931 | 0.6333 | 6.6858 | 13.9201 | 0.8586 | 0.8528 | 0.8553 |
| 2.7951 | 8.6420 | 2800 | 3.1546 | 7.2016 | 0.7094 | 6.6671 | 13.4878 | 0.8599 | 0.8534 | 0.8563 |
| 2.7996 | 9.2593 | 3000 | 3.1568 | 7.225 | 0.6035 | 6.7029 | 13.724 | 0.8582 | 0.8532 | 0.8554 |
| 2.7721 | 9.8765 | 3200 | 3.1563 | 7.0646 | 0.6486 | 6.5622 | 13.125 | 0.8602 | 0.853 | 0.8562 |
| 2.759 | 10.4938 | 3400 | 3.1625 | 7.3836 | 0.7279 | 6.9035 | 12.6927 | 0.8613 | 0.8535 | 0.857 |
| 2.7459 | 11.1111 | 3600 | 3.1600 | 7.4314 | 0.6359 | 6.8986 | 13.1528 | 0.8605 | 0.8539 | 0.8569 |
| 2.7356 | 11.7284 | 3800 | 3.1621 | 7.3814 | 0.6192 | 6.8531 | 13.0278 | 0.8612 | 0.8542 | 0.8573 |
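
Metrics of the kind reported above can be computed with the `evaluate` library; the sketch below is a hedged illustration (the predictions and references are placeholders, and this card does not record the exact metric settings that were used):

```python
# Sketch of computing ROUGE and BERTScore with the `evaluate` library.
# predictions/references are placeholders, not data from this card.
import evaluate

predictions = ["a generated answer"]   # placeholder model outputs
references = ["the reference answer"]  # placeholder gold targets

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

rouge_scores = rouge.compute(predictions=predictions, references=references)
bert_scores = bertscore.compute(predictions=predictions,
                                references=references, lang="en")

print(rouge_scores["rouge1"], rouge_scores["rouge2"], rouge_scores["rougeL"])
print(sum(bert_scores["f1"]) / len(bert_scores["f1"]))  # mean BERTScore F1
```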

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.4.1
  • Tokenizers 0.21.1
Model size: 248M parameters (F32 tensors, Safetensors format)

Model tree: greatakela/flan-t5-base-gen-12-small_dataset, fine-tuned from google/flan-t5-base