Finetuned model has smaller model.safetensors size
I fine-tuned this model for a sequence classification task, but the model.safetensors file of the fine-tuned checkpoint is much smaller than that of the base model: 2.44 GB (fine-tuned checkpoint) versus 3.02 GB (base model).
Also, when I evaluate the model loaded from the saved checkpoint, its performance differs from what was reported during the training loop. I suspect that the saved model is somehow missing module weights.
Please help with this issue. Much appreciated!
Have you tried it for information retrieval?
In case you are still wondering about the difference in size, it is due to the dropping of the lm_head, which adds about 130M parameters to the model.
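If you want to check this yourself, here is a rough sketch that lists which tensors are in the base file but not in the fine-tuned one. It reads only the JSON header of each safetensors file (an 8-byte little-endian length followed by the header), so it needs nothing beyond the standard library. The function names and the example paths are my own placeholders, not anything from the model repo, and I am assuming each checkpoint is a single model.safetensors file rather than a sharded one.

```python
import json
import struct
from math import prod


def read_safetensors_header(path):
    """Return {tensor_name: shape} by parsing only the file's JSON header."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return {name: info["shape"] for name, info in header.items()}


def missing_tensors(base_path, finetuned_path):
    """Map each tensor present only in the base file to its parameter count."""
    base = read_safetensors_header(base_path)
    tuned = read_safetensors_header(finetuned_path)
    return {
        name: prod(shape)
        for name, shape in base.items()
        if name not in tuned
    }


# Example usage (hypothetical paths, adjust to your own checkpoints):
# dropped = missing_tensors("base/model.safetensors",
#                           "checkpoint/model.safetensors")
# for name, n_params in dropped.items():
#     print(name, n_params)
# print("total parameters dropped:", sum(dropped.values()))
```

If the only missing entries belong to the language-modeling head, the size gap is expected. As a sanity check on the numbers, and assuming the weights are stored as float32 (4 bytes each), about 130M parameters comes to roughly 0.52 GB, which is in the same range as the 0.58 GB difference reported above.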
