---
license: apache-2.0
base_model:
- google/t5-v1_1-xl
tags:
- t5
---
This repo contains just the encoder weights from "google/t5-v1_1-xl",
which takes the checkpoint from 11GB down to 4GB.
The script used to do the extraction is included here as
[transform.py](transform.py)
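The extraction can be sketched with the standard `transformers` API: `T5EncoderModel` keeps only the encoder parameters and drops the decoder. This is just an illustrative sketch, not necessarily what transform.py does internally; the tiny stand-in config is only so the sketch runs without downloading the full 11GB checkpoint.

```python
from transformers import T5Config, T5EncoderModel

# Tiny stand-in config for illustration only. For the real extraction
# you would instead load the full checkpoint with
#   T5EncoderModel.from_pretrained("google/t5-v1_1-xl")
config = T5Config(d_model=64, d_ff=128, d_kv=16,
                  num_layers=2, num_heads=4, vocab_size=512)
encoder = T5EncoderModel(config)

# No decoder parameters exist in the resulting state dict.
assert not any(k.startswith("decoder") for k in encoder.state_dict())

# Saving this writes only the encoder weights to disk.
encoder.save_pretrained("t5-encoder-demo")
```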
Edit: Now that I have this in a convenient form, I got a chance to test t5-xxl projected down to 2048, vs this t5-xl.
Surprisingly, even with an untrained projection layer, trivial embedding diversity scores rate
the projected xxl version higher than native xl at 2048.
So, while this model will continue to exist as a convenient way to compare, and possibly as something
to use if you are really, really REALLY tight on memory, you are probably best off
using t5-xxl whenever you can.
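The comparison above can be sketched as follows, assuming t5-xxl's 4096-dim encoder outputs and a random (untrained) linear projection down to t5-xl's native 2048 width. The random tensor stands in for real encoder outputs, and the mean-pairwise-cosine-distance metric is just one hypothetical example of a trivial diversity score, not necessarily the one used above.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for a batch of t5-xxl encoder embeddings (d_model = 4096).
xxl = torch.randn(8, 4096)

# Untrained projection down to t5-xl's native width (d_model = 2048).
proj = torch.nn.Linear(4096, 2048, bias=False)
with torch.no_grad():
    projected = proj(xxl)

def diversity(x: torch.Tensor) -> float:
    """One trivial diversity score: mean pairwise cosine distance."""
    n = F.normalize(x, dim=-1)
    sims = n @ n.T
    off_diag = sims[~torch.eye(len(x), dtype=torch.bool)]
    return (1.0 - off_diag).mean().item()

print(projected.shape)          # torch.Size([8, 2048])
print(diversity(projected))
```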