---
license: apache-2.0
base_model:
- google/t5-v1_1-xl
tags:
- t5
---

This is just the encoder weights from `google/t5-v1_1-xl`.

It takes the download from 11GB down to 4GB.

The script to do the extraction is included here as [transform.py](transform.py).

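
For reference, the core of such an extraction can be sketched roughly as below. This is a hedged approximation, not the actual transform.py: it uses the `transformers` library with a tiny stand-in config so it runs without the 11GB download, and the real script may instead operate on the checkpoint files directly.

```python
from transformers import T5Config, T5EncoderModel, T5ForConditionalGeneration

# Tiny stand-in config; the real model here is google/t5-v1_1-xl.
cfg = T5Config(vocab_size=128, d_model=32, d_kv=16, d_ff=64,
               num_layers=2, num_heads=2)
full = T5ForConditionalGeneration(cfg)

# Keep only the shared embedding and encoder tensors; drop decoder/lm_head.
enc_state = {k: v for k, v in full.state_dict().items()
             if k.startswith(("shared", "encoder"))}

encoder_only = T5EncoderModel(cfg)
missing, unexpected = encoder_only.load_state_dict(enc_state, strict=False)

# encoder_only.save_pretrained("t5-encoder-only")  # writes the smaller checkpoint
```

The saved directory can then be reloaded with `T5EncoderModel.from_pretrained`, which never instantiates the decoder.
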
Edit: Now that I have this in a convenient form, I got a chance to test t5-xxl projected down to 2048 dimensions, vs this t5-xl.

Surprisingly, even with an untrained projection layer, trivial embedding-diversity scores rate the projected xxl version higher than native xl at 2048.

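
To make that comparison concrete: t5-v1_1-xxl's encoder width is 4096, while xl's is 2048, so an untrained linear layer can map xxl outputs into xl's width. The sketch below uses random tensors as stand-ins for real encoder outputs, and mean pairwise cosine distance as one example of a "trivial" diversity score (the exact metric used is not specified here, so treat it as an assumption):

```python
import torch

torch.manual_seed(0)

# t5-v1_1-xxl has d_model=4096; t5-v1_1-xl has d_model=2048.
# An untrained projection brings xxl outputs down to xl's width.
project = torch.nn.Linear(4096, 2048, bias=False)

xxl_out = torch.randn(8, 4096)   # stand-in for real xxl encoder outputs
projected = project(xxl_out)     # shape (8, 2048), directly comparable to xl


def diversity(emb: torch.Tensor) -> float:
    """Mean pairwise cosine distance across a batch of embeddings."""
    e = torch.nn.functional.normalize(emb, dim=-1)
    sim = e @ e.T                              # pairwise cosine similarities
    off_diag = sim[~torch.eye(e.shape[0], dtype=torch.bool)]
    return (1.0 - off_diag).mean().item()


score = diversity(projected)
```

Swapping in real encoder outputs from both models makes this a direct comparison at equal width.
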
So, while this model will continue to exist as a convenient point of comparison, and possibly as something to use if you are really, really REALLY tight on memory, you are probably best off using t5-xxl whenever you can.