Update README.md
README.md CHANGED
@@ -10,4 +10,13 @@ This is just the encoder weights from "google/t5-v1_1-xl"
 It takes 11GB down to 4GB.
 
 The script to do the extraction is included here as
-[transform.py](transform.py)
+[transform.py](transform.py)
+
+Edit: Now that I have this in a convenient form...
+I got a chance to test t5-xxl projected down to 2048, vs this t5-xl.
+Surprisingly, even with an untrained projection layer, trivial embedding diversity scores rate
+the projected xxl version higher than native xl at 2048.
+
+So, while this model will continue to exist as a convenient way to compare.. and possibly as something
+to use if you are really, really REALLY tight on memory... you are probably best off
+using t5-xxl whenever you can.
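The projection experiment described in the added lines can be sketched roughly as follows. This is a minimal illustration, assuming NumPy: the random Gaussian projection, the batch of fake embeddings, and the `diversity` metric are all hypothetical stand-ins for "an untrained projection layer" and "trivial embedding diversity scores", not the author's actual code or evaluation.

```python
import numpy as np

# Hypothetical sketch: project 4096-dim t5-xxl token embeddings down to
# 2048 dims (t5-xl's hidden size) with an *untrained* linear projection,
# i.e. a random Gaussian matrix. Scaling by 1/sqrt(d_out) keeps vector
# norms roughly stable (Johnson-Lindenstrauss style).
rng = np.random.default_rng(0)
d_xxl, d_xl = 4096, 2048
W = rng.standard_normal((d_xxl, d_xl)) / np.sqrt(d_xl)

# Stand-in for a small batch of embeddings from the xxl encoder.
emb_xxl = rng.standard_normal((8, d_xxl))
emb_proj = emb_xxl @ W  # shape (8, 2048)

def diversity(e):
    """A trivial diversity score: mean pairwise cosine *distance*.

    Higher means the embeddings point in more different directions.
    """
    e = e / np.linalg.norm(e, axis=1, keepdims=True)
    sims = e @ e.T
    n = len(e)
    return 1.0 - (sims.sum() - n) / (n * (n - 1))

print(emb_proj.shape)                 # (8, 2048)
print(round(diversity(emb_proj), 3))  # near 1.0 for near-orthogonal vectors
```

A random projection approximately preserves pairwise angles, which is one plausible reason a down-projected xxl embedding can still score well on a metric like this even before the projection layer is trained.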