Commit fe4f26d (verified), committed by ppbrown
1 Parent(s): 4042f7a

Update README.md

Files changed (1)
  1. README.md +10 -1
README.md CHANGED
@@ -10,4 +10,13 @@ This is just the encoder weights from "google/t5-v1_1-xl"
 It takes 11GB down to 4GB.
 
 The script to do the extraction is included here as
-[transform.py](transform.py)
+[transform.py](transform.py)
+
+Edit: Now that I have this in a convenient form...
+I got a chance to test t5-xxl projected down to 2048, vs this t5-xl
+Surprisingly, even with an untrained projection layer, trivial embedding diversity scores rate
+the projected xxl version higher than native xl at 2048.
+
+So, while this model will continue to exist as a convenient way to compare.. and possibly as something
+to use if you are really, really REALLY tight on memory... you are probably best off
+using t5-xxl whenever you can.
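
The comparison described in the added README text (t5-xxl embeddings pushed through an untrained projection down to the t5-xl width, then scored for diversity) might look roughly like the sketch below. The commit does not say which diversity metric was used, so mean pairwise cosine distance is an assumption here, and the helper names (`diversity`, `random_projection`, `project`) are hypothetical. Random toy vectors stand in for real encoder outputs, and the dimensions are shrunk from 4096 and 2048 to 16 and 8 so it runs with the standard library alone. The printed scores therefore say nothing about the actual models.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def diversity(embeddings):
    """Assumed metric: mean pairwise cosine distance (1 - similarity)."""
    total, pairs = 0.0, 0
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            total += 1.0 - cosine(embeddings[i], embeddings[j])
            pairs += 1
    return total / pairs


def random_projection(in_dim, out_dim, rng):
    """Untrained linear layer: out_dim x in_dim Gaussian weights, no bias."""
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(weights, vec):
    """Apply the projection matrix to a single vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


rng = random.Random(0)
IN_DIM, OUT_DIM, N = 16, 8, 5  # toy stand-ins for 4096, 2048, and a prompt set

# Random placeholders for encoder outputs (NOT real T5 embeddings).
xxl = [[rng.gauss(0.0, 1.0) for _ in range(IN_DIM)] for _ in range(N)]
xl = [[rng.gauss(0.0, 1.0) for _ in range(OUT_DIM)] for _ in range(N)]

weights = random_projection(IN_DIM, OUT_DIM, rng)
xxl_projected = [project(weights, v) for v in xxl]

print("projected xxl diversity:", diversity(xxl_projected))
print("native xl diversity:   ", diversity(xl))
```

To reproduce the comparison for real, the toy vectors would be replaced by pooled encoder outputs from the two models for the same set of prompts.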