ybelkada committed
Commit: a3da4bf
Parent(s): 20dcd04
Update README.md (#2)
- Update README.md (7d38e96b5b320c75f4195d4f9a7c537760dd82f1)
README.md CHANGED

@@ -122,11 +122,11 @@ Please see [the BLOOM training README](https://github.com/bigscience-workshop/bi
 
 * ALiBI positional encodings (see [paper](https://arxiv.org/pdf/2108.12409.pdf)), with GeLU activation functions
 
-* 
+* 760 million parameters:
 
-* 
+* 24 layers, 16 attention heads
 
-* Hidden layers are
+* Hidden layers are 1536-dimensional
 
 * Sequence length of 2048 tokens used (see [BLOOM tokenizer](https://huggingface.co/bigscience/tokenizer), [tokenizer description](#tokenization))
 
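For readers who want to see how the hyperparameters added in this commit fit together, here is a minimal sketch that wires the values from the updated README (1536-dimensional hidden layers, 24 layers, 16 attention heads) into a Bloom-style configuration using the `transformers` `BloomConfig` and `BloomModel` classes. The vocabulary size and all other settings are left at library defaults and are not taken from this commit, so the resulting parameter count is only indicative, not the quoted 760M figure.

```python
# Minimal sketch, not the official model definition: it only wires up the
# architecture numbers quoted in the README diff above.
from transformers import BloomConfig, BloomModel

config = BloomConfig(
    hidden_size=1536,  # "Hidden layers are 1536-dimensional"
    n_layer=24,        # "24 layers"
    n_head=16,         # "16 attention heads"
    # ALiBi positional biases and GeLU activations are built into BloomModel,
    # matching the "ALiBI positional encodings ... with GeLU activation functions" bullet.
)

model = BloomModel(config)

# Total parameter count for this illustrative configuration; vocabulary size
# and other defaults are assumptions, so this will differ from the README's figure.
print(f"{sum(p.numel() for p in model.parameters()):,}")
```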