Commit 6af8a50
Parent(s): b1188a4

Update README.md
README.md CHANGED
@@ -31,14 +31,14 @@ This model uses the MosaicML LLM codebase, which can be found in the [llm-foundr
 
 ### Models finetuned off MPT-7B (Base):
 
-* [MPT-7B-StoryWriter-65k+ …
+* [MPT-7B-StoryWriter-65k+](https://huggingface.co/mosaicml/mpt-7b-storywriter): a model designed to read and write fictional stories with super long context lengths.
 It is built by finetuning MPT-7B with a context length of 65k tokens on a filtered fiction subset of the [books3 dataset](https://huggingface.co/datasets/the_pile_books3).
 At inference time, thanks to [ALiBi](https://arxiv.org/abs/2108.12409), MPT-7B-StoryWriter-65k+ can extrapolate even beyond 65k tokens.
 We demonstrate generations as long as 80k tokens on a single A100-80GB GPU in our blogpost {HERE}.
 * License: _Apache-2.0_ (commercial use permitted)
 
 * [MPT-7B-Instruct](https://huggingface.co/mosaicml/mpt-7b-instruct): a model for short-form instruction following.
-It is built by finetuning MPT-7B on a [dataset](https://huggingface.co/datasets/ …
+It is built by finetuning MPT-7B on a [dataset](https://huggingface.co/datasets/mosaicml/dolly_hhrlhf) we also release, derived from the [Databricks Dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k) and the [Anthropic Helpful and Harmless (HH-RLHF)](https://huggingface.co/datasets/Anthropic/hh-rlhf) datasets.
 * License: _CC-By-SA-3.0_ (commercial use permitted)
 * [Online Demo on HuggingFace Spaces](https://huggingface.co/spaces/mosaicml/mpt-7b-instruct)
 
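The new StoryWriter entry above highlights ALiBi-based extrapolation past the 65k-token training context. As a rough illustration only, here is a minimal sketch of how such a model might be loaded with an extended context window, assuming the standard Transformers `trust_remote_code` loading path and a configurable `max_seq_len` field; the exact field name, the 83968-token value, and the GPT-NeoX tokenizer choice are assumptions, not taken from this commit.

```python
# Hedged sketch: load MPT-7B-StoryWriter-65k+ with an extended context window.
# Assumes the model's remote code exposes a `max_seq_len` config field (an assumption
# based on the README text above, not confirmed by this diff).
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

name = "mosaicml/mpt-7b-storywriter"

config = AutoConfig.from_pretrained(name, trust_remote_code=True)
config.max_seq_len = 83968  # ~80k tokens, beyond the 65k training context, via ALiBi extrapolation

model = AutoModelForCausalLM.from_pretrained(
    name,
    config=config,
    torch_dtype="auto",
    trust_remote_code=True,  # MPT ships custom modeling code on the Hub
)
# The MPT family reuses the GPT-NeoX tokenizer (assumed here).
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```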