Taishi-N324 committed
Commit 0507dbb · verified · 1 Parent(s): 2e807cd

Update README.md

Files changed (1)
  1. README.md +16 -8
README.md CHANGED
@@ -10,13 +10,16 @@ model_type: llama

# Llama3.1 Swallow

- Our Swallow model has undergone continual pre-training from the [Llama 3.1 family](https://huggingface.co/collections/meta-llama/llama-31-669fc079a0c406a149a5738f), primarily with the addition of Japanese language data. The Instruct versions use supervised fine-tuning (SFT). Links to other models can be found in the index.
+ Llama 3.1 Swallow is a series of large language models (8B, 70B) that were built by continual pre-training on the [Meta Llama 3.1](https://huggingface.co/collections/meta-llama/llama-31-669fc079a0c406a149a5738f) models.
+ Llama 3.1 Swallow enhances the Japanese language capabilities of the original Llama 3.1 while retaining the English language capabilities.
+ We use approximately 200 billion tokens that were sampled from a large Japanese web corpus (Swallow Corpus Version 2), Japanese and English Wikipedia articles, and mathematical and
+ coding content, etc. (see the Training Datasets section) for continual pre-training.
+ The instruction-tuned models (Instruct) were built by supervised fine-tuning (SFT) on synthetic data specially built for Japanese.
+ See the Swallow Model Index section to find other model variants.

+ # Release History

- # Model Release Updates
-
- We are excited to share the release schedule for our latest models:
- - **October 08, 2024**: Released the [Llama-3.1-Swallow-8B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-v0.1), [Llama-3.1-Swallow-8B-Instruct-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.1), [Llama-3.1-Swallow-70B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-v0.1), and [Llama-3.1-Swallow-70B-Instruct-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.1).
+ - **October 08, 2024**: Released [Llama-3.1-Swallow-8B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-v0.1), [Llama-3.1-Swallow-8B-Instruct-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.1), [Llama-3.1-Swallow-70B-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-v0.1), and [Llama-3.1-Swallow-70B-Instruct-v0.1](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-70B-Instruct-v0.1).

## Swallow Model Index

@@ -27,7 +30,7 @@ We are excited to share the release schedule for our latest models:

![logo](./logo.png)

- This repository provides large language models developed by [Swallow-LLM](https://swallow-llm.github.io/).
+ The website [https://swallow-llm.github.io/](https://swallow-llm.github.io/) provides large language models developed by the Swallow team.

## Model Details

@@ -119,9 +122,14 @@ The models released here are still in the early stages of our research and devel

## Acknowledgements

- We thank Meta Research for releasing Llama 3.1 under an open license for others to build on.
+ We thank Meta Research for releasing Llama 3.1 under a generous open license.
+
+ We received various forms of support, including:

- Our project is supported by the [Large Generative AI Development Support Program](https://abci.ai/en/link/lfm_support_program.html) of the National Institute of Advanced Industrial Science and Technology.
+ + AIST project: "Research and Development of Foundation Models for Generative AI in the Physical Domain"
+ + NEDO project: "Development of Artificial Intelligence Application Technology to Support Judgment in Design Risk Assessment Work Based on the Perspective of Skilled Persons" (JPNP18002) of "Development of Integration Technology as the Core of Next Generation Artificial Intelligence and Robotics"
+ + MEXT project: "Formation of R&D center to ensure transparency and reliability of generative AI models"
+ + AIST program: [Large Generative AI Development Support Program](https://abci.ai/en/link/lfm_support_program.html)

## License
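For context on the checkpoints named in the release history above, the sketch below shows how one of them could presumably be loaded with the standard Hugging Face transformers API. This snippet is not part of the commit; the precision, prompt, and generation settings are illustrative assumptions.

```python
# Illustrative only: loading one of the checkpoints listed in the release history
# with the standard Hugging Face transformers API. The model ID is taken from the
# README above; dtype, prompt, and generation settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed precision; adjust to your hardware
    device_map="auto",
)

# Instruct models are typically queried through the chat template.
messages = [{"role": "user", "content": "Please introduce yourself in Japanese."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```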