Commit: be707dc
Parent(s): cdbc65b
Update README.md
README.md CHANGED
@@ -35,9 +35,9 @@ Polyglot-Ko was trained on 863 GB of Korean language data (1.2TB before processi
 
 | Source |Size (GB) | Link |
 |-------------------------------------|---------|------------------------------------------|
-| Korean blog posts | 682 | - |
-| Korean news dataset | 87
-| Modu corpus |
+| Korean blog posts | 682.3 | - |
+| Korean news dataset | 87.0 | - |
+| Modu corpus | 19.0 |corpus.korean.go.kr |
 | Korean patent dataset | 26.4 | - |
 | Korean Q & A dataset | 18.1 | - |
 | KcBert dataset | 12.7 | github.com/Beomi/KcBERT |