bzantium committed
Commit b69b8d6 · 1 Parent(s): 39801cb

Update README.md

Files changed (1)
  1. README.md +16 -34

README.md CHANGED
@@ -86,45 +86,27 @@ python main.py \
   --output_path $/path/to/output/
   ```
 
- - the number of few shot examples = 1
 
- | Model | \\(n_{parameters}\\) | boolq (F1) | copa (F1) | wic (F1) | hellaswag (F1) | sentineg (F1) | average |
- |----------------------------------------------------------------------------------------------|----------------------|------------|------------|------------|----------------|---------------|------------|
- | [skt/ko-gpt-trinity-1.2B-v0.5](https://huggingface.co/skt/ko-gpt-trinity-1.2B-v0.5) † | 1.2B | 0.4243 | 0.6773 | 0.328 | 0.4178 | 0.5587 | 0.48122 |
- | [kakaobrain/kogpt](https://huggingface.co/kakaobrain/kogpt) * | 6.0B | **0.5014** | **0.7446** | **0.4187** | **0.4524** | 0.7419 | **0.5718** |
- | [EleutherAI/polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) (ours) | 1.3B | 0.3986 | 0.7106 | 0.4116 | 0.3884 | **0.8509** | 0.55202 |
 
- - the number of few shot examples = 5
 
- | Model | \\(n_{parameters}\\) | boolq (F1) | copa (F1) | wic (F1) | hellaswag (F1) | sentineg (F1) | average |
- |----------------------------------------------------------------------------------------------|----------------------|------------|------------|------------|----------------|---------------|-------------|
- | [skt/ko-gpt-trinity-1.2B-v0.5](https://huggingface.co/skt/ko-gpt-trinity-1.2B-v0.5) † | 1.2B | 0.3346 | 0.6477 | 0.328 | 0.4 | 0.5186 | 0.44578 |
- | [kakaobrain/kogpt](https://huggingface.co/kakaobrain/kogpt) * | 6.0B | **0.5561** | **0.7287** | **0.3802** | **0.456** | 0.7152 | **0.56724** |
- | [EleutherAI/polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) (ours) | 1.3B | 0.5101 | 0.7193 | 0.328 | 0.3984 | **0.8057** | 0.5523 |
 
- - the number of few shot examples = 10
 
- | Model | \\(n_{parameters}\\) | boolq (F1) | copa (F1) | wic (F1) | hellaswag (F1) | sentineg (F1) | average |
- |----------------------------------------------------------------------------------------------|----------------------|------------|------------|------------|----------------|---------------|-------------|
- | [skt/ko-gpt-trinity-1.2B-v0.5](https://huggingface.co/skt/ko-gpt-trinity-1.2B-v0.5) † | 1.2B | 0.3402 | 0.6419 | 0.328 | 0.4011 | 0.529 | 0.44804 |
- | [kakaobrain/kogpt](https://huggingface.co/kakaobrain/kogpt) * | 6.0B | 0.4838 | **0.7277** | **0.3989** | **0.4616** | 0.7422 | 0.56284 |
- | [EleutherAI/polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) (ours) | 1.3B | **0.5262** | 0.7204 | 0.3314 | 0.417 | **0.8413** | **0.56726** |
-
- - the number of few shot examples = 50
-
- | Model | \\(n_{parameters}\\) | boolq (F1) | copa (F1) | wic (F1) | hellaswag (F1) | sentineg (F1) | average |
- |----------------------------------------------------------------------------------------------|----------------------|------------|------------|------------|----------------|---------------|-------------|
- | [skt/ko-gpt-trinity-1.2B-v0.5](https://huggingface.co/skt/ko-gpt-trinity-1.2B-v0.5) † | 1.2B | 0.3405 | 0.6514 | 0.328 | 0.4214 | 0.3798 | 0.42422 |
- | [kakaobrain/kogpt](https://huggingface.co/kakaobrain/kogpt) * | 6.0B | 0.4888 | **0.7479** | 0.4233 | **0.4754** | **0.6757** | **0.56222** |
- | [EleutherAI/polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) (ours) | 1.3B | **0.5072** | 0.7206 | **0.4288** | 0.4416 | 0.6049 | 0.54062 |
-
- - the number of few shot examples = 100
-
- | Model | \\(n_{parameters}\\) | boolq (F1) | copa (F1) | wic (F1) | hellaswag (F1) | sentineg (F1) | average |
- |----------------------------------------------------------------------------------------------|----------------------|------------|------------|------------|----------------|---------------|-------------|
- | [skt/ko-gpt-trinity-1.2B-v0.5](https://huggingface.co/skt/ko-gpt-trinity-1.2B-v0.5) † | 1.2B | 0.3381 | 0.6593 | 0.328 | 0.4187 | 0.3798 | 0.42478 |
- | [kakaobrain/kogpt](https://huggingface.co/kakaobrain/kogpt) * | 6.0B | 0.4755 | **0.7468** | 0.4225 | **0.458** | **0.7081** | **0.56218** |
- | [EleutherAI/polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) (ours) | 1.3B | **0.4981** | 0.7343 | **0.4329** | 0.426 | 0.5948 | 0.53722 |
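For reference, the `average` column in the removed tables appears to be the unweighted mean of the five per-task F1 scores. A quick sanity check in Python, using the 1-shot row for skt/ko-gpt-trinity-1.2B-v0.5 (values taken from the table above):

```python
# Per-task F1 scores from the 1-shot row for skt/ko-gpt-trinity-1.2B-v0.5.
scores = {
    "boolq": 0.4243,
    "copa": 0.6773,
    "wic": 0.328,
    "hellaswag": 0.4178,
    "sentineg": 0.5587,
}

# Unweighted mean over the five KOBEST tasks.
average = sum(scores.values()) / len(scores)
print(round(average, 5))  # matches the reported 0.48122
```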
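The KOBEST tasks above are classification tasks, and the F1 reported here is, as far as we can tell, macro-averaged over classes. A minimal pure-Python sketch of macro F1 for illustration (a real evaluation would typically use `sklearn.metrics.f1_score` with `average="macro"`):

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute F1 per class, then take the unweighted mean."""
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Example: one mislabeled item out of four; rounds to 0.7333.
print(round(macro_f1([0, 0, 1, 1], [0, 1, 1, 1]), 4))
```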
+ - COPA (F1)
 
+ | Model | params | 0-shot | 5-shot | 10-shot | 50-shot |
+ |----------------------------------------------------------------------------------------------|--------|--------|--------|---------|---------|
+ | [skt/ko-gpt-trinity-1.2B-v0.5](https://huggingface.co/skt/ko-gpt-trinity-1.2B-v0.5) &dagger; | 1.2B | 0.6696 | 0.6477 | 0.6419 | 0.6514 |
+ | [kakaobrain/kogpt](https://huggingface.co/kakaobrain/kogpt) &ast; | 6.0B | 0.7345 | 0.7287 | 0.7277 | 0.7479 |
+ | [EleutherAI/polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) (ours) | 1.3B | 0.7196 | 0.7193 | 0.7204 | 0.7206 |
+ | [EleutherAI/polyglot-ko-2.7b](https://huggingface.co/EleutherAI/polyglot-ko-2.7b) (ours) | 2.7B | **0.7595** | **0.7608** | **0.7638** | **0.7788** |
 
+ <img src="https://user-images.githubusercontent.com/19511788/191888663-60ede34c-17cd-4234-b200-5ad8b72c66f6.png" width="800px">
 
+ - HellaSwag (F1)
 
+ | Model | params | 0-shot | 5-shot | 10-shot | 50-shot |
+ |----------------------------------------------------------------------------------------------|--------|--------|--------|---------|---------|
+ | [skt/ko-gpt-trinity-1.2B-v0.5](https://huggingface.co/skt/ko-gpt-trinity-1.2B-v0.5) &dagger; | 1.2B | 0.4036 | 0.4 | 0.4011 | 0.4214 |
+ | [kakaobrain/kogpt](https://huggingface.co/kakaobrain/kogpt) &ast; | 6.0B | **0.4599** | 0.456 | 0.4616 | 0.4754 |
+ | [EleutherAI/polyglot-ko-1.3b](https://huggingface.co/EleutherAI/polyglot-ko-1.3b) (ours) | 1.3B | 0.4013 | 0.3984 | 0.417 | 0.4416 |
+ | [EleutherAI/polyglot-ko-2.7b](https://huggingface.co/EleutherAI/polyglot-ko-2.7b) (ours) | 2.7B | 0.4438 | **0.4786** | **0.4737** | **0.4822** |
 
+ <img src="https://user-images.githubusercontent.com/19511788/191888673-dc7643f3-5ffe-4f85-8a8a-d358a0f8a9c0.png" width="800px">
 
   <p><strong>&dagger;</strong> The model card for this model reports evaluation results on the KOBEST dataset, but when we evaluated the model with the prompts described in the paper, we could not reproduce those numbers. Checking the KOBEST paper, we found that the model card's figures are close to the fine-tuning results reported there. Because we evaluate by prompt-based generation without fine-tuning, our results may differ from those in the model card.</p>
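As the footnote notes, these numbers come from prompt-based generation with in-context examples rather than fine-tuning. A minimal sketch of how a k-shot prompt can be assembled; the template and example texts here are illustrative, not the actual KOBEST prompts described in the paper:

```python
def build_few_shot_prompt(shots, query, template="{text}\t{label}"):
    """Concatenate k labeled in-context examples, then the unlabeled query.

    `shots` is a list of (text, label) pairs; the model is expected to
    continue the prompt with the query's label.
    """
    demos = [template.format(text=t, label=l) for t, l in shots]
    # Leave the label slot of the query empty for the model to fill in.
    demos.append(template.format(text=query, label="").rstrip())
    return "\n".join(demos)

# Hypothetical 2-shot sentiment prompt (not a real KOBEST instance).
shots = [("great movie", "positive"), ("terrible plot", "negative")]
prompt = build_few_shot_prompt(shots, "what a masterpiece")
```

Varying the length of `shots` (1, 5, 10, 50, ...) is what produces the different few-shot settings compared in the tables above.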