wangclnlp committed on
Commit a559cc7 · verified · 1 Parent(s): a9a7dc9

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -59,7 +59,7 @@ We evaluate our model on two challenging reward benchmarks, [RM-Bench](https://g
 
  - Results on the JudgeBench.
 
- | **Model** | **Params.** | **Chat** | **Math** | **Code** | **Safety** | **Overall** |
+ | **Model** | **Params.** | **Knowl.** | **Reason.** | **Math** | **Coding** | **Overall** |
  |:-|-:|:-:|:-:|:-:|:-:|:-:|
  |**LLM-as-a-Judge**||||||
  |GPT-4o |- |50.6 | 54.1 | 75.0 | 59.5 | 59.8 |