Update README.md
README.md
CHANGED
@@ -44,35 +44,35 @@ Complete end-to-end training framework designed for maximum efficiency:
 ## Evaluation Results
 All evaluations were conducted using [lmms_eval](https://github.com/EvolvingLMMs-Lab/lmms-eval).
 
-| | **LLaVA-OV-1.5-8B** | **Qwen2.5 VL 7B** |
-
-| MMMU (Validation) | **55.44** | 51.33 |
-| MMMU-Pro (Standard) | **37.40** | 36.30 |
-| MMMU-Pro (Vision) | 25.15 | **32.83** |
-| MMBench (English; Test) | **84.14** | 83.40 |
-| MMBench (Chinese; Test) | 81.00 | **81.61** |
-| MME-RealWorld (English) | **62.31** | 57.33 |
-| MME-RealWorld (Chinese) | **56.11** | 51.50 |
-| AI2D (With Mask) | **84.16** | 82.58 |
-| AI2D (Without Mask) | **94.11** | 93.36 |
-| CV-Bench | **80.82** | 79.95 |
-| VL-RewardBench | 45.90 | **49.65** |
-| V* | **78.01** | 76.96 |
-| PixmoCount | 62.19 | **63.33** |
-| CountBench | **88.19** | 86.35 |
-| ChartQA | **86.48** | 84.08 |
-| CharXiv (Direct Questions) | **74.10** | 69.80 |
-| DocVQA (Test) | **95.00** | 94.93 |
-| InfoVQA (Test) | 78.42 | **81.67** |
-| WeMath | **33.62** | 33.33 |
-| MathVista (Mini) | **69.57** | 68.60 |
-| MathVision | **25.56** | 22.37 |
-| MMStar | **67.72** | 62.54 |
-| SEED-Bench (Image) | 77.32 | **77.53** |
-| ScienceQA | **94.98** | 88.75 |
-| SEED-Bench 2-Plus | 69.21 | **70.93** |
-| OCRBench | 82.90 | **84.20** |
-| RealWorldQA | 68.10 | **68.50** |
+| | **LLaVA-OV-1.5-8B** | **Qwen2.5 VL 7B** |
+|:----------------------------------|:---------------:|:-------------:|
+| MMMU (Validation) | **55.44** | 51.33 |
+| MMMU-Pro (Standard) | **37.40** | 36.30 |
+| MMMU-Pro (Vision) | 25.15 | **32.83** |
+| MMBench (English; Test) | **84.14** | 83.40 |
+| MMBench (Chinese; Test) | 81.00 | **81.61** |
+| MME-RealWorld (English) | **62.31** | 57.33 |
+| MME-RealWorld (Chinese) | **56.11** | 51.50 |
+| AI2D (With Mask) | **84.16** | 82.58 |
+| AI2D (Without Mask) | **94.11** | 93.36 |
+| CV-Bench | **80.82** | 79.95 |
+| VL-RewardBench | 45.90 | **49.65** |
+| V* | **78.01** | 76.96 |
+| PixmoCount | 62.19 | **63.33** |
+| CountBench | **88.19** | 86.35 |
+| ChartQA | **86.48** | 84.08 |
+| CharXiv (Direct Questions) | **74.10** | 69.80 |
+| DocVQA (Test) | **95.00** | 94.93 |
+| InfoVQA (Test) | 78.42 | **81.67** |
+| WeMath | **33.62** | 33.33 |
+| MathVista (Mini) | **69.57** | 68.60 |
+| MathVision | **25.56** | 22.37 |
+| MMStar | **67.72** | 62.54 |
+| SEED-Bench (Image) | 77.32 | **77.53** |
+| ScienceQA | **94.98** | 88.75 |
+| SEED-Bench 2-Plus | 69.21 | **70.93** |
+| OCRBench | 82.90 | **84.20** |
+| RealWorldQA | 68.10 | **68.50** |
 
 ### Using 🤗 Transformers to Chat
 Here we show a code snippet to show you how to use the chat model with `transformers` and `qwen_vl_utils`: