| MODEL | HellaSwag | EQ_Bench | % Parsed (EQ) |
|---|---|---|---|
| athirdpath/NSFW_DPO_vmgb-7b | 85.36 | 74.83 | 100 |
| FallenMerick/Iced-Lemon-Cookie-7B | 85.54 | 71.54 | 100 |
| FallenMerick/Smart-Lemon-Cookie-7B | 85.41 | 68.12 | 100 |
| Intel/neural-chat-7b-v3-1 | 79.76 | 62.26 | 100 |
| jondurbin/airoboros-m-7b-3.1.2 | 81.34 | 38.52 | 100 |
| jondurbin/cinematika-7b-v0.1 | 80.31 | 44.85 | 100 |
| KoboldAI/Mistral-7B-Erebus-v3 | 76.65 | 18.19 | 97.66 |
| KoboldAI/Mistral-7B-Holodeck-1 | 79.19 | 2.10 | 98.25 |
| migtissera/Synthia-7B-v3.0 | 81.74 | 15.03 | 94.74 |
| mlabonne/NeuralBeagle14-7B | 86.46 | 74.21 | 99.42 |
| NousResearch/Hermes-2-Pro-Mistral-7B | 80.56 | 65.93 | 100 |
| Open-Orca/Mistral-7B-OpenOrca | 81.67 | 63.98 | 99.42 |
| rwitz/go-bruins | 84.92 | 73.62 | 100 |
| SanjiWatsuki/Kunoichi-7B | 85.25 | 72.36 | 100 |
| senseable/WestLake-7B-v2 | 87.42 | 77.87 | 100 |
| teknium/OpenHermes-2.5-Mistral-7B | 81.68 | 65.75 | 100 |
| Undi95/Toppy-M-7B | 83.52 | 66.57 | 100 |

| MODEL | HellaSwag | EQ_Bench | % Parsed (EQ) |
|---|---|---|---|
| ABX-AI/Silver-Sun-v2-11B | 86.40 | 69.92 | 100 |
| backyardai/Fimbulvetr-Holodeck-Erebus-Westlake-10.7B | 86.00 | 69.25 | 100 |
| BlueNipples/SnowLotus-v2-10.7B | 83.42 | 60.54 | 99.42 |
| FallenMerick/Chewy-Lemon-Cookie-11B | 84.39 | 76.24 | 100 |
| FallenMerick/Chunky-Lemon-Cookie-11B | 84.36 | 76.29 | 100 |
| froggeric/WestLake-10.7B-v2 | 86.74 | 73.35 | 95.32 |
| Himitsui/KuroMitsu-11B | 86.33 | 70.50 | 98.83 |
| kyujinpy/SOLAR-Platypus-10.7B-v2 | 82.05 | 25.11 | 45.61 |
| migtissera/Tess-10.7B-v1.5b | 83.82 | 61.83 | 99.42 |
| NeverSleep/Mistral-11B-SynthIAirOmniMix | 81.58 | 55.19 | 100 |
| NousResearch/Nous-Hermes-2-SOLAR-10.7B | 83.24 | 63.52 | 100 |
| saishf/Fimbulvetr-Kuro-Lotus-10.7B | 86.25 | 65.85 | 100 |
| Sao10K/Fimbulvetr-10.7B-v1 | 85.81 | 65.42 | 100 |
| Sao10K/Fimbulvetr-11B-v2 | 86.61 | 70.00 | 99.42 |
| Sao10K/Frostwind-10.7B-v1 | 84.15 | 55.73 | 99.42 |
| Sao10K/Solstice-11B-v1 | 86.42 | 68.24 | 99.42 |
| TheDrummer/Moistral-11B-v3 | 86.65 | 69.75 | 99.42 |
| Undi95/Borealis-10.7B | 79.58 | 8.27 | 44.44 |
| upstage/SOLAR-10.7B-Instruct-v1.0 | 86.35 | 68.65 | 98.25 |
| upstage/SOLAR-10.7B-v1.0 | 83.10 | 28.66 | 100 |