bobig committed on
Commit b81b29d · verified · 1 Parent(s): f2c05d5

Update README.md

Files changed (1):
  1. README.md +25 -0
README.md CHANGED
@@ -5,6 +5,31 @@ tags:
  - mlx
  ---
 
+ 13.7 TPS
+
+ 27.1 TPS with speculative decoding
+
+ MacBook M4 Max: high power
+
+ system prompt: "You are Fuse01. You answer very direct brief and concise"
+
+ prompt: "Write a quick sort in C++"
+
+ Temp: 0
+
+
+ Try this model & quant in Roo Code, starting in Architect Mode and letting it auto-switch to Code Mode... it actually spits out decent code for small projects with multiple files.
+ Almost Claude Sonnet level for small projects. It stays reasonably stable even with Roo Code's huge 10k system prompt. It still shits the bed on big projects, but does better after adding roo-code-memory-bank.
+
+ All the smaller quants I tested shit the bed.
+
+ All the smaller models I tested shit the bed.
+
+ So far (Feb 20, 2025) this is the only model & quant that fits on a MacBook, spits out decent code, AND works with speculative decoding.
+
+
+ Huge thanks to all who helped Macs get this far!
+
  # bobig/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview-Q8
 
  The Model [bobig/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview-Q8](https://huggingface.co/bobig/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview-Q8) was