Distill Qwen3-Coder-480B-A35B on top of your Qwen3-30B-A3B-Thinking-2507-Deepseek-v3.1-Distill

by NIK2703 - opened 6 days ago


As my tests have shown, this model performs better than the base model on programming tasks, but general thinking models such as Qwen3-30B-A3B-Thinking-2507-Deepseek-v3.1-Distill and gpt-oss-20b often outperform both of them on complex problems. It would be interesting to see a combination of DeepSeek's thinking abilities and Coder-480B's programming skills.
