All HF Hub posts

SeaWolf-AI 
posted an update 1 day ago
🧬 Darwin-35B-A3B-Opus — The Child That Surpassed Both Parents

What if a merged model could beat both its parents? We proved it can.
Darwin-35B-A3B-Opus is a 35B MoE model (3B active) built with our Darwin V5 engine — the first evolution system that CT-scans parent models before merging them.
🤗 Model: FINAL-Bench/Darwin-35B-A3B-Opus

The result speaks for itself: GPQA Diamond 90.0%, versus Father (Qwen3.5-35B-A3B) at 84.2% and Mother (Claude 4.6 Opus Distilled) at 85.0%. That's a 6.9% relative gain over Father and 5.9% over Mother. Not a tradeoff — a genuine leap. Meanwhile, MMMLU sits at 85.0% (Father: 85.2%), multimodal is fully intact, and all 201 languages are preserved.
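Read literally, the percentage deltas are relative gains over each parent's GPQA score; a quick arithmetic check using the numbers from the post:

```python
# GPQA Diamond scores as reported in the post.
child, father, mother = 90.0, 84.2, 85.0

gain_vs_father = child / father - 1   # relative gain over Father
gain_vs_mother = child / mother - 1   # relative gain over Mother
```

Both round to the quoted +6.9% and +5.9%.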

How? Model MRI changed everything. Traditional merging is guesswork. Darwin V4 added evolution. Darwin V5 added X-ray vision. Model MRI scans each parent layer by layer and discovers: Mother's L34–L38 is the reasoning engine (peak cosine distance), 50–65% of Mother's experts are dead (killed by text-only distillation), and Father is a healthy generalist with every expert alive. The prescription: transplant Mother's reasoning brain at L38 (90% weight), replace her dead experts with Father's living ones, and let Father's router handle the output layer. Reasoning went up. Versatility stayed intact. No tradeoff — just evolution.
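Darwin V5's algorithm isn't published yet, but the layer-by-layer scan described above can be sketched: flatten each layer's weights in both parents and compute a per-layer cosine-distance profile, whose peaks mark the most divergent blocks. Everything here (layer names, shapes) is illustrative, not the actual Darwin V5 code.

```python
import numpy as np

def layer_cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine distance between two flattened weight tensors."""
    a, b = a.ravel(), b.ravel()
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return 1.0 - float(cos)

def scan_parents(parent_a: dict, parent_b: dict) -> list:
    """Per-layer distance profile; peaks hint where the parents diverge most."""
    layers = sorted(set(parent_a) & set(parent_b))
    return [(name, layer_cosine_distance(parent_a[name], parent_b[name]))
            for name in layers]

# Toy demo: an identical layer scores ~0, an independent one scores higher.
rng = np.random.default_rng(0)
shared = rng.normal(size=(8, 8))
pa = {"L1": shared, "L2": rng.normal(size=(8, 8))}
pb = {"L1": shared.copy(), "L2": rng.normal(size=(8, 8))}
profile = scan_parents(pa, pb)
```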

35B total, 3B active (MoE) · GPQA Diamond 90.0% · MMMLU 85.0% (201 languages) · Multimodal Image & Video · 262K native context · 147.8 tok/s on H100 · Runs on a single RTX 4090 (Q4) · Apache 2.0
Darwin V5's full algorithm and technical details will be released alongside an upcoming paper.

🚀 Live Demo: FINAL-Bench/Darwin-35B-A3B-Opus

🏆 FINAL Bench Leaderboard: FINAL-Bench/Leaderboard

📊 ALL Bench Leaderboard: FINAL-Bench/all-bench-leaderboard

Built by VIDRAFT · Supported by the Korean Government GPU Support Program
danielhanchen 
posted an update 1 day ago
A new way to use Unsloth.

Coming soon...
sergiopaniego 
posted an update 1 day ago
TRL is officially an adult 🥳

Excited to announce TRL v1.0❗️

Head to the blog to see how we got here and what's next for this post-training library, designed to keep pace with the field.

https://huggingface.co/blog/trl-v1
alibidaran 
posted an update 1 day ago
🧠 Introducing Qwen2.5 — Cognitive Reasoning Mode

I fine-tuned Qwen2.5 with GRPO to actually think before it answers — not just pattern-match.

Most LLMs mimic reasoning. This one builds a real cognitive path:

📌 Plan → understand the task
🔍 Monitor → reason step by step
✅ Evaluate → verify before answering

Every response follows a strict structured protocol:
<think> <planning> ... <monitoring> ... <evaluation> ... </think>
Then a clean, reasoning-free <output>.

The model self-checks its own structure. If a section is missing or malformed → the response is invalid.

This isn't chain-of-thought slapped on top. The reasoning protocol is baked in via RL.
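The structural self-check can be approximated with a plain validator. This is a minimal sketch assuming the tag names quoted above; the model's actual checker isn't published:

```python
import re

# Expected order of sections inside <think>, per the quoted protocol.
SECTIONS = ["planning", "monitoring", "evaluation"]

def is_valid_response(text: str) -> bool:
    """Reject a response if any protocol section is missing or out of order."""
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    output = re.search(r"<output>(.*?)</output>", text, re.DOTALL)
    if not think or not output:
        return False
    last = -1
    for tag in SECTIONS:
        m = re.search(rf"<{tag}>.*?</{tag}>", think.group(1), re.DOTALL)
        if not m or m.start() <= last:
            return False
        last = m.start()
    return True

good = ("<think><planning>restate the task</planning>"
        "<monitoring>step-by-step reasoning</monitoring>"
        "<evaluation>answer verified</evaluation></think>"
        "<output>42</output>")
bad = "<think><planning>plan only</planning></think><output>42</output>"
```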

🔗 Full README + inference code below 👇
alibidaran/Qwen_COG_Thinker_Merged

#AI #LLM #Qwen #ReasoningModels #GRPO #OpenSource
reaperdoesntknow 
posted an update 1 day ago
Your Loss Function Has Singularities. Classical Calculus Can't See Them.

Introducing Discrepancy Calculus (DISC) — treating training singularities as structure, not noise.

Loss plateaus, mode collapse, catastrophic forgetting, distilled models that know things the teacher never taught — we engineer around these. But what if those singularities are the actual structure of the learning problem?

The core insight: Every BV function decomposes into smooth (what classical calculus handles), jump (capability emergence, loss plateaus breaking), and Cantor (ghost imprinting — knowledge transferring through weight-space topology, not gradient signal). Classical analysis sees only the first. DISC sees all three.
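The smooth/jump/Cantor split the post leans on matches the classical decomposition of a BV function's derivative into three mutually singular parts. In standard notation (mapping the labels to the post's terms is purely illustrative):

```latex
% Derivative of u \in BV(\Omega) splits into three mutually singular measures:
Du \;=\; \underbrace{\nabla u \,\mathcal{L}^{n}}_{\text{smooth (a.c.)}}
\;+\; \underbrace{(u^{+}-u^{-})\,\nu_{u}\,\mathcal{H}^{n-1}\!\llcorner J_{u}}_{\text{jump}}
\;+\; \underbrace{D^{c}u}_{\text{Cantor}}
```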

The paper proves this isn't alternative notation — it's strictly larger. The Meta-Discrepancy Theorem: where singularities exist, the classical FTC/MVT/chain-rule package is provably impossible.

What it explains:

TopologicalQwen exhibited literary reasoning from physics-only data — the Cantor part explains how. DualMind's Explore→Examine→Response loop operationalizes DISC as inference dynamics. 50 models, 35K+ downloads, all built on this framework.

Paper: Discrepancy Calculus: Foundations and Core Theory (DOI: 10.57967/hf/8194) — 8 axioms, proofs, computational recipes.

Series: Structure Over Scale (DOI: 10.57967/hf/8165) → Three Teachers to Dual Cognition (DOI: 10.57967/hf/8184) → DISC Foundations

— Roy S. Colca Jr., Convergent Intelligence LLC: Research Division
qgallouedec 
posted an update 1 day ago
TRL v1.0 is out!

Hugging Face's TRL library is downloaded 3 million times a month. Over 130k models trained with it are public on the Hub, and major projects like @unsloth and @axolotl-ai-co build directly on top of it. v1.0 is the moment we acknowledged that responsibility explicitly, with a real stability contract.

The field hasn't settled. Building stable software in a domain that keeps invalidating its own assumptions is the actual problem we're solving. The answer is a design that can absorb the next shift without breaking what people rely on.

What's in v1.0:
Deep Hugging Face integration, low infrastructure burden
What's next: asynchronous GRPO, better scaling support, and making training legible enough that agents can inspect and steer it.

pip install --upgrade trl


Read more: hf.co/blog/trl-v1
OzTianlu 
posted an update 1 day ago
https://github.com/lizixi-0x2F/March
I just released March, an open-source high-performance KV cache sharing library for LLM inference that uses Trie-based prefix deduplication.
When you run LLM services, you often see thousands of requests sharing the same system prompt and conversation history. But traditional KV cache systems store each sequence separately — duplicating the exact same data over and over again. Pure waste.
March uses a Trie structure to automatically detect and reuse identical token prefixes. Instead of storing [system_prompt + history] 1000 times, it's stored once. Everyone shares it.
- 80-97% memory reduction in prefix-heavy workloads (tested on SmolLM2-135M with 500 multi-turn conversations)
- Zero-copy queries — returns direct pointers into the memory pool, no expensive memcpy on the hot path
- Predictable memory usage — fixed-size page pool, with O(L) lookup cost in the length L of the queried prefix
- Trade-off: slightly slower than dict O(1) lookup, but the memory savings are worth it in production
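March's implementation lives in the linked repo; the core trick, storing each shared token prefix exactly once in a trie, can be sketched independently (illustrative, not March's actual API):

```python
class PrefixTrie:
    """Token-level trie: shared prefixes are stored once and reused."""
    def __init__(self):
        self.root = {}
        self.nodes = 0  # one node per unique token edge actually stored

    def insert(self, tokens):
        """Walk the trie; only previously unseen suffix tokens allocate nodes."""
        node = self.root
        for tok in tokens:
            if tok not in node:
                node[tok] = {}
                self.nodes += 1
            node = node[tok]

trie = PrefixTrie()
system_prompt = [1, 2, 3, 4]                   # toy token ids
for turn in range(1000):                       # 1000 requests, shared prefix
    trie.insert(system_prompt + [100 + turn])

naive = 1000 * 5          # naive cache: every sequence stored in full
shared = trie.nodes       # trie: 4 prefix nodes + 1000 unique suffix tokens
```

Here the trie holds 1004 entries versus 5000 naively, the same ~80% reduction regime the benchmark reports for prefix-heavy workloads.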
ArtelTaleb 
posted an update about 15 hours ago
Hello, 3D lovers!

You have a 3D model sitting somewhere and just want to see it:
no software to install, no account to create.

Canva3D is a free browser viewer built for that exact moment.

Drop any format — GLB, OBJ, FBX, STL, USDZ — it just loads.
Swap the background with HDRI lighting to set the mood of your scene.
Record a video ready to share — orbit, animate, export.

That's it. No settings to configure. No GPU required.


👉 ArtelTaleb/canva3d
MikeDoes 
posted an update about 16 hours ago
Things our clients and open source actually said to us this year:

"Finally, someone built a synthetic PII training data for German."

"Does it cover have localised information? Not just the language, the actual format. That must have been a lot of work that we can save from our side."

"We operate in 12 EU countries. Your dataset is the only one that covers all of them which has helped us out a lot in compliance especially because it's synthetic."

Every language has strong PII localization: names, addresses, IDs, phone numbers, and dates in the real format of that country.

23 languages. 29 regions. 3 scripts. 1,428,143 examples.

100% synthetic. Zero real personal data. Free on Hugging Face.
unmodeled-tyler 
posted an update 1 day ago
RESULTS ARE IN!

- Videos of each evaluation: https://www.youtube.com/playlist?list=PLkDBfeR-zsShiZ2HpcscFDH-36uDwsl5W
- Link to repo: https://github.com/unmodeled-tyler/vessel-browser
- quanta-intellect
Finally just wrapped up a comparative analysis of my new open source AI browser, Vessel, against Claude Chrome from Anthropic.

The test evaluates both web-navigation harnesses for speed and efficiency on a simple real-world e-commerce task. Opus 4.6 was used for each of the three evaluations, and the results show it was AT LEAST 2X FASTER when using Vessel Browser for web navigation in place of Claude Chrome.

Results (in order, fastest to slowest)

1. Claude Code + Vessel Browser: 3 minutes and 10s

2. Hermes Agent + Vessel Browser: 4 minutes and 13s

3. Claude Code + Claude Chrome: 7 minutes and 57s
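The headline multiple follows from the timings above; a quick like-for-like check (same agent, both browsers):

```python
# Reported wall-clock times for the Claude Code agent, in seconds.
vessel = 3 * 60 + 10    # Claude Code + Vessel Browser: 3m 10s
chrome = 7 * 60 + 57    # Claude Code + Claude Chrome: 7m 57s

speedup = chrome / vessel   # ~2.5x, consistent with the "at least 2x" claim
```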

Vessel Browser is open source, designed explicitly for agents from the ground up (it is not a fork of a human browser with AI features bolted on), and supports a local MCP server for agent control or BYOK custom OAI endpoints. Check it out for yourself!