QuixiAI/QuixiGR00T-N1.5-3B-Zero


ehartford committed on Jun 27

Commit e01a537 (verified) · 1 parent: e50ecc3

Update README.md


README.md +24 -7

README.md CHANGED

@@ -1,13 +1,30 @@
 ---
 license: apache-2.0
 ---
-# OpenGR00T-N1.5-3B-Zero
-A fully open-source, randomly initialized version of the GR00T-N1.5-3B architecture for humanoid robot control. This model has the exact same architecture as NVIDIA's GR00T-N1.5-3B but with random weights and Apache-2.0 licensing.
 ## Model Description
-OpenGR00T-N1.5-3B-Zero is a Vision-Language-Action (VLA) model designed for humanoid robot control:
 - **Architecture**: Dual-system design with vision-language backbone (Eagle-based with Qwen3 LLM) and diffusion transformer action head
 - **Parameters**: 2,724M total (1,655M backbone in bfloat16, 1,069M action head in float32)
@@ -38,13 +55,13 @@ from transformers import AutoModel, AutoTokenizer
 # Load model
 model = AutoModel.from_pretrained(
-    "OpenGR00T-N1.5-3B-Zero",
     trust_remote_code=True,
     torch_dtype="auto"
 )
 # Load tokenizer
-tokenizer = AutoTokenizer.from_pretrained("OpenGR00T-N1.5-3B-Zero")
 # Move to GPU if available
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
@@ -389,8 +406,8 @@ The model consists of two main components:
 If you use this model in your research, please cite:
 ```bibtex
-@software{opengr00t2024,
-  title={OpenGR00T-N1.5-3B-Zero: Open Source Blank GR00T Architecture},
   author={Community Contributors},
   year={2024},
   license={Apache-2.0}

 ---
 license: apache-2.0
 ---
+# DolphinGR00T-N1.5-3B-Zero
+by Eric Hartford
+I love GR00T but NVidia's license - Tsk-tsk, no no no, that won't do at all.
+The world - our future - deserves a high quality permissively licensed robot control model.
+This rep contains a fully open-source Apache 2.0 licensed, randomly initialized version of the GR00T-N1.5-3B architecture for humanoid robot control. This model has the exact same architecture as NVIDIA's GR00T-N1.5-3B but with random weights.
+I created this model using [this script](init_DolphinGR00T_zero.py)
+The purpose is to distill GR00T into an Apache-2.0 licensed version.
+The whole job looks like this:
+1) make an Apache 2.0 licensed "blank slate" with the right shape (this repo)
+2) Track down the sub-components that are Apache 2.0, and bring those weights in.  (qwen3-1.7b, for instance, is used as the language tower.)
+3) missing components - find some initialization that's better than "random" - like merging from similar models into the correct shape.
+4) distill GR00T onto it.
 ## Model Description
+DolphinGR00T-N1.5-3B-Zero is a Vision-Language-Action (VLA) model designed for humanoid robot control:
 - **Architecture**: Dual-system design with vision-language backbone (Eagle-based with Qwen3 LLM) and diffusion transformer action head
 - **Parameters**: 2,724M total (1,655M backbone in bfloat16, 1,069M action head in float32)
 # Load model
 model = AutoModel.from_pretrained(
+    "DolphinGR00T-N1.5-3B-Zero",
     trust_remote_code=True,
     torch_dtype="auto"
 )
 # Load tokenizer
+tokenizer = AutoTokenizer.from_pretrained("DolphinGR00T-N1.5-3B-Zero")
 # Move to GPU if available
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 If you use this model in your research, please cite:
 ```bibtex
+@software{DolphinGR00T2024,
+  title={DolphinGR00T-N1.5-3B-Zero: Open Source Blank GR00T Architecture},
   author={Community Contributors},
   year={2024},
   license={Apache-2.0}