ehartford committed on
Commit
e01a537
·
verified ·
1 Parent(s): e50ecc3

Update README.md

Files changed (1)
  1. README.md +24 -7
README.md CHANGED
@@ -1,13 +1,30 @@
1
  ---
2
  license: apache-2.0
3
  ---
4
- # OpenGR00T-N1.5-3B-Zero
5
 
6
- A fully open-source, randomly initialized version of the GR00T-N1.5-3B architecture for humanoid robot control. This model has the exact same architecture as NVIDIA's GR00T-N1.5-3B but with random weights and Apache-2.0 licensing.
7
 
8
  ## Model Description
9
 
10
- OpenGR00T-N1.5-3B-Zero is a Vision-Language-Action (VLA) model designed for humanoid robot control:
11
 
12
  - **Architecture**: Dual-system design with vision-language backbone (Eagle-based with Qwen3 LLM) and diffusion transformer action head
13
  - **Parameters**: 2,724M total (1,655M backbone in bfloat16, 1,069M action head in float32)
@@ -38,13 +55,13 @@ from transformers import AutoModel, AutoTokenizer
38
 
39
  # Load model
40
  model = AutoModel.from_pretrained(
41
- "OpenGR00T-N1.5-3B-Zero",
42
  trust_remote_code=True,
43
  torch_dtype="auto"
44
  )
45
 
46
  # Load tokenizer
47
- tokenizer = AutoTokenizer.from_pretrained("OpenGR00T-N1.5-3B-Zero")
48
 
49
  # Move to GPU if available
50
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
@@ -389,8 +406,8 @@ The model consists of two main components:
389
  If you use this model in your research, please cite:
390
 
391
  ```bibtex
392
- @software{opengr00t2024,
393
- title={OpenGR00T-N1.5-3B-Zero: Open Source Blank GR00T Architecture},
394
  author={Community Contributors},
395
  year={2024},
396
  license={Apache-2.0}
 
1
  ---
2
  license: apache-2.0
3
  ---
4
+ # DolphinGR00T-N1.5-3B-Zero
5
 
6
+ by Eric Hartford
7
+
8
+ I love GR00T, but NVIDIA's license? Tsk-tsk, no no no, that won't do at all.
9
+
10
+ The world, and our future, deserves a high-quality, permissively licensed robot control model.
11
+
12
+ This repo contains a fully open-source, Apache-2.0-licensed, randomly initialized version of the GR00T-N1.5-3B architecture for humanoid robot control. This model has the exact same architecture as NVIDIA's GR00T-N1.5-3B but with random weights.
13
+
14
+ I created this model using [this script](init_DolphinGR00T_zero.py).
15
+
16
+ The purpose is to distill GR00T into an Apache-2.0 licensed version.
17
+
18
+ The whole job looks like this:
19
+
20
+ 1) Make an Apache-2.0-licensed "blank slate" with the right shape (this repo).
21
+ 2) Track down the sub-components that are already Apache 2.0 licensed and bring those weights in. (Qwen3-1.7B, for instance, is used as the language tower.)
22
+ 3) For the missing components, find an initialization that's better than random, such as merging weights from similar models into the correct shape.
23
+ 4) Distill GR00T onto it.
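
Step 2 of the plan above amounts to deciding which blank-slate tensors a permissively licensed donor checkpoint can fill. The following is a minimal sketch of that bookkeeping, not the script used for this repo: the function `plan_transplant`, the prefix `backbone.llm.`, and every parameter name and shape are hypothetical and chosen only for illustration. It compares names and shapes and copies no weights.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def plan_transplant(
    target: Dict[str, Shape],
    donor: Dict[str, Shape],
    target_prefix: str = "",
) -> Tuple[List[str], List[str]]:
    """Split target parameters into those a donor can fill and those it cannot.

    A target parameter is copyable when, after stripping `target_prefix`,
    the donor has a parameter with the same name and an identical shape.
    """
    copyable: List[str] = []
    remaining: List[str] = []
    for name, shape in target.items():
        if name.startswith(target_prefix):
            donor_name = name[len(target_prefix):]
        else:
            donor_name = None
        if donor_name is not None and donor.get(donor_name) == shape:
            copyable.append(name)
        else:
            remaining.append(name)
    return copyable, remaining


# Hypothetical parameter names and shapes, for illustration only.
target_shapes = {
    "backbone.llm.embed.weight": (1000, 64),
    "backbone.llm.layer0.weight": (64, 64),
    "action_head.proj.weight": (32, 64),
}
donor_shapes = {
    "embed.weight": (1000, 64),
    "layer0.weight": (64, 64),
}

copyable, remaining = plan_transplant(target_shapes, donor_shapes, "backbone.llm.")
print("copy from donor:", copyable)
print("still needs init or distillation:", remaining)
```

Anything left in `remaining` is what steps 3 and 4 would have to cover, either through a better-than-random initialization or through distillation.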
24
 
25
  ## Model Description
26
 
27
+ DolphinGR00T-N1.5-3B-Zero is a Vision-Language-Action (VLA) model designed for humanoid robot control:
28
 
29
  - **Architecture**: Dual-system design with vision-language backbone (Eagle-based with Qwen3 LLM) and diffusion transformer action head
30
  - **Parameters**: 2,724M total (1,655M backbone in bfloat16, 1,069M action head in float32)
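
As a quick sanity check on the parameter figures above, the raw weight footprint follows from the stated counts and dtypes. This sketch assumes only the standard sizes of 2 bytes per bfloat16 value and 4 bytes per float32 value; the helper `weight_gigabytes` is a hypothetical name, and the result ignores activations, optimizer state, and any framework overhead.

```python
# Parameter counts (in millions) as stated in the model description.
BACKBONE_M = 1655     # stored in bfloat16
ACTION_HEAD_M = 1069  # stored in float32

BYTES_PER_PARAM = {"bfloat16": 2, "float32": 4}


def weight_gigabytes(params_millions: int, dtype: str) -> float:
    """Return raw weight size in GB (10**9 bytes) for a count given in millions."""
    return params_millions * 1e6 * BYTES_PER_PARAM[dtype] / 1e9


backbone_gb = weight_gigabytes(BACKBONE_M, "bfloat16")
head_gb = weight_gigabytes(ACTION_HEAD_M, "float32")
total_params_m = BACKBONE_M + ACTION_HEAD_M

print(f"total parameters: {total_params_m}M")
print(f"backbone weights: {backbone_gb:.2f} GB")
print(f"action head weights: {head_gb:.2f} GB")
print(f"all weights: {backbone_gb + head_gb:.2f} GB")
```

The two counts sum to the stated 2,724M. Although the action head has fewer parameters than the backbone, its float32 storage makes it the larger share of the bytes (roughly 4.3 GB versus 3.3 GB).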
 
55
 
56
  # Load model
57
  model = AutoModel.from_pretrained(
58
+ "DolphinGR00T-N1.5-3B-Zero",
59
  trust_remote_code=True,
60
  torch_dtype="auto"
61
  )
62
 
63
  # Load tokenizer
64
+ tokenizer = AutoTokenizer.from_pretrained("DolphinGR00T-N1.5-3B-Zero")
65
 
66
  # Move to GPU if available
67
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 
406
  If you use this model in your research, please cite:
407
 
408
  ```bibtex
409
+ @software{DolphinGR00T2024,
410
+ title={DolphinGR00T-N1.5-3B-Zero: Open Source Blank GR00T Architecture},
411
  author={Community Contributors},
412
  year={2024},
413
  license={Apache-2.0}