Update README.md

README.md CHANGED

@@ -9,13 +9,13 @@
 
 </div>
 
-## Model Description
+## 🤖 Model Description
 
 We introduce **WALL-OSS**, an end-to-end embodied foundation model that leverages large-scale multimodal pretraining to achieve (1) embodiment-aware vision-language understanding, (2) strong language-action association, and (3) robust manipulation capability.
 
-Our approach employs a tightly coupled architecture and multi-strategies training curriculum that enables
+Our approach employs a tightly coupled architecture and a multi-strategy training curriculum that enables Unified Cross-Level CoT, seamlessly unifying instruction reasoning, subgoal decomposition, and fine-grained action synthesis within a single differentiable framework.
 Our results show that WALL-OSS attains high success on complex long-horizon manipulations, demonstrates strong instruction-following, understanding, and reasoning capabilities, and outperforms strong baselines, thereby providing a reliable and scalable path from VLMs to embodied foundation models.
 
-## Quick Start
+## 🚀 Quick Start
 
 ### Installation
 
@@ -52,7 +52,7 @@ model = model.to(device).bfloat16()
 # Your inference code here...
 ```
 
-## Supervised Fine-Tuning (SFT)
+## 🎯 Supervised Fine-Tuning (SFT)
 
 For training Wall-X on your robotics datasets, please refer to our comprehensive training guide:
 
@@ -70,7 +70,7 @@ The training process includes:
 bash ./workspace/lerobot_example/run.sh
 ```
 
-## Inference
+## 🔮 Inference
 
 For detailed inference examples and model evaluation:
 
@@ -139,7 +139,7 @@ python ./scripts/draw_openloop_plot.py
 
 **📁 [View all inference scripts](https://github.com/X-Square-Robot/wall-x/tree/main/scripts)**
 
-## Complete Documentation
+## 📚 Complete Documentation
 
 For comprehensive setup, training, and inference instructions:
 
@@ -152,7 +152,7 @@ The repository contains:
 - **Configuration Templates**: Ready-to-use configs for different robot setups
 - **Troubleshooting Guide**: Common issues and solutions
 
-##
+## 📄 Cite Us
 
 If you find WALL-OSS models useful, please cite:
 
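The Quick Start hunk anchors on the line `model = model.to(device).bfloat16()`. A minimal sketch of that device/dtype pattern, using a stand-in `torch.nn.Linear` module since the diff does not show how the actual Wall-X model object is loaded:

```python
import torch

# Choose an accelerator when available; otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in module; the real Wall-X model is loaded elsewhere in the
# Quick Start (not shown in this diff).
model = torch.nn.Linear(4, 2)

# Mirror the README's pattern: move to the device, then cast the
# parameters to bfloat16 for memory-efficient inference.
model = model.to(device).bfloat16()

print(model.weight.dtype)  # torch.bfloat16
```

Casting to bfloat16 halves parameter memory relative to float32 while keeping float32's exponent range, which is why it is a common default for inference on modern GPUs.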