Tags: Image-to-3D · Diffusers · Safetensors · SpatialGenDiffusionPipeline
Commit 3c6ff63 · verified · 1 parent: cdee095
bertjiazheng, nielsr (HF Staff) committed

Update pipeline tag, add library name, and expand content (#1)

Co-authored-by: Niels Rogge <[email protected]>

Files changed (1): README.md (+35 −13)
README.md CHANGED
@@ -1,12 +1,14 @@
 ---
-license: creativeml-openrail-m
-datasets:
-- manycore-research/SpatialGen-Testset
 base_model:
 - stabilityai/stable-diffusion-2-1
-pipeline_tag: image-to-image
+datasets:
+- manycore-research/SpatialGen-Testset
+license: creativeml-openrail-m
+pipeline_tag: image-to-3d
+library_name: diffusers
 ---
-# SpatialGen
+
+# SpatialGen: Layout-guided 3D Indoor Scene Generation
 
 <!-- markdownlint-disable first-line-h1 -->
 <!-- markdownlint-disable html -->
@@ -32,25 +34,32 @@ pipeline_tag: image-to-image
 
 <div align="center">
 
-| Image-to-Scene Results | Text-to-Scene Results |
-| :--------------------: | :-------------------: |
+| Image-to-Scene Results | Text-to-Scene Results |
+| :--------------------------------------: | :----------------------------------------: |
 | ![Img2Scene](https://cdn-uploads.huggingface.co/production/uploads/6437c0ead38ce48bdd4b0067/ksN5t8QEu3Iv6KhpsYsk6.png) | ![Text2Scene](https://cdn-uploads.huggingface.co/production/uploads/6437c0ead38ce48bdd4b0067/waCRa3kp01KAsKgmqS1bb.png) |
 
-<p>SpatialGen produces multi-view, multi-modal information from a semantic layout using a multi-view, multi-modal diffusion model.</p>
+<p>TL;DR: Given a 3D semantic layout, SpatialGen can generate a 3D indoor scene conditioned on either a reference image (left) or a textual description (right) using a multi-view, multi-modal diffusion model.</p>
 </div>
 
 ## ✨ News
 
 - [Aug, 2025] Initial release of SpatialGen-1.0!
+- [Sep, 2025] We release the paper of SpatialGen!
+
+## 📋 Release Plan
 
+- [x] Provide inference code of SpatialGen.
+- [ ] Provide training instruction for SpatialGen.
+- [ ] Release SpatialGen dataset.
 
 ## SpatialGen Models
 
 <div align="center">
 
-| **Model** | **Download** |
-| :-------------: | -------------------------------------------------------------------------- |
-| SpatialGen-1.0 | [🤗 HuggingFace](https://huggingface.co/manycore-research/SpatialGen-1.0) |
+| **Model** | **Download** |
+| :----------------------: | ----------------------------------------------------------------------------------- |
+| SpatialGen-1.0 | [🤗 HuggingFace](https://huggingface.co/manycore-research/SpatialGen-1.0) |
+| FLUX.1-Layout-ControlNet | [🤗 HuggingFace](https://huggingface.co/manycore-research/FLUX.1-Layout-ControlNet) |
 
 </div>
 
@@ -87,15 +96,28 @@ We provide [SpatialGen-Testset](https://huggingface.co/datasets/manycore-researc
 bash scripts/infer_spatialgen_i2s.sh
 
 # Text-to-image-to-3D Scene
+# in captions/spatialgen_testset_captions.jsonl, we provide text prompts of different styles for each room,
+# choose a pair of scene_id and prompt to run the text2scene experiment
 bash scripts/infer_spatialgen_t2s.sh
 ```
 
 ## License
 
-[SpatialGen-1.0](https://huggingface.co/manycore-research/SpatialGen-1.0) is derived from [Stable-Diffusion-v2.1](https://github.com/Stability-AI/stablediffusion), which is licensed under the [CreativeML Open RAIL++-M License](https://github.com/Stability-AI/stablediffusion/blob/main/LICENSE-MODEL).
+[SpatialGen-1.0](https://huggingface.co/manycore-research/SpatialGen-1.0) is derived from [Stable-Diffusion-v2.1](https://github.com/Stability-AI/stablediffusion), which is licensed under the [CreativeML Open RAIL++-M License](https://github.com/Stability-AI/stablediffusion/blob/main/LICENSE-MODEL). [FLUX.1-Layout-ControlNet](https://huggingface.co/manycore-research/FLUX.1-Layout-ControlNet) is licensed under the [FLUX.1-dev Non-Commercial License](https://github.com/black-forest-labs/flux/blob/main/model_licenses/LICENSE-FLUX1-dev).
 
 ## Acknowledgements
 
 We would like to thank the following projects that made this work possible:
 
-[DiffSplat](https://github.com/chenguolin/DiffSplat) | [SD 2.1](https://github.com/Stability-AI/stablediffusion) | [TAESD](https://github.com/madebyollin/taesd) | [SpatialLM](https://github.com/manycore-research/SpatialLM)
+[DiffSplat](https://github.com/chenguolin/DiffSplat) | [SD 2.1](https://github.com/Stability-AI/stablediffusion) | [TAESD](https://github.com/madebyollin/taesd) | [FLUX](https://github.com/black-forest-labs/flux/) | [SpatialLM](https://github.com/manycore-research/SpatialLM)
+
+## Citation
+
+```bibtex
+@article{wu2024spatialgen,
+  title={SPATIALGEN: Layout-guided 3D Indoor Scene Generation},
+  author={Zhenqing Wu and Zhenxiong Tan and Guolin Chen and Wenbo Zhao and Xingyi Yang and Xiaofeng Wang and Jianmin Li and Bo Dai and Dahua Lin and Xinchao Wang},
+  journal={arXiv preprint arXiv:2509.14981},
+  year={2025}
+}
+```
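The new inference comments say that `captions/spatialgen_testset_captions.jsonl` provides text prompts in several styles per room, and that running the text-to-scene script requires choosing one `scene_id`/`prompt` pair. A minimal sketch of how such a JSONL file might be read and a pair selected — note the field names `scene_id`, `style`, and `prompt`, and the sample records, are assumptions for illustration, not confirmed by the repository:

```python
import json
import tempfile

# Hypothetical sample mirroring captions/spatialgen_testset_captions.jsonl;
# the field names are an assumption, not taken from the actual dataset.
sample = [
    {"scene_id": "room_0001", "style": "modern", "prompt": "A modern living room with a gray sofa."},
    {"scene_id": "room_0001", "style": "rustic", "prompt": "A rustic living room with wooden beams."},
    {"scene_id": "room_0002", "style": "modern", "prompt": "A minimalist bedroom with large windows."},
]


def load_captions(path):
    """Read one JSON object per line and group the entries by scene_id."""
    by_scene = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            by_scene.setdefault(entry["scene_id"], []).append(entry)
    return by_scene


# Write the sample to a temp file so the sketch is runnable end to end.
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
    for entry in sample:
        f.write(json.dumps(entry) + "\n")
    path = f.name

captions = load_captions(path)

# Pick one scene_id/prompt pair, here by preferred style.
scene_id = "room_0001"
chosen = next(e for e in captions[scene_id] if e["style"] == "rustic")
print(scene_id, "->", chosen["prompt"])
```

The selected pair would then be handed to `scripts/infer_spatialgen_t2s.sh`, e.g. by editing the script's variables, per the commit's instruction to "choose a pair of scene_id and prompt".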