Tags: Image-to-3D · Diffusers · Safetensors · SpatialGenDiffusionPipeline
Commit 3c6ff63 · verified · 1 parent: cdee095
bertjiazheng, nielsr (HF Staff) committed

Update pipeline tag, add library name, and expand content (#1)

Co-authored-by: Niels Rogge <[email protected]>

Files changed (1): README.md (+35 −13)
README.md CHANGED
@@ -1,12 +1,14 @@
 ---
-license: creativeml-openrail-m
-datasets:
-- manycore-research/SpatialGen-Testset
 base_model:
 - stabilityai/stable-diffusion-2-1
-pipeline_tag: image-to-image
+datasets:
+- manycore-research/SpatialGen-Testset
+license: creativeml-openrail-m
+pipeline_tag: image-to-3d
+library_name: diffusers
 ---
-# SpatialGen
+
+# SpatialGen: Layout-guided 3D Indoor Scene Generation
 
 <!-- markdownlint-disable first-line-h1 -->
 <!-- markdownlint-disable html -->
@@ -32,25 +34,32 @@ pipeline_tag: image-to-image
 
 <div align="center">
 
-| Image-to-Scene Results | Text-to-Scene Results |
-| :--------------------: | :-------------------: |
+| Image-to-Scene Results | Text-to-Scene Results |
+| :--------------------------------------: | :----------------------------------------: |
 | ![Img2Scene](https://cdn-uploads.huggingface.co/production/uploads/6437c0ead38ce48bdd4b0067/ksN5t8QEu3Iv6KhpsYsk6.png) | ![Text2Scene](https://cdn-uploads.huggingface.co/production/uploads/6437c0ead38ce48bdd4b0067/waCRa3kp01KAsKgmqS1bb.png) |
 
-<p>SpatialGen produces multi-view, multi-modal information from a semantic layout using a multi-view, multi-modal diffusion model.</p>
+<p>TL;DR: Given a 3D semantic layout, SpatialGen can generate a 3D indoor scene conditioned on either a reference image (left) or a textual description (right) using a multi-view, multi-modal diffusion model.</p>
 </div>
 
 ## ✨ News
 
 - [Aug, 2025] Initial release of SpatialGen-1.0!
+- [Sep, 2025] We release the paper of SpatialGen!
+
+## 📋 Release Plan
 
+- [x] Provide inference code of SpatialGen.
+- [ ] Provide training instruction for SpatialGen.
+- [ ] Release SpatialGen dataset.
 
 ## SpatialGen Models
 
 <div align="center">
 
-| **Model** | **Download** |
-| :-------------: | -------------------------------------------------------------------------- |
-| SpatialGen-1.0 | [🤗 HuggingFace](https://huggingface.co/manycore-research/SpatialGen-1.0) |
+| **Model** | **Download** |
+| :----------------------: | ----------------------------------------------------------------------------------- |
+| SpatialGen-1.0 | [🤗 HuggingFace](https://huggingface.co/manycore-research/SpatialGen-1.0) |
+| FLUX.1-Layout-ControlNet | [🤗 HuggingFace](https://huggingface.co/manycore-research/FLUX.1-Layout-ControlNet) |
 
 </div>
 
@@ -87,15 +96,28 @@ We provide [SpatialGen-Testset](https://huggingface.co/datasets/manycore-researc
 bash scripts/infer_spatialgen_i2s.sh
 
 # Text-to-image-to-3D Scene
+# in captions/spatialgen_testset_captions.jsonl, we provide text prompts of different styles for each room,
+# choose a pair of scene_id and prompt to run the text2scene experiment
 bash scripts/infer_spatialgen_t2s.sh
 ```
 
 ## License
 
-[SpatialGen-1.0](https://huggingface.co/manycore-research/SpatialGen-1.0) is derived from [Stable-Diffusion-v2.1](https://github.com/Stability-AI/stablediffusion), which is licensed under the [CreativeML Open RAIL++-M License](https://github.com/Stability-AI/stablediffusion/blob/main/LICENSE-MODEL).
+[SpatialGen-1.0](https://huggingface.co/manycore-research/SpatialGen-1.0) is derived from [Stable-Diffusion-v2.1](https://github.com/Stability-AI/stablediffusion), which is licensed under the [CreativeML Open RAIL++-M License](https://github.com/Stability-AI/stablediffusion/blob/main/LICENSE-MODEL). [FLUX.1-Layout-ControlNet](https://huggingface.co/manycore-research/FLUX.1-Layout-ControlNet) is licensed under the [FLUX.1-dev Non-Commercial License](https://github.com/black-forest-labs/flux/blob/main/model_licenses/LICENSE-FLUX1-dev).
 
 ## Acknowledgements
 
 We would like to thank the following projects that made this work possible:
 
-[DiffSplat](https://github.com/chenguolin/DiffSplat) | [SD 2.1](https://github.com/Stability-AI/stablediffusion) | [TAESD](https://github.com/madebyollin/taesd) | [SpatialLM](https://github.com/manycore-research/SpatialLM)
+[DiffSplat](https://github.com/chenguolin/DiffSplat) | [SD 2.1](https://github.com/Stability-AI/stablediffusion) | [TAESD](https://github.com/madebyollin/taesd) | [FLUX](https://github.com/black-forest-labs/flux/) | [SpatialLM](https://github.com/manycore-research/SpatialLM)
+
+## Citation
+
+```bibtex
+@article{wu2024spatialgen,
+  title={SPATIALGEN: Layout-guided 3D Indoor Scene Generation},
+  author={Zhenqing Wu and Zhenxiong Tan and Guolin Chen and Wenbo Zhao and Xingyi Yang and Xiaofeng Wang and Jianmin Li and Bo Dai and Dahua Lin and Xinchao Wang},
+  journal={arXiv preprint arXiv:2509.14981},
+  year={2025}
+}
+```
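The new inference comments say that `captions/spatialgen_testset_captions.jsonl` provides text prompts in several styles per room, and that running the text-to-scene script requires choosing one `scene_id`/`prompt` pair. A minimal sketch of how such a JSONL file might be read and a pair selected — note the field names `scene_id`, `style`, and `prompt`, and the sample records, are assumptions for illustration, not confirmed by the repository:

```python
import json
import tempfile

# Hypothetical sample mirroring captions/spatialgen_testset_captions.jsonl;
# the field names are an assumption, not taken from the actual dataset.
sample = [
    {"scene_id": "room_0001", "style": "modern", "prompt": "A modern living room with a gray sofa."},
    {"scene_id": "room_0001", "style": "rustic", "prompt": "A rustic living room with wooden beams."},
    {"scene_id": "room_0002", "style": "modern", "prompt": "A minimalist bedroom with large windows."},
]


def load_captions(path):
    """Read one JSON object per line and group the entries by scene_id."""
    by_scene = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            by_scene.setdefault(entry["scene_id"], []).append(entry)
    return by_scene


# Write the sample to a temp file so the sketch is runnable end to end.
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
    for entry in sample:
        f.write(json.dumps(entry) + "\n")
    path = f.name

captions = load_captions(path)

# Pick one scene_id/prompt pair, here by preferred style.
scene_id = "room_0001"
chosen = next(e for e in captions[scene_id] if e["style"] == "rustic")
print(scene_id, "->", chosen["prompt"])
```

The selected pair would then be handed to `scripts/infer_spatialgen_t2s.sh`, e.g. by editing the script's variables, per the commit's instruction to "choose a pair of scene_id and prompt".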