<p align="center">
<img src="assets/logo.png" width="400">
</p>

## DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior

[Paper](https://arxiv.org/abs/2308.15070) | [Project Page](https://0x3f3f3f3fun.github.io/projects/diffbir/)

[Open in OpenXLab](https://openxlab.org.cn/apps/detail/linxinqi/DiffBIR-official) [Open in Colab](https://colab.research.google.com/github/camenduru/DiffBIR-colab/blob/main/DiffBIR_colab.ipynb)

[Xinqi Lin](https://0x3f3f3f3fun.github.io/)<sup>1,\*</sup>, [Jingwen He](https://github.com/hejingwenhejingwen)<sup>2,3,\*</sup>, [Ziyan Chen](https://orcid.org/0000-0001-6277-5635)<sup>1</sup>, [Zhaoyang Lyu](https://scholar.google.com.tw/citations?user=gkXFhbwAAAAJ&hl=en)<sup>2</sup>, [Bo Dai](http://daibo.info/)<sup>2</sup>, [Fanghua Yu](https://github.com/Fanghua-Yu)<sup>1</sup>, [Wanli Ouyang](https://wlouyang.github.io/)<sup>2</sup>, [Yu Qiao](http://mmlab.siat.ac.cn/yuqiao)<sup>2</sup>, [Chao Dong](http://xpixel.group/2010/01/20/chaodong.html)<sup>1,2</sup>

<sup>1</sup>Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences<br><sup>2</sup>Shanghai AI Laboratory<br><sup>3</sup>The Chinese University of Hong Kong

<p align="center">
<img src="assets/teaser.png">
</p>

---

<p align="center">
<img src="assets/pipeline.png">
</p>

:star:If DiffBIR is helpful for you, please help star this repo. Thanks!:hugs:

## :book:Table Of Contents

- [Update](#update)
- [Visual Results On Real-world Images](#visual_results)
- [TODO](#todo)
- [Installation](#installation)
- [Pretrained Models](#pretrained_models)
- [Inference](#inference)
- [Train](#train)

## <a name="update"></a>:new:Update

- **2024.04.08**: ✅ Release everything about our [updated manuscript](https://arxiv.org/abs/2308.15070), including (1) a **new model** trained on a subset of laion2b-en and (2) a **more readable code base**. DiffBIR is now a general restoration pipeline that can handle different blind image restoration tasks with a unified generation module.
- **2023.09.19**: ✅ Add support for Apple Silicon! Check [installation_xOS.md](assets/docs/installation_xOS.md) to work with **CPU/CUDA/MPS** devices!
- **2023.09.14**: ✅ Integrate a patch-based sampling strategy ([mixture-of-diffusers](https://github.com/albarji/mixture-of-diffusers)). [**Try it!**](#patch-based-sampling) Here is an [example](https://imgsli.com/MjA2MDA1) with a resolution of 2396 x 1596. GPU memory usage will continue to be optimized, and we are looking forward to your pull requests!
- **2023.09.14**: ✅ Add support for a background upsampler (DiffBIR/[RealESRGAN](https://github.com/xinntao/Real-ESRGAN)) in face enhancement! :rocket: [**Try it!**](#inference_fr)
- **2023.09.13**: :rocket: Provide an online demo (DiffBIR-official) on [OpenXLab](https://openxlab.org.cn/apps/detail/linxinqi/DiffBIR-official), which integrates both the general model and the face model. Please have a try! [camenduru](https://github.com/camenduru) also implements an online demo, thanks for his work.:hugs:
- **2023.09.12**: ✅ Upload inference code for latent image guidance and release the [real47](inputs/real47) testset.
- **2023.09.08**: ✅ Add support for restoring unaligned faces.
- **2023.09.06**: :rocket: Update the [colab demo](https://colab.research.google.com/github/camenduru/DiffBIR-colab/blob/main/DiffBIR_colab.ipynb). Thanks to [camenduru](https://github.com/camenduru)!:hugs:
- **2023.08.30**: This repo is released.

## <a name="visual_results"></a>:eyes:Visual Results On Real-world Images

### Blind Image Super-Resolution

[<img src="assets/visual_results/bsr6.png" height="223px"/>](https://imgsli.com/MTk5ODI3) [<img src="assets/visual_results/bsr7.png" height="223px"/>](https://imgsli.com/MTk5ODI4) [<img src="assets/visual_results/bsr4.png" height="223px"/>](https://imgsli.com/MTk5ODI1)

### Blind Face Restoration

[<img src="assets/visual_results/whole_image1.png" height="370"/>](https://imgsli.com/MjA2MTU0)
[<img src="assets/visual_results/whole_image2.png" height="370"/>](https://imgsli.com/MjA2MTQ4)

:star: Both the face and the background are enhanced by DiffBIR.

### Blind Image Denoising

[<img src="assets/visual_results/bid1.png" height="215px"/>](https://imgsli.com/MjUzNzkz) [<img src="assets/visual_results/bid3.png" height="215px"/>](https://imgsli.com/MjUzNzky)
[<img src="assets/visual_results/bid2.png" height="215px"/>](https://imgsli.com/MjUzNzkx)

### 8x Blind Super-Resolution With Patch-based Sampling

> I often think of Bag End. I miss my books and my arm chair, and my garden. See, that's where I belong. That's home. --- Bilbo Baggins

[<img src="assets/visual_results/tiled_sampling.png" height="480px"/>](https://imgsli.com/MjUzODE4)

## <a name="todo"></a>:climbing:TODO

- [x] Release code and pretrained models :computer:.
- [x] Update links to paper and project page :link:.
- [x] Release real47 testset :minidisc:.
- [ ] Provide a WebUI.
- [ ] Reduce the VRAM usage of DiffBIR :fire::fire::fire:.
- [ ] Provide a HuggingFace demo :notebook:.
- [x] Add a patch-based sampling schedule :mag:.
- [x] Upload inference code for latent image guidance :page_facing_up:.
- [ ] Improve the performance :superhero:.
- [x] Support MPS acceleration for macOS users.
- [ ] DiffBIR-turbo :fire::fire::fire:.
- [ ] Speed up inference, e.g. with fp16/bf16 and torch.compile :fire::fire::fire:.

## <a name="installation"></a>:gear:Installation

```shell
# clone this repo
git clone https://github.com/XPixelGroup/DiffBIR.git
cd DiffBIR

# create environment
conda create -n diffbir python=3.10
conda activate diffbir
pip install -r requirements.txt
```

Our new code base is built on PyTorch 2.2.2 for its built-in support of memory-efficient attention. If your GPU is not compatible with the latest PyTorch, downgrade to PyTorch 1.13.1+cu116 and install xformers 0.0.16 as an alternative.
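
For example, a minimal install sketch assuming a CUDA 11.6 toolchain (the version pins and wheel index are assumptions; adjust them for your setup):

```shell
# fallback for GPUs incompatible with the latest PyTorch:
# PyTorch 1.13.1 + CUDA 11.6 wheels, plus xformers for memory-efficient attention
pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
pip install xformers==0.0.16
```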

## <a name="pretrained_models"></a>:dna:Pretrained Models

Here we list the pretrained weights of the stage 2 model (IRControlNet) and our trained SwinIR, which was used for degradation removal during the training of the stage 2 model.

| Model Name | Description | HuggingFace | BaiduNetdisk | OpenXLab |
| :---------: | :----------: | :----------: | :----------: | :----------: |
| v2.pth | IRControlNet trained on filtered laion2b-en | [download](https://huggingface.co/lxq007/DiffBIR-v2/resolve/main/v2.pth) | [download](https://pan.baidu.com/s/1uTAFl13xgGAzrnznAApyng?pwd=xiu3)<br>(pwd: xiu3) | [download](https://openxlab.org.cn/models/detail/linxinqi/DiffBIR/tree/main) |
| v1_general.pth | IRControlNet trained on ImageNet-1k | [download](https://huggingface.co/lxq007/DiffBIR-v2/resolve/main/v1_general.pth) | [download](https://pan.baidu.com/s/1PhXHAQSTOUX4Gy3MOc2t2Q?pwd=79n9)<br>(pwd: 79n9) | [download](https://openxlab.org.cn/models/detail/linxinqi/DiffBIR/tree/main) |
| v1_face.pth | IRControlNet trained on FFHQ | [download](https://huggingface.co/lxq007/DiffBIR-v2/resolve/main/v1_face.pth) | [download](https://pan.baidu.com/s/1kvM_SB1VbXjbipLxdzlI3Q?pwd=n7dx)<br>(pwd: n7dx) | [download](https://openxlab.org.cn/models/detail/linxinqi/DiffBIR/tree/main) |
| codeformer_swinir.ckpt | SwinIR trained on ImageNet-1k | [download](https://huggingface.co/lxq007/DiffBIR-v2/resolve/main/codeformer_swinir.ckpt) | [download](https://pan.baidu.com/s/176fARg2ySYtDgX2vQOeRbA?pwd=vfif)<br>(pwd: vfif) | [download](https://openxlab.org.cn/models/detail/linxinqi/DiffBIR/tree/main) |

During inference, we use off-the-shelf models from other papers as the stage 1 model: [BSRNet](https://github.com/cszn/BSRGAN) for BSR, the [SwinIR-Face](https://github.com/zsyOAOA/DifFace) used in DifFace for BFR, and [SCUNet-PSNR](https://github.com/cszn/SCUNet) for BID, while the trained IRControlNet remains **unchanged** for all tasks. Please check the [code](utils/inference.py) for more details. Thanks for their work!
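
Pretrained weights are downloaded automatically by the inference script, but you can also fetch a checkpoint from the table ahead of time. A sketch, assuming you keep checkpoints in a local `weights/` directory:

```shell
# optional: manually download the default stage 2 weight (v2.pth) listed above
wget https://huggingface.co/lxq007/DiffBIR-v2/resolve/main/v2.pth -P weights/
```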

## <a name="inference"></a>:crossed_swords:Inference

We provide some examples for inference; check [inference.py](inference.py) for more arguments. Pretrained weights will be **automatically downloaded**.

### Blind Image Super-Resolution

```shell
python -u inference.py \
--version v2 \
--task sr \
--upscale 4 \
--cfg_scale 4.0 \
--input inputs/demo/bsr \
--output results/demo_bsr \
--device cuda
```

### Blind Face Restoration
<a name="inference_fr"></a>

```shell
# for aligned face inputs
python -u inference.py \
--version v2 \
--task fr \
--upscale 1 \
--cfg_scale 4.0 \
--input inputs/demo/bfr/aligned \
--output results/demo_bfr_aligned \
--device cuda
```

```shell
# for unaligned face inputs
python -u inference.py \
--version v2 \
--task fr_bg \
--upscale 2 \
--cfg_scale 4.0 \
--input inputs/demo/bfr/whole_img \
--output results/demo_bfr_unaligned \
--device cuda
```

### Blind Image Denoising

```shell
python -u inference.py \
--version v2 \
--task dn \
--upscale 1 \
--cfg_scale 4.0 \
--input inputs/demo/bid \
--output results/demo_bid \
--device cuda
```

### Other options

#### Patch-based sampling
<a name="patch_based_sampling"></a>

Add the following arguments to enable patch-based sampling:

```shell
[command...] --tiled --tile_size 512 --tile_stride 256
```

Patch-based sampling supports super-resolution with a large scale factor. Our patch-based sampling is built upon [mixture-of-diffusers](https://github.com/albarji/mixture-of-diffusers). Thanks for their work!
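
For instance, a sketch of 8x super-resolution with patch-based sampling, combining the blind super-resolution command above with the tiling flags (the output path here is illustrative):

```shell
python -u inference.py \
--version v2 \
--task sr \
--upscale 8 \
--cfg_scale 4.0 \
--input inputs/demo/bsr \
--output results/demo_bsr_tiled \
--tiled --tile_size 512 --tile_stride 256 \
--device cuda
```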

#### Restoration Guidance

Restoration guidance is used to achieve a trade-off between quality and fidelity. It is disabled by default, since we prefer quality over fidelity. Here is an example:

```shell
python -u inference.py \
--version v2 \
--task sr \
--upscale 4 \
--cfg_scale 4.0 \
--input inputs/demo/bsr \
--guidance --g_loss w_mse --g_scale 0.5 --g_space rgb \
--output results/demo_bsr_wg \
--device cuda
```

You will see that the results become smoother.

#### Better Start Point For Sampling

Add the following argument to provide a better start point for reverse sampling:

```shell
[command...] --better_start
```

This option prevents our model from generating noise in the image background.
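
As an illustration, the blind super-resolution command from above with this flag added (the output path is illustrative):

```shell
python -u inference.py \
--version v2 \
--task sr \
--upscale 4 \
--cfg_scale 4.0 \
--input inputs/demo/bsr \
--better_start \
--output results/demo_bsr_bs \
--device cuda
```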

## <a name="train"></a>:stars:Train

### Stage 1

First, we train a SwinIR, which will be used for degradation removal during the training of stage 2.

<a name="gen_file_list"></a>
1. Generate file lists for the training set and the validation set. A file list looks like:

```txt
/path/to/image_1
/path/to/image_2
/path/to/image_3
...
```

You can write a simple Python script or directly use shell commands to produce file lists. Here is an example:

```shell
# collect all image files in img_dir
find [img_dir] -type f > files.list
# shuffle collected files
shuf files.list > files_shuf.list
# pick the first train_size files as the training set
head -n [train_size] files_shuf.list > files_shuf_train.list
# pick the remaining files as the validation set
tail -n +[train_size + 1] files_shuf.list > files_shuf_val.list
```

2. Fill in the [training configuration file](configs/train/train_stage1.yaml) with appropriate values.

3. Start training!

```shell
accelerate launch train_stage1.py --config configs/train/train_stage1.yaml
```

### Stage 2

1. Download the pretrained [Stable Diffusion v2.1](https://huggingface.co/stabilityai/stable-diffusion-2-1-base) checkpoint to provide generative capabilities. :bulb: If you have run the [inference script](inference.py), the SD v2.1 checkpoint can already be found in [weights](weights).

```shell
wget https://huggingface.co/stabilityai/stable-diffusion-2-1-base/resolve/main/v2-1_512-ema-pruned.ckpt --no-check-certificate
```

2. Generate a file list as mentioned [above](#gen_file_list). Currently, the training script of stage 2 doesn't support a validation set, so you only need to create a training file list.

3. Fill in the [training configuration file](configs/train/train_stage2.yaml) with appropriate values.

4. Start training!

```shell
accelerate launch train_stage2.py --config configs/train/train_stage2.yaml
```

## Citation

Please cite us if our work is useful for your research.

```
@misc{lin2024diffbir,
  title={DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior},
  author={Xinqi Lin and Jingwen He and Ziyan Chen and Zhaoyang Lyu and Bo Dai and Fanghua Yu and Wanli Ouyang and Yu Qiao and Chao Dong},
  year={2024},
  eprint={2308.15070},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```

## License

This project is released under the [Apache 2.0 license](LICENSE).

## Acknowledgement

This project is based on [ControlNet](https://github.com/lllyasviel/ControlNet) and [BasicSR](https://github.com/XPixelGroup/BasicSR). Thanks for their awesome work.

## Contact

If you have any questions, please feel free to contact me at [email protected].