# 🧠 Text-Conditioned Latent Diffusion for Contrast-Enhanced CT Synthesis

**Model Name**: `mlii0117/sd1.5_MPECT`  
**Model Type**: Fine-tuned `Stable Diffusion v1.5` for medical image-to-image translation  
**Paper**: _Text-Conditioned Latent Diffusion Model for Synthesis of Contrast-Enhanced CT from Non-Contrast CT_  
**Conference**: AAPM 2025 (Oral)  
**Authors**: Mingjie Li, Yizheng Chen, Lei Xing, Michael F. Gensheimer  
**Affiliation**: Department of Radiation Oncology - Medical Physics Division, Stanford University

---

## 🧬 Model Description

This model is a fine-tuned version of **Stable Diffusion v1.5**, specialized for converting **non-contrast CT images** into **contrast-enhanced CT images** under the guidance of **textual phase prompts** (e.g., *venous phase*, *arterial phase*). It uses the `InstructPix2Pix` framework for prompt-conditioned generation, which gives control over contrast timing without requiring explicit paired data.

---

## 💡 Key Features

- 🧾 **Text-guided control** over contrast phase (arterial vs. venous)
- 🖼️ Processes **2D CT slices** in image format (converted from DICOM)
- 🏥 Focused on **clinical realism and anatomical fidelity**
- 🧠 Reassembles processed slices into a full **3D volume** with NIfTI output support
- ✅ Presented as an **Oral at AAPM 2025**
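
The pipeline example in the Usage section saves its NIfTI output with an identity affine (`np.eye(4)`), which records every voxel as 1 × 1 × 1 mm. As a minimal sketch, assuming the voxel spacing is known (for instance from DICOM tags such as `SliceThickness` and `PixelSpacing`), a diagonal affine can carry that spacing instead. The helper name `spacing_affine` and the spacing values are hypothetical, and orientation and origin are deliberately left out.

```python
import numpy as np

def spacing_affine(slice_thickness_mm, row_spacing_mm, col_spacing_mm):
    """Diagonal affine for a volume indexed as (slice, row, column).

    Only voxel size is encoded; orientation and origin stay at their defaults.
    """
    return np.diag([slice_thickness_mm, row_spacing_mm, col_spacing_mm, 1.0])

# Hypothetical spacing values in millimetres.
affine = spacing_affine(2.5, 0.8, 0.8)

# Homogeneous voxel index of the first voxel in slice 10.
voxel_index = np.array([10, 0, 0, 1.0])

# Offset in millimetres along each array axis.
position_mm = affine @ voxel_index
print(position_mm)
```

The resulting matrix could be passed to `nib.Nifti1Image(volume, affine)` in place of `np.eye(4)`. Whether this axis order matches a given viewer's expectations depends on the dataset, so treat it as a starting point.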

---

## 🛠️ Usage

### 🔧 Requirements
```bash
pip install diffusers==0.25.0 transformers torch numpy nibabel pydicom tqdm pillow
```

### 📦 Load the Model
```python
from diffusers import StableDiffusionInstructPix2PixPipeline
import torch

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "mlii0117/sd1.5_MPECT", torch_dtype=torch.float16
).to("cuda")
generator = torch.Generator("cuda").manual_seed(0)
```

### 📝 Example Prompts

- **Arterial Phase**
  ```
  Convert this non-contrast CT slice to mimic an arterial-phase contrast-enhanced CT.
  Brighten and enhance the aorta, major arteries, and adjacent organ boundaries.
  ```

- **Venous Phase**
  ```
  Convert this non-contrast CT slice to mimic a venous-phase contrast-enhanced CT.
  Brighten and enhance the portal and hepatic veins and emphasize organ boundaries.
  ```
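
To switch phases programmatically, the two prompts above can be kept in a small lookup table. This is only a convenience sketch: the names `PHASE_PROMPTS` and `get_phase_prompt` are hypothetical and not part of the released model or of `diffusers`.

```python
# Prompt text copied from the examples above, keyed by contrast phase.
PHASE_PROMPTS = {
    "arterial": (
        "Convert this non-contrast CT slice to mimic an arterial-phase contrast-enhanced CT. "
        "Brighten and enhance the aorta, major arteries, and adjacent organ boundaries."
    ),
    "venous": (
        "Convert this non-contrast CT slice to mimic a venous-phase contrast-enhanced CT. "
        "Brighten and enhance the portal and hepatic veins and emphasize organ boundaries."
    ),
}

def get_phase_prompt(phase):
    """Return the prompt for a phase name, ignoring case and surrounding spaces."""
    key = phase.strip().lower()
    if key not in PHASE_PROMPTS:
        raise ValueError(f"Unknown phase {phase!r}; expected one of {sorted(PHASE_PROMPTS)}")
    return PHASE_PROMPTS[key]

prompt = get_phase_prompt("Venous")
print(prompt)
```

Rejecting unknown phase names early avoids silently sending the model a prompt it was not shown in these examples.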

### 🧪 Full Pipeline Example

```python
import os
import numpy as np
import nibabel as nib
import torch
from PIL import Image
from glob import glob
from tqdm import tqdm
from pydicom import dcmread
from diffusers import StableDiffusionInstructPix2PixPipeline

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "mlii0117/sd1.5_MPECT", torch_dtype=torch.float16
).to("cuda")
generator = torch.Generator("cuda").manual_seed(0)

prompt = "Convert this non-contrast CT slice to mimic a venous-phase contrast-enhanced CT. Brighten and enhance the portal and hepatic veins, and emphasize organ boundaries."

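def load_dicom_folder(dicom_folder):
    # Expects the slices in a "DICOM" subfolder; files are ordered by filename.
    dicom_folder = os.path.join(dicom_folder, 'DICOM')
    files = sorted(glob(os.path.join(dicom_folder, "*")))
    slices = [dcmread(f).pixel_array.astype(np.float32) for f in files]
    volume = np.stack(slices, axis=0)
    # Shift stored values to Hounsfield units using the first slice's intercept
    # (this assumes a rescale slope of 1).
    volume += float(dcmread(files[0]).RescaleIntercept)
    # Clip to [-1000, 1000] HU, then scale to [0, 1].
    volume = np.clip(volume, -1000, 1000)
    return (volume + 1000) / 2000.0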

def process(volume):
    results = []
    for i in tqdm(range(volume.shape[0])):
        # Convert the [0, 1] slice to an 8-bit RGB image for the pipeline.
        img = Image.fromarray((volume[i] * 255).astype(np.uint8)).convert("RGB")
        out = pipe(prompt, image=img, num_inference_steps=20,
                   image_guidance_scale=1.5, guidance_scale=10,
                   generator=generator).images[0]
        # Collapse the output to grayscale and map it back to [-1000, 1000] HU.
        gray = np.array(out.convert("L")).astype(np.float32) / 255.0
        gray = gray * 2000 - 1000
        results.append(gray)
    return np.stack(results, axis=0)

def save_nifti(volume, path):
    # An identity affine is used, so voxel spacing and orientation are not preserved.
    nib.save(nib.Nifti1Image(volume, np.eye(4)), path)

input_path = "/path/to/dicom_folder"
output_path = "/path/to/output.nii.gz"

vol = load_dicom_folder(input_path)
out_vol = process(vol)
save_nifti(out_vol, output_path)
```
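
Because the example above passes each slice through an 8-bit image, intensities are quantized on the way in and out. The sketch below isolates that normalization (clip to [-1000, 1000] HU, scale to 0-255, map back) with NumPy only, so the round-trip error can be inspected without running the model. The function names `hu_to_uint8` and `uint8_to_hu` are hypothetical helpers, not part of the released code.

```python
import numpy as np

HU_MIN, HU_MAX = -1000.0, 1000.0  # clipping window used in the pipeline example

def hu_to_uint8(hu):
    """Clip to the HU window, scale to [0, 1], and truncate to 8 bits."""
    unit = (np.clip(hu, HU_MIN, HU_MAX) - HU_MIN) / (HU_MAX - HU_MIN)
    return (unit * 255).astype(np.uint8)

def uint8_to_hu(pixels):
    """Map 8-bit pixel values back onto the HU window."""
    return pixels.astype(np.float32) / 255.0 * (HU_MAX - HU_MIN) + HU_MIN

# Integer HU values from -1200 to 1200, including some outside the window.
hu = np.linspace(-1200, 1200, 2401, dtype=np.float32)
restored = uint8_to_hu(hu_to_uint8(hu))

# Measure the error only where the input was inside the window.
in_range = (hu >= HU_MIN) & (hu <= HU_MAX)
max_error = np.abs(restored[in_range] - hu[in_range]).max()
step = (HU_MAX - HU_MIN) / 255
print(f"8-bit step: {step:.2f} HU, max round-trip error: {max_error:.2f} HU")
```

One 8-bit level spans 2000 / 255 HU, roughly 7.8 HU, so differences smaller than that cannot survive the conversion, and values outside the window are clamped to its edges. This is worth keeping in mind when comparing synthesized volumes against real contrast-enhanced scans.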

---

## 🧠 Intended Use

- Medical research and simulation
- Data augmentation for contrast-enhanced imaging
- Exploratory analysis in non-contrast → contrast CT enhancement

> ⚠️ **Disclaimer**: This model is for research purposes only. It is not intended for clinical decision-making or diagnostic use.

---

## 📝 Citation

```bibtex
@inproceedings{li2025text,
  title={Text-Conditioned Latent Diffusion Model for Synthesis of Contrast-Enhanced CT from Non-Contrast CT},
  author={Li, Mingjie and Chen, Yizheng and Xing, Lei and Gensheimer, Michael},
  booktitle={AAPM Annual Meeting (Oral)},
  year={2025}
}
```

---

## 🧾 License

This model is released for **non-commercial research purposes only**. Please contact the authors if you wish to use it in clinical or commercial settings.