| # 🧠 Text-Conditioned Latent Diffusion for Contrast-Enhanced CT Synthesis | |
| **Model Name**: `mlii0117/sd1.5_MPECT` | |
| **Model Type**: Fine-tuned `Stable Diffusion v1.5` for medical image-to-image translation | |
| **Paper**: _Text-Conditioned Latent Diffusion Model for Synthesis of Contrast-Enhanced CT from Non-Contrast CT_ | |
| **Conference**: AAPM 2025 (Oral) | |
| **Authors**: Mingjie Li, Yizheng Chen, Lei Xing, Michael F. Gensheimer | |
| **Affiliation**: Department of Radiation Oncology - Medical Physics Division, Stanford University | |
| --- | |
| ## 🧬 Model Description | |
| This model is a fine-tuned version of **Stable Diffusion v1.5**, specialized for converting **non-contrast CT images** into **contrast-enhanced CT images**, guided by **textual phase prompts** (e.g., *venous phase*, *arterial phase*). It uses the `InstructPix2Pix` framework for prompt-conditioned generation, giving control over contrast timing without requiring explicit paired data. | |
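InstructPix2Pix-style pipelines weigh two conditioning signals, the text prompt and the input image, through `guidance_scale` and `image_guidance_scale`. The sketch below shows the classifier-free guidance combination described in the InstructPix2Pix paper. The function name `combine_guidance` and its argument names are illustrative assumptions, not part of this model's code or of the `diffusers` API, and plain floats stand in for the noise-prediction tensors.

```python
def combine_guidance(eps_uncond, eps_image, eps_full,
                     guidance_scale, image_guidance_scale):
    """Blend three noise predictions, InstructPix2Pix style.

    eps_uncond: prediction with neither image nor text conditioning
    eps_image:  prediction conditioned on the input image only
    eps_full:   prediction conditioned on both image and text
    """
    return (
        eps_uncond
        + image_guidance_scale * (eps_image - eps_uncond)
        + guidance_scale * (eps_full - eps_image)
    )


# Toy scalars standing in for tensors (hypothetical values).
blended = combine_guidance(0.0, 1.0, 2.0,
                           guidance_scale=10, image_guidance_scale=1.5)
print(blended)
```

With both scales set to 1, the expression reduces to the fully conditioned prediction. Raising `guidance_scale` pushes the result further toward the text prompt, while raising `image_guidance_scale` keeps it closer to the input slice.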
| --- | |
| ## 💡 Key Features | |
| - 🧾 **Text-guided control** over contrast phase (arterial vs. venous) | |
| - 🖼️ Processes **2D CT slices** in image format (converted from DICOM) | |
| - 🏥 Focused on **clinical realism and anatomical fidelity** | |
| - 🧠 Generated slices can be stacked back into a full 3D volume and saved as NIfTI | |
| - ✅ Evaluated and presented as **Oral at AAPM 2025** | |
| --- | |
| ## 🛠️ Usage | |
| ### 🔧 Requirements | |
| ```bash | |
| pip install diffusers==0.25.0 transformers torch numpy nibabel pydicom tqdm pillow | |
| ``` | |
| ### 📦 Load the Model | |
| ```python | |
| from diffusers import StableDiffusionInstructPix2PixPipeline | |
| import torch | |
| pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained( | |
|     "mlii0117/sd1.5_MPECT", torch_dtype=torch.float16 | |
| ).to("cuda") | |
| generator = torch.Generator("cuda").manual_seed(0) | |
| ``` | |
| ### 📝 Example Prompts | |
| - **Arterial Phase** | |
| ``` | |
| Convert this non-contrast CT slice to mimic an arterial-phase contrast-enhanced CT. | |
| Brighten and enhance the aorta, major arteries, and adjacent organ boundaries. | |
| ``` | |
| - **Venous Phase** | |
| ``` | |
| Convert this non-contrast CT slice to mimic a venous-phase contrast-enhanced CT. | |
| Brighten and enhance the portal and hepatic veins and emphasize organ boundaries. | |
| ``` | |
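When switching phases programmatically, it can help to keep the two example prompts in one place. The following is a minimal sketch: `PHASE_PROMPTS` and `build_prompt` are hypothetical names, and joining the two prompt lines with a single space is an assumption rather than something this card specifies.

```python
# Prompt text copied from the examples above; the lookup itself is illustrative.
PHASE_PROMPTS = {
    "arterial": (
        "Convert this non-contrast CT slice to mimic an arterial-phase contrast-enhanced CT. "
        "Brighten and enhance the aorta, major arteries, and adjacent organ boundaries."
    ),
    "venous": (
        "Convert this non-contrast CT slice to mimic a venous-phase contrast-enhanced CT. "
        "Brighten and enhance the portal and hepatic veins and emphasize organ boundaries."
    ),
}


def build_prompt(phase):
    """Return the example prompt for a contrast phase ("arterial" or "venous")."""
    key = phase.strip().lower()
    if key not in PHASE_PROMPTS:
        raise ValueError(
            f"Unknown phase {phase!r}; expected one of {sorted(PHASE_PROMPTS)}"
        )
    return PHASE_PROMPTS[key]


prompt = build_prompt("venous")
```

Rejecting unknown phase names early avoids silently sending the model a prompt it was not shown in these examples.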
| ### 🧪 Full Pipeline Example | |
| ```python | |
| import os | |
| from glob import glob | |
| import nibabel as nib | |
| import numpy as np | |
| import torch  # needed for torch.float16 and torch.Generator below | |
| from diffusers import StableDiffusionInstructPix2PixPipeline | |
| from PIL import Image | |
| from pydicom import dcmread | |
| from tqdm import tqdm | |
| # Load the fine-tuned pipeline in half precision on the GPU. | |
| pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained( | |
|     "mlii0117/sd1.5_MPECT", torch_dtype=torch.float16 | |
| ).to("cuda") | |
| generator = torch.Generator("cuda").manual_seed(0)  # fixed seed for repeatable output | |
| prompt = "Convert this non-contrast CT slice to mimic a venous-phase contrast-enhanced CT. Brighten and enhance the portal and hepatic veins, and emphasize organ boundaries." | |
| def load_dicom_folder(dicom_folder): | |
|     """Load a DICOM series and return a volume scaled to [0, 1].""" | |
|     # Slices are read from the "DICOM" subfolder; slice order follows the sorted filenames. | |
|     dicom_folder = os.path.join(dicom_folder, 'DICOM') | |
|     files = sorted(glob(os.path.join(dicom_folder, "*"))) | |
|     slices = [dcmread(f).pixel_array.astype(np.float32) for f in files] | |
|     volume = np.stack(slices, axis=0) | |
|     # Shift stored values by the first slice's RescaleIntercept (RescaleSlope is not applied). | |
|     volume += float(dcmread(files[0]).RescaleIntercept) | |
|     # Clip to [-1000, 1000] HU, then map linearly to [0, 1]. | |
|     volume = np.clip(volume, -1000, 1000) | |
|     return (volume + 1000) / 2000.0 | |
| def process(volume): | |
|     """Run the model slice by slice and return the result in HU.""" | |
|     results = [] | |
|     for i in tqdm(range(volume.shape[0])): | |
|         # The pipeline takes an RGB image, so the 8-bit grayscale slice is replicated across channels. | |
|         img = Image.fromarray((volume[i] * 255).astype(np.uint8)).convert("RGB") | |
|         out = pipe(prompt, image=img, num_inference_steps=20, | |
|                    image_guidance_scale=1.5, guidance_scale=10, | |
|                    generator=generator).images[0] | |
|         # Convert back to grayscale, then from [0, 255] to [-1000, 1000] HU. | |
|         gray = np.array(out.convert("L")).astype(np.float32) / 255.0 | |
|         gray = gray * 2000 - 1000 | |
|         results.append(gray) | |
|     return np.stack(results, axis=0) | |
| def save_nifti(volume, path): | |
|     # Identity affine: voxel spacing and orientation from the DICOM headers are not carried over. | |
|     nib.save(nib.Nifti1Image(volume, np.eye(4)), path) | |
| input_path = "/path/to/dicom_folder" | |
| output_path = "/path/to/output.nii.gz" | |
| vol = load_dicom_folder(input_path) | |
| out_vol = process(vol) | |
| save_nifti(out_vol, output_path) | |
| ``` | |
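The pipeline above passes each slice through an 8-bit image, so intensities are quantized on the way in and on the way out. The scalar sketch below mirrors the array arithmetic in `load_dicom_folder` and `process` to make that mapping concrete. The helper names `hu_to_uint8` and `uint8_to_hu` are hypothetical and not part of the released code.

```python
HU_MIN, HU_MAX = -1000.0, 1000.0  # clipping window used in the pipeline


def hu_to_uint8(hu):
    """Clip to the window, scale to [0, 1], then truncate to an 8-bit level."""
    clipped = min(max(hu, HU_MIN), HU_MAX)
    return int((clipped - HU_MIN) / (HU_MAX - HU_MIN) * 255)


def uint8_to_hu(level):
    """Map an 8-bit gray level back to Hounsfield units."""
    return level / 255.0 * (HU_MAX - HU_MIN) + HU_MIN


step = (HU_MAX - HU_MIN) / 255  # HU range covered by one gray level

# Sample values (hypothetical), including one outside the window.
for hu in (-1000, 0, 40, 1000, 3000):
    level = hu_to_uint8(hu)
    print(hu, level, round(uint8_to_hu(level), 1))
print(f"step: {step:.2f} HU")
```

One gray level spans roughly 7.8 HU, and values outside [-1000, 1000] HU are clipped, so the round trip is lossy. This is worth keeping in mind when comparing synthesized volumes against real contrast-enhanced scans on a HU scale.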
| --- | |
| ## 🧠 Intended Use | |
| - Medical research and simulation | |
| - Data augmentation for contrast-enhanced imaging | |
| - Exploratory analysis in non-contrast → contrast CT enhancement | |
| > ⚠️ **Disclaimer**: This model is for research purposes only. It is not intended for clinical decision-making or diagnostic use. | |
| --- | |
| ## 📝 Citation | |
| ``` | |
| @inproceedings{li2025text, | |
| title={Text-Conditioned Latent Diffusion Model for Synthesis of Contrast-Enhanced CT from Non-Contrast CT}, | |
| author={Li, Mingjie and Chen, Yizheng and Xing, Lei and Gensheimer, Michael F.}, | |
| booktitle={AAPM Annual Meeting (Oral)}, | |
| year={2025} | |
| } | |
| ``` | |
| --- | |
| ## 🧾 License | |
| This model is released for **non-commercial research purposes only**. Please contact the authors if you wish to use it in clinical or commercial settings. |