This is an int8 weight-only (int8-wo) pre-quantized version of Qwen-Image.
It needs a GPU with at least 24 GB of VRAM to run efficiently.
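
The VRAM requirement can be sanity-checked with a rough weight-size estimate. The sketch below assumes the Qwen-Image transformer has on the order of 20 billion parameters (an approximate figure used here for illustration only). It ignores activations, the text encoder, the VAE and quantization scales, so treat the result as a lower bound rather than a measurement.

```python
# Rough estimate of transformer weight memory at different precisions.
# ASSUMPTION: ~20 billion parameters; approximate, for illustration only.
APPROX_PARAMS = 20e9

# Storage cost of one weight, in bytes, for each precision.
BYTES_PER_PARAM = {"bfloat16": 2, "int8": 1}


def weight_gib(n_params: float, dtype: str) -> float:
    """Return the approximate weight size in GiB for the given dtype."""
    return n_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_gib(APPROX_PARAMS, dtype):.1f} GiB")
```

Under that assumption, the int8 weights take about 18.6 GiB versus about 37.3 GiB in bfloat16, which is why the quantized transformer can fit on a 24 GB card with room left for the rest of the pipeline.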

How to use

Install the latest versions of diffusers, transformers, torchao, and accelerate:

pip install -U diffusers transformers torchao accelerate
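
To confirm that the installation worked, a small check like the one below can list the installed version of each package. This is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of any of these libraries.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages named in the install command above.
REQUIRED = ("diffusers", "transformers", "torchao", "accelerate")


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```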

The following snippet shows how to use the model to generate an image from a text prompt:

import torch

from diffusers import AutoModel, DiffusionPipeline

torch_dtype = torch.bfloat16
device = "cuda"

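# Load the pre-quantized int8 weight-only transformer.
# The quantized weights are not stored as safetensors, hence use_safetensors=False.
transformer = AutoModel.from_pretrained(
    "dimitribarbot/Qwen-Image-int8wo",
    torch_dtype=torch_dtype,
    use_safetensors=False
)
# Build the pipeline from the base repository, swapping in the quantized transformer.
pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image",
    transformer=transformer,
    torch_dtype=torch_dtype
)
# Offload idle components to the CPU and decode the VAE in tiles to lower peak VRAM use.
pipe.enable_model_cpu_offload()
pipe.enable_vae_tiling()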

prompt = "a woman and a man sitting at a cafe, the woman has red hair and she's wearing purple sweater with a black scarf and a white hat, the man is sitting on the other side of the table and he's wearing a white shirt with a purple scarf and red hat, both of them are sipping their coffee while in the table there's some cake slices on their respective plates, each with forks and knives at each side."
negative_prompt = ""

generator = torch.Generator(device=device).manual_seed(42)

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=1664,
    height=928,
    num_inference_steps=25,
    true_cfg_scale=4.0,
    generator=generator,
).images[0]

image.save("qwen_image_torchao.png")
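
The example above uses 1664x928, and both values are multiples of 16. If you want to try other resolutions, a helper like the hypothetical `snap_to_multiple` below can round a requested size to the nearest multiple. The value 16 is an assumption (a common constraint for latent diffusion pipelines), not something stated in this model card, so check the pipeline documentation for the actual requirement.

```python
def snap_to_multiple(value: int, multiple: int = 16) -> int:
    """Round value to the nearest positive multiple of `multiple` (ties round up).

    ASSUMPTION: the default of 16 is illustrative, not a documented requirement.
    """
    if value <= 0 or multiple <= 0:
        raise ValueError("value and multiple must be positive")
    # Integer arithmetic avoids the round-half-to-even behaviour of round().
    snapped = (value + multiple // 2) // multiple * multiple
    # Never return 0 for very small inputs.
    return max(multiple, snapped)


if __name__ == "__main__":
    width, height = snap_to_multiple(1920), snap_to_multiple(1080)
    print(width, height)
```

The resulting `width` and `height` can then be passed to `pipe(...)` in place of the fixed values used above.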

Credits

OzzyGT for the code snippet here: https://huggingface.co/Qwen/Qwen-Image/discussions/27
The Qwen-Image team
The HuggingFace team
