This is an int8 weight-only (int8-wo) pre-quantized version of Qwen-Image.
It needs a GPU with at least 24 GB of VRAM to run efficiently.
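
To check the VRAM requirement before downloading the weights, a minimal sketch along these lines may help. The helper names `vram_gib` and `detect_vram_gib` are hypothetical, and the script only reports what the first CUDA device exposes. Cards marketed as 24 GB may report slightly less than 24 GiB, so treat the number as a rough guide rather than a strict pass/fail test.

```python
GIB = 1024 ** 3


def vram_gib(total_bytes: int) -> float:
    """Convert a byte count to GiB, rounded to one decimal place."""
    return round(total_bytes / GIB, 1)


def detect_vram_gib():
    """Return the total VRAM of GPU 0 in GiB, or None if torch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return vram_gib(torch.cuda.get_device_properties(0).total_memory)


total = detect_vram_gib()
if total is None:
    print("No CUDA GPU detected (or torch is not installed).")
else:
    print(f"GPU 0 reports {total} GiB of VRAM.")
```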

How to use

Install the latest versions of diffusers, transformers, torchao and accelerate:

pip install -U diffusers transformers torchao accelerate
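
After installing, it can be useful to confirm which versions were picked up. The following sketch uses only the standard library; the helper name `installed_versions` is hypothetical, and no minimum version is checked because this card does not state one.

```python
from importlib.metadata import PackageNotFoundError, version

# The four packages named in the install command above.
REQUIRED = ("diffusers", "transformers", "torchao", "accelerate")


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version string, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```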

The following snippet shows how to use the model to generate an image from a text prompt:

import torch

from diffusers import AutoModel, DiffusionPipeline

torch_dtype = torch.bfloat16
device = "cuda"

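# Load the pre-quantized transformer. torchao-quantized weights are not
# stored as safetensors, hence use_safetensors=False.
transformer = AutoModel.from_pretrained(
    "dimitribarbot/Qwen-Image-int8wo",
    torch_dtype=torch_dtype,
    use_safetensors=False
)
# Load the remaining components from the base model and plug in the quantized transformer.
pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image",
    transformer=transformer,
    torch_dtype=torch_dtype
)
# Keep each component on the GPU only while it is in use, and decode the
# image in tiles, to lower peak VRAM usage.
pipe.enable_model_cpu_offload()
pipe.enable_vae_tiling()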

prompt = "a woman and a man sitting at a cafe, the woman has red hair and she's wearing purple sweater with a black scarf and a white hat, the man is sitting on the other side of the table and he's wearing a white shirt with a purple scarf and red hat, both of them are sipping their coffee while in the table there's some cake slices on their respective plates, each with forks and knives at each side."
negative_prompt = ""

generator = torch.Generator(device=device).manual_seed(42)

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=1664,
    height=928,
    num_inference_steps=25,
    true_cfg_scale=4.0,
    generator=generator,
).images[0]

image.save("qwen_image_torchao.png")
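
The width and height used above (1664 and 928) are both multiples of 16. If you want to try other sizes, one cautious approach is to keep that property. This card does not state it as a requirement, so the multiple of 16 is an assumption here, and `snap_dimensions` is a hypothetical helper rather than part of diffusers.

```python
def snap_dimensions(width: int, height: int, multiple: int = 16) -> tuple[int, int]:
    """Round both dimensions down to the nearest multiple (at least one multiple)."""
    def snap(value: int) -> int:
        return max(multiple, (value // multiple) * multiple)

    return snap(width), snap(height)


# The example's dimensions already satisfy the assumed constraint.
print(snap_dimensions(1664, 928))
# Arbitrary sizes are rounded down to the nearest multiple of 16.
print(snap_dimensions(1000, 600))
```

The result can then be passed as `width` and `height` in the `pipe(...)` call.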

Credits

