---
library_name: HunyuanImage-2.1
license: other
license_name: tencent-hunyuan-community
license_link: https://github.com/Tencent-Hunyuan/HunyuanImage-2.1/blob/master/LICENSE
language:
  - en
  - zh
tags:
  - text-to-image
  - comfyui
  - diffusers
pipeline_tag: text-to-image
extra_gated_eu_disallowed: true
---

HunyuanImage-2.1 Banner

HunyuanImage-2.1

An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation


Performance on RTX 5090

With the quantized encoder and the quantized base model, HunyuanImage-2.1 typically
uses between 26 GB and 30 GB of VRAM on an NVIDIA RTX 5090, with an average inference
time of about 16 seconds. Both figures vary with resolution, batch size, and prompt complexity.
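
Before loading the model, it can help to check whether a GPU has enough free memory for the 26 GB to 30 GB range reported above. The following is a minimal sketch, not part of this repository: the helper names, the 1 GB safety margin, and the treatment of the reported GB figures as GiB are all assumptions. It relies only on the standard `nvidia-smi --query-gpu` interface.

```python
import subprocess

# Upper end of the VRAM range reported in this README (26 GB to 30 GB).
REPORTED_PEAK_GB = 30.0


def parse_free_mib(smi_output: str) -> list[int]:
    """Parse one free-memory value (MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def fits(free_mib: int, peak_gb: float = REPORTED_PEAK_GB, headroom_gb: float = 1.0) -> bool:
    """Return True if free memory covers the reported peak plus a margin.

    Assumption: the README's GB figures are treated as GiB, and the
    1 GB headroom is an arbitrary safety margin, not a measured value.
    """
    free_gb = free_mib / 1024  # nvidia-smi reports MiB
    return free_gb >= peak_gb + headroom_gb


def query_free_mib() -> list[int]:
    """Ask nvidia-smi for the free memory of every visible GPU (in MiB)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.free", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_free_mib(result.stdout)
```

On a machine with an NVIDIA driver installed, `[fits(m) for m in query_free_mib()]` gives one boolean per GPU. Actual usage depends on resolution, batch size, and prompt complexity, so treat the result as a rough guide only.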

Important Note:
The refiner and distilled model are not yet implemented and are not ready for use in ComfyUI.
Currently, only the base model is supported.


Image1

Image2



Download Quantized Model (FP8 e4m3fn)

Download hunyuanimage2.1_fp8_e4m3fn.safetensors
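
After downloading, the file can be sanity-checked without loading any weights. A `.safetensors` file begins with an 8-byte little-endian header length followed by a JSON header that lists each tensor's dtype, and FP8 e4m3fn tensors appear there as `F8_E4M3`. The sketch below uses only the Python standard library; the function names are illustrative and not part of any library.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path: str) -> dict:
    """Return the tensor entries from a .safetensors header (no weights are read)."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def dtype_counts(path: str) -> Counter:
    """Count how many tensors are stored in each dtype."""
    return Counter(entry["dtype"] for entry in read_safetensors_header(path).values())
```

Calling `dtype_counts("hunyuanimage2.1_fp8_e4m3fn.safetensors")` should show most tensors as `F8_E4M3`. Some tensors may be kept at higher precision in quantized checkpoints, so a mix of dtypes does not by itself indicate a bad download.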


Workflow Notes

  • Model: HunyuanImage-2.1
  • Mode: Quantized Encoder + Quantized Base Model
  • VRAM Usage: ~26GB–30GB on RTX 5090
  • Resolution Tested: 2K (2048×2048)
  • Frameworks: ComfyUI & Diffusers
  • Optimisations: Works with Patch Sage Attention + Lazycache / TeaCache ✅
  • Refiner & Distilled Model: ❌ Not implemented yet, not available in ComfyUI
  • License: tencent-hunyuan-community

🚀 **Optimized for High-Resolution, Memory-Efficient Text-to-Image Generation**