metadata
license: apple-ascl
datasets:
- graph-based-captions/GBC10M
language:
- en
Graph-based captioning (GBC) is a new image annotation paradigm that combines the strengths of long captions, region captions, and scene graphs.
GBC interconnects region captions to create a unified description akin to a long caption, while also providing structural information similar to scene graphs.
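To make the idea of interconnected region captions concrete, here is a minimal sketch in Python. It is an illustration only, not the GBC data format: the `RegionNode` class, its field names, and the `unified_description` helper are all hypothetical, and the captions are made up. It shows how captions attached to linked regions can be read off as one long description while the links themselves keep the structure.

```python
from dataclasses import dataclass, field


@dataclass
class RegionNode:
    """One captioned region. All field names are hypothetical, not the GBC schema."""
    node_id: str
    caption: str
    children: list[str] = field(default_factory=list)  # ids of linked sub-regions


def unified_description(nodes: dict[str, RegionNode], root_id: str) -> str:
    """Walk the graph depth-first and join the region captions into one text."""
    parts: list[str] = []
    seen: set[str] = set()

    def visit(node_id: str) -> None:
        if node_id in seen:  # a region may be reachable from several parents
            return
        seen.add(node_id)
        node = nodes[node_id]
        parts.append(node.caption)
        for child_id in node.children:
            visit(child_id)

    visit(root_id)
    return " ".join(parts)


# Toy graph: an image-level caption linked to two region captions.
graph = {
    "image": RegionNode("image", "A dog sits next to a red ball.", ["dog", "ball"]),
    "dog": RegionNode("dog", "The dog is a small brown terrier."),
    "ball": RegionNode("ball", "The ball is red and made of rubber."),
}
description = unified_description(graph, "image")
print(description)
```

The `children` lists play the role of the structural information, while the traversal produces the long-caption view. The actual annotation format is defined by the accompanying code repository and may differ substantially from this sketch.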
Text-to-Image with GBC as Middleware
We propose using GBC as middleware for text-to-image generation. This repository provides a model that generates GBC annotations from a simple text prompt.
For further details on how to use the model, please refer to the accompanying code repository.
License
For license details, please see the LICENSE file.
Citation
@article{GBC2024,
title={Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions},
author={Yu-Guan Hsieh and Cheng-Yu Hsieh and Shih-Ying Yeh and Louis Béthune and Hadi Pouransari and Pavan Kumar Anasosalu Vasu and Chun-Liang Li and Ranjay Krishna and Oncel Tuzel and Marco Cuturi},
journal={arXiv preprint arXiv:2407.06723},
year={2024}
}