@@ -1,7 +1,6 @@
 <div align="center">
-<img src="icon.jpg" width="250"/>
 <h2><center>Tora: Trajectory-oriented Diffusion Transformer for Video Generation</center></h2>
@@ -17,8 +16,12 @@ Zhenghao Zhang\*, Junchao Liao\*, Menghao Li, Zuozhuo Dai, Bingxue Qiu, Siyu Zhu
 <a href='https://modelscope.cn/models/xiaoche/Tora'><img src='https://img.shields.io/badge/🤖_ModelScope-weights-%23654dfc'></a>
 <a href='https://huggingface.co/Le0jc/Tora'><img src='https://img.shields.io/badge/🤗_HuggingFace-weights-%23ff9e0e'></a>
 </div>
 ## 💡 Abstract
@@ -26,6 +29,8 @@ Recent advancements in Diffusion Transformer (DiT) have demonstrated remarkable
 ## 📣 Updates
 - `2024/10/31` Model weights uploaded to [HuggingFace](https://huggingface.co/Le0jc/Tora). We also provided an English demo on [ModelScope](https://www.modelscope.cn/studios/Alibaba_Research_Intelligence_Computing/Tora_En).
 - `2024/10/23` 🔥🔥Our [ModelScope Demo](https://www.modelscope.cn/studios/xiaoche/Tora) has launched. Feel free to try it out! We have also uploaded the model weights to [ModelScope](https://www.modelscope.cn/models/xiaoche/Tora).
 - `2024/10/21` Thanks to [@kijai](https://github.com/kijai) for supporting Tora in ComfyUI! [Link](https://github.com/kijai/ComfyUI-CogVideoXWrapper)
@@ -33,15 +38,6 @@ Recent advancements in Diffusion Transformer (DiT) have demonstrated remarkable
 - `2024/08/27` We released the v2 version of our paper, including an appendix.
 - `2024/07/31` We submitted our paper to arXiv and released our project page.
-## 📑 Table of Contents
-- [Showcases](#%EF%B8%8F-showcases)
-- [Model Weights](#-model-weights)
-- [Inference](#-inference)
-- [Acknowledgements](#-acknowledgements)
-- [Our previous work](#-our-previous-work)
-- [Citation](#-citation)
 ## 🎞️ Showcases
 https://github.com/user-attachments/assets/949d5e99-18c9-49d6-b669-9003ccd44bf1
@@ -52,80 +48,6 @@ https://github.com/user-attachments/assets/4026c23d-229d-45d7-b5be-6f3eb9e4fd50
 All videos are available in this [Link](https://cloudbook-public-daily.oss-cn-hangzhou.aliyuncs.com/Tora_t2v/showcases.zip)
-## 📦 Model Weights
-### Folder Structure
-```
-Tora
-└── sat
-    └── ckpts
-        ├── t5-v1_1-xxl
-        │   ├── model-00001-of-00002.safetensors
-        │   └── ...
-        ├── vae
-        │   └── 3d-vae.pt
-        └── tora
-            └── t2v
-                └── mp_rank_00_model_states.pt
-```
-### Download Links
-*Note: Downloading the `tora` weights requires following the [CogVideoX License](CogVideoX_LICENSE).* You can choose one of the following options: HuggingFace, ModelScope, or native links.
-After downloading the model weights, you can put them in the `Tora/sat/ckpts` folder.
-#### HuggingFace
-```bash
-# This can be faster
-pip install "huggingface_hub[hf_transfer]"
-HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download Le0jc/Tora --local-dir ckpts
-```
-or
-```bash
-# use git
-git lfs install
-git clone https://huggingface.co/Le0jc/Tora
-```
-#### ModelScope
-- SDK
-```python
-# Download the Tora weights via the ModelScope SDK (requires `pip install modelscope`)
-from modelscope import snapshot_download
-model_dir = snapshot_download('xiaoche/Tora')
-```
-- Git
-```bash
-git clone https://www.modelscope.cn/xiaoche/Tora.git
-```
-#### Native
-- Download the VAE and T5 model following [CogVideo](https://github.com/THUDM/CogVideo/blob/main/sat/README.md#2-download-model-weights):
-    - VAE: https://cloud.tsinghua.edu.cn/f/fdba7608a49c463ba754/?dl=1
-    - T5: [text_encoder](https://huggingface.co/THUDM/CogVideoX-2b/tree/main/text_encoder), [tokenizer](https://huggingface.co/THUDM/CogVideoX-2b/tree/main/tokenizer)
-- Tora t2v model weights: [Link](https://cloudbook-public-daily.oss-cn-hangzhou.aliyuncs.com/Tora_t2v/mp_rank_00_model_states.pt). Downloading this weight requires following the [CogVideoX License](CogVideoX_LICENSE).
-## 🔄 Inference
-Please refer to our [GitHub](https://github.com/alibaba/Tora) or the [ModelScope online demo](https://www.modelscope.cn/studios/Alibaba_Research_Intelligence_Computing/Tora_En).
-### Recommendations for Text Prompts
-For text prompts, we highly recommend using GPT-4 to enhance the details. Simple prompts may negatively impact both visual quality and motion control effectiveness.
-You can refer to the following resources for guidance:
-- [CogVideoX Documentation](https://github.com/THUDM/CogVideo/blob/main/inference/convert_demo.py)
-- [OpenSora Scripts](https://github.com/hpcaitech/Open-Sora/blob/main/scripts/inference.py)
 ## 🤝 Acknowledgements
 We would like to express our gratitude to the following open-source projects that have been instrumental in the development of our project:

 <div align="center">
 <h2><center>Tora: Trajectory-oriented Diffusion Transformer for Video Generation</center></h2>
 <a href='https://modelscope.cn/models/xiaoche/Tora'><img src='https://img.shields.io/badge/🤖_ModelScope-weights-%23654dfc'></a>
 <a href='https://huggingface.co/Le0jc/Tora'><img src='https://img.shields.io/badge/🤗_HuggingFace-weights-%23ff9e0e'></a>
 </div>
+This is the official repository for the paper "Tora: Trajectory-oriented Diffusion Transformer for Video Generation".
+## Please visit our [GitHub repo](https://github.com/alibaba/Tora) for more details.
 ## 💡 Abstract
 ## 📣 Updates
+- `2024/12/09` 🔥🔥The Diffusers version of Tora and the corresponding model weights have been released. Inference VRAM requirements are reduced to around 5 GiB. See the [Diffusers README](diffusers-version/README.md) for details.
+- `2024/11/25` 🔥Text-to-Video training code released.
 - `2024/10/31` Model weights uploaded to [HuggingFace](https://huggingface.co/Le0jc/Tora). We also provided an English demo on [ModelScope](https://www.modelscope.cn/studios/Alibaba_Research_Intelligence_Computing/Tora_En).
 - `2024/10/23` 🔥🔥Our [ModelScope Demo](https://www.modelscope.cn/studios/xiaoche/Tora) has launched. Feel free to try it out! We have also uploaded the model weights to [ModelScope](https://www.modelscope.cn/models/xiaoche/Tora).
 - `2024/10/21` Thanks to [@kijai](https://github.com/kijai) for supporting Tora in ComfyUI! [Link](https://github.com/kijai/ComfyUI-CogVideoXWrapper)
 - `2024/08/27` We released the v2 version of our paper, including an appendix.
 - `2024/07/31` We submitted our paper to arXiv and released our project page.
 ## 🎞️ Showcases
 https://github.com/user-attachments/assets/949d5e99-18c9-49d6-b669-9003ccd44bf1
 All videos are available in this [Link](https://cloudbook-public-daily.oss-cn-hangzhou.aliyuncs.com/Tora_t2v/showcases.zip)
 ## 🤝 Acknowledgements
 We would like to express our gratitude to the following open-source projects that have been instrumental in the development of our project: