---
license: apache-2.0
datasets:
- lmms-lab/LLaVA-Video-178K
language:
- en
base_model:
- Qwen/Qwen2-VL-7B
tags:
- qwen2_vl
- multimodal
- conversational
---

# Model Card

This model was obtained by fine-tuning [Qwen2-VL-7B-Base](https://huggingface.co/Qwen/Qwen2-VL-7B) on [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-Video-178K). It serves as a comparison baseline in the [LiveCC](https://showlab.github.io/livecc) project.

# Performance



# Acknowledgement

[Joya Chen](https://chenjoya.github.io/) built the training code, and Yiqi Lin trained the model. The QA evaluation was done by [Joya Chen](https://chenjoya.github.io/), and the CC evaluation by Ziyun Zeng. Infra is supported by the company.