Update README.md (#1)
- Update README.md (f7c86bcc3a32ed6c29001e44398be83fac3f0894)
Co-authored-by: Pedro Cuenca
README.md
CHANGED
@@ -9,21 +9,19 @@ base_model:
 pipeline_tag: video-classification
 ---
 
-# Fine Tuned V-JEPA 2 on UCF101
+# Fine Tuned V-JEPA 2 on UCF101 Subset
 
 A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of [VJEPA](https://ai.meta.com/blog/v-jepa-yann-lecun-ai-model-video-joint-embedding-predictive-architecture/), resulting in state-of-the-art video understanding capabilities, leveraging data and model sizes at scale.
 The code is released [in this repository](https://github.com/facebookresearch/vjepa2).
 
-
-
-</div>
-<br></br>
+The base model we used is [vjepa2-vitl-fpc16-256-ssv2](https://hf.co/qubvel-hf/vjepa2-vitl-fpc16-256-ssv2), a V-JEPA 2 model pretrained on the <a href="https://paperswithcode.com/dataset/something-something-v2" style="color: black;">Something-Something-V2</a> dataset.
+We further fine-tuned this model on a subset of [UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild](https://huggingface.co/papers/1212.0402). This dataset contains just 400 short videos (in total) across 10 different categories.
 
 <img src="https://dl.fbaipublicfiles.com/vjepa2/vjepa2-pretrain.gif">
 
 ## Installation
 
-To run V-JEPA 2 model, ensure you have installed the latest transformers:
+To run this V-JEPA 2 model, ensure you have installed the latest transformers:
 
 ```bash
 pip install -U git+https://github.com/huggingface/transformers