ariG23498 and pcuenq committed
Commit 50eeb46 · verified · 1 Parent(s): 77db6d9

Update README.md (#1)


- Update README.md (f7c86bcc3a32ed6c29001e44398be83fac3f0894)


Co-authored-by: Pedro Cuenca <[email protected]>

Files changed (1)
  1. README.md +4 -6
README.md CHANGED
@@ -9,21 +9,19 @@ base_model:
  pipeline_tag: video-classification
  ---
 
- # Fine Tuned V-JEPA 2 on UCF101 Sebset
+ # Fine Tuned V-JEPA 2 on UCF101 Subset
 
  A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of [VJEPA](https://ai.meta.com/blog/v-jepa-yann-lecun-ai-model-video-joint-embedding-predictive-architecture/), resulting in state-of-the-art video understanding capabilities, leveraging data and model sizes at scale.
  The code is released [in this repository](https://github.com/facebookresearch/vjepa2).
 
- <div style="background-color: rgba(251, 255, 120, 0.4); padding: 10px; color: black; border-radius: 5px; box-shadow: 0 4px 8px rgba(0,0,0,0.1);">
- 💡 This is V-JEPA 2 model with video classification head pretrained on <a href="https://paperswithcode.com/dataset/something-something-v2" style="color: black;">Something-Something-V2</a> dataset.
- </div>
- <br></br>
+ The base model we used is [vjepa2-vitl-fpc16-256-ssv2](https://hf.co/qubvel-hf/vjepa2-vitl-fpc16-256-ssv2), a V-JEPA 2 model pretrained on the <a href="https://paperswithcode.com/dataset/something-something-v2" style="color: black;">Something-Something-V2</a> dataset.
+ We further fine-tuned this model on a subset of [UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild](https://huggingface.co/papers/1212.0402). This dataset contains just 400 short videos (in total) across 10 different categories.
 
  <img src="https://dl.fbaipublicfiles.com/vjepa2/vjepa2-pretrain.gif">&nbsp;
 
  ## Installation
 
- To run V-JEPA 2 model, ensure you have installed the latest transformers:
+ To run this V-JEPA 2 model, ensure you have installed the latest transformers:
 
  ```bash
  pip install -U git+https://github.com/huggingface/transformers
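
For orientation, here is how the installation step in the updated README would typically be exercised once that transformers build is installed. This is a minimal sketch, not part of the commit: it loads the base checkpoint named in the diff (the fine-tuned model's own Hub id is not stated here, so substitute it as needed) and assumes the `AutoVideoProcessor` and `AutoModelForVideoClassification` classes from recent transformers releases that ship the V-JEPA 2 integration.

```python
import numpy as np
import torch
from transformers import AutoModelForVideoClassification, AutoVideoProcessor

# The base checkpoint named in the README; swap in the fine-tuned model's
# Hub id (not stated in this diff) to classify the 10 UCF101-subset actions.
model_id = "qubvel-hf/vjepa2-vitl-fpc16-256-ssv2"

processor = AutoVideoProcessor.from_pretrained(model_id)
model = AutoModelForVideoClassification.from_pretrained(model_id)
model.eval()

# Dummy clip: 16 RGB frames at 256x256, matching the fpc16-256 checkpoint naming.
video = np.random.randint(0, 256, size=(16, 256, 256, 3), dtype=np.uint8)

inputs = processor(video, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = logits.argmax(-1).item()
print(model.config.id2label[predicted_class])
```

In practice the dummy array would be replaced by frames decoded from a real video (for example with torchvision or decord) before being passed to the processor.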