ariG23498 and pcuenq committed
Commit 50eeb46 · verified · 1 Parent(s): 77db6d9

Update README.md (#1)


- Update README.md (f7c86bcc3a32ed6c29001e44398be83fac3f0894)


Co-authored-by: Pedro Cuenca <[email protected]>

Files changed (1)
  1. README.md +4 -6
README.md CHANGED
@@ -9,21 +9,19 @@ base_model:
  pipeline_tag: video-classification
  ---
 
- # Fine Tuned V-JEPA 2 on UCF101 Sebset
+ # Fine Tuned V-JEPA 2 on UCF101 Subset
 
  A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of [VJEPA](https://ai.meta.com/blog/v-jepa-yann-lecun-ai-model-video-joint-embedding-predictive-architecture/), resulting in state-of-the-art video understanding capabilities, leveraging data and model sizes at scale.
  The code is released [in this repository](https://github.com/facebookresearch/vjepa2).
 
- <div style="background-color: rgba(251, 255, 120, 0.4); padding: 10px; color: black; border-radius: 5px; box-shadow: 0 4px 8px rgba(0,0,0,0.1);">
- 💡 This is V-JEPA 2 model with video classification head pretrained on <a href="https://paperswithcode.com/dataset/something-something-v2" style="color: black;">Something-Something-V2</a> dataset.
- </div>
- <br></br>
+ The base model we used is [vjepa2-vitl-fpc16-256-ssv2](https://hf.co/qubvel-hf/vjepa2-vitl-fpc16-256-ssv2), a V-JEPA 2 model pretrained on the <a href="https://paperswithcode.com/dataset/something-something-v2" style="color: black;">Something-Something-V2</a> dataset.
+ We further fine-tuned this model on a subset of [UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild](https://huggingface.co/papers/1212.0402). This dataset contains just 400 short videos (in total) across 10 different categories.
 
  <img src="https://dl.fbaipublicfiles.com/vjepa2/vjepa2-pretrain.gif">&nbsp;
 
  ## Installation
 
- To run V-JEPA 2 model, ensure you have installed the latest transformers:
+ To run this V-JEPA 2 model, ensure you have installed the latest transformers:
 
  ```bash
  pip install -U git+https://github.com/huggingface/transformers
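
For orientation, here is how the installation step in the updated README would typically be exercised once that transformers build is installed. This is a minimal sketch, not part of the commit: it loads the base checkpoint named in the diff (the fine-tuned model's own Hub id is not stated here, so substitute it as needed) and assumes the `AutoVideoProcessor` and `AutoModelForVideoClassification` classes from recent transformers releases that ship the V-JEPA 2 integration.

```python
import numpy as np
import torch
from transformers import AutoModelForVideoClassification, AutoVideoProcessor

# The base checkpoint named in the README; swap in the fine-tuned model's
# Hub id (not stated in this diff) to classify the 10 UCF101-subset actions.
model_id = "qubvel-hf/vjepa2-vitl-fpc16-256-ssv2"

processor = AutoVideoProcessor.from_pretrained(model_id)
model = AutoModelForVideoClassification.from_pretrained(model_id)
model.eval()

# Dummy clip: 16 RGB frames at 256x256, matching the fpc16-256 checkpoint naming.
video = np.random.randint(0, 256, size=(16, 256, 256, 3), dtype=np.uint8)

inputs = processor(video, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = logits.argmax(-1).item()
print(model.config.id2label[predicted_class])
```

In practice the dummy array would be replaced by frames decoded from a real video (for example with torchvision or decord) before being passed to the processor.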