prithivMLmods
/

Qwen2-VL-OCR-2B-Instruct

@@ -15,12 +15,17 @@ tags:
 - Latex
 - VLM
 - Plain_Text
 ---
-# Qwen2-VL-OCR-2B-Instruct [ VL / OCR ]
 ![aaaaaaaaaaa.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/s42kASSQCoJAyYMJkoEuD.png)
-The **Qwen2-VL-OCR-2B-Instruct** model is a fine-tuned version of **Qwen/Qwen2-VL-2B-Instruct**, tailored for tasks that involve **Optical Character Recognition (OCR)**, **image-to-text conversion**, and **math problem solving with LaTeX formatting**. This model integrates a conversational approach with visual and textual understanding to handle multi-modal tasks effectively.
 #### Key Enhancements:
@@ -32,6 +37,11 @@ The **Qwen2-VL-OCR-2B-Instruct** model is a fine-tuned version of **Qwen/Qwen2-V
 * **Multilingual Support**: to serve global users, besides English and Chinese, Qwen2-VL now supports the understanding of texts in different languages inside images, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc.
 | **File Name**             | **Size**   | **Description**                                 | **Upload Status** |
 |---------------------------|------------|------------------------------------------------|-------------------|
 | `.gitattributes`          | 1.52 kB   | Configures LFS tracking for specific model files. | Initial commit    |
@@ -46,11 +56,7 @@ The **Qwen2-VL-OCR-2B-Instruct** model is a fine-tuned version of **Qwen/Qwen2-V
 | `vocab.json`              | 2.78 MB   | Vocabulary file for tokenization.               | Uploaded          |
 ---
-### Sample Inference with Doc
-![123.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/TlsmcTqoQMvaBhwo8tGeU.png)
-**📍Demo**: https://huggingface.co/prithivMLmods/Qwen2-VL-OCR-2B-Instruct/blob/main/Demo/ocrtest_qwen.ipynb
 ### How to Use
 ```python

 - Latex
 - VLM
 - Plain_Text
+- KIE
+- Equations
+- VQA
 ---
+# **Qwen2-VL-OCR-2B-Instruct [ VL / OCR ]**
 ![aaaaaaaaaaa.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/s42kASSQCoJAyYMJkoEuD.png)
+> The **Qwen2-VL-OCR-2B-Instruct** model is a fine-tuned version of **Qwen/Qwen2-VL-2B-Instruct**, tailored for tasks that involve **Optical Character Recognition (OCR)**, **image-to-text conversion**, and **math problem solving with LaTeX formatting**. This model integrates a conversational approach with visual and textual understanding to handle multi-modal tasks effectively.
+[![Open Demo in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://huggingface.co/prithivMLmods/Qwen2-VL-OCR-2B-Instruct/blob/main/Demo/ocrtest_qwen.ipynb)
 #### Key Enhancements:
 * **Multilingual Support**: to serve global users, besides English and Chinese, Qwen2-VL now supports the understanding of texts in different languages inside images, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc.
+### Sample Inference
+![123.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/TlsmcTqoQMvaBhwo8tGeU.png)
 | **File Name**             | **Size**   | **Description**                                 | **Upload Status** |
 |---------------------------|------------|------------------------------------------------|-------------------|
 | `.gitattributes`          | 1.52 kB   | Configures LFS tracking for specific model files. | Initial commit    |
 | `vocab.json`              | 2.78 MB   | Vocabulary file for tokenization.               | Uploaded          |
 ---
 ### How to Use
 ```python