ibm-granite/granite-vision-3.3-2b

dhirajjoshi116 committed on Jun 12

Commit 62b3ac4 · verified · 1 Parent(s): b17173e

Update README.md

Files changed (1)

README.md +1 -1

@@ -165,7 +165,7 @@ Granite-vision-3.3-2b introduces three new experimental capabilities:
 (1) Image segmentation: [A notebook showing a segmentation example](https://github.com/ibm-granite/granite-vision-models/blob/main/cookbooks/GraniteVision_Segmentation_Notebook.ipynb)
-(2) Doctags generation: Please see [Docling project](https://github.com/docling-project/docling) for more details on doctags.
+(2) Doctags generation: Parse document images to structured text in doctags format. Please see [Docling project](https://github.com/docling-project/docling) for more details on doctags.
 (3) Multipage support: The model was trained to handle question answering (QA) tasks using multiple consecutive pages from a document—up to 8 pages—given the demands of long-context processing. To support such long sequences without exceeding GPU memory limits, we recommend resizing images so that their longer dimension is 768 pixels.
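
The resizing recommendation in item (3) can be illustrated with a small helper. This is a minimal sketch, not code from the model card: the function name `fit_longer_side` and its parameters are hypothetical, and it assumes the aspect ratio should be preserved while the longer dimension is scaled to 768 pixels.

```python
from typing import Tuple


def fit_longer_side(width: int, height: int, longer_side: int = 768) -> Tuple[int, int]:
    """Return (new_width, new_height) whose longer dimension equals `longer_side`.

    Hypothetical helper: the aspect ratio is preserved, and images smaller
    than the target are scaled up as well as down (an assumption; the README
    only says the longer dimension should be 768 pixels).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    # Multiply before dividing so the longer side lands exactly on the target.
    new_width = max(1, round(width * longer_side / longest))
    new_height = max(1, round(height * longer_side / longest))
    return new_width, new_height


# Compute target sizes for a few page scans (dimensions are made-up examples).
pages = [(1700, 2200), (1024, 768), (500, 500)]
sizes = [fit_longer_side(w, h) for w, h in pages]
```

The resulting tuple can be passed to an image library's resize call, for example Pillow's `Image.resize`, for each of the (up to 8) pages before they are given to the processor.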