Divyasreepat committed on
Commit
c1ed33f
·
verified ·
1 Parent(s): d338bd6

Update README.md with new model card content

Files changed (1)
  1. README.md +90 -8
README.md CHANGED
@@ -1,11 +1,93 @@
1
  ---
2
  library_name: keras-hub
3
  ---
4
- This is a [`SigLIP` model](https://keras.io/api/keras_hub/models/sig_lip) uploaded using the KerasHub library and can be used with JAX, TensorFlow, and PyTorch backends.
5
- Model config:
6
- * **name:** sig_lip_backbone
7
- * **trainable:** True
8
- * **vision_encoder:** {'module': 'keras_hub.src.models.siglip.siglip_vision_encoder', 'class_name': 'SigLIPVisionEncoder', 'config': {'name': 'sig_lip_vision_encoder', 'trainable': True, 'patch_size': 16, 'hidden_dim': 768, 'num_layers': 12, 'num_heads': 12, 'intermediate_dim': 3072, 'intermediate_activation': 'gelu_approximate', 'layer_norm_epsilon': 1e-06, 'image_shape': [512, 512, 3]}, 'registered_name': 'keras_hub>SigLIPVisionEncoder'}
9
- * **text_encoder:** {'module': 'keras_hub.src.models.siglip.siglip_text_encoder', 'class_name': 'SigLIPTextEncoder', 'config': {'name': 'sig_lip_text_encoder', 'trainable': True, 'vocabulary_size': 32000, 'embedding_dim': 768, 'hidden_dim': 768, 'num_layers': 12, 'num_heads': 12, 'intermediate_dim': 3072, 'intermediate_activation': 'gelu_approximate', 'layer_norm_epsilon': 1e-06, 'max_sequence_length': 64}, 'registered_name': 'keras_hub>SigLIPTextEncoder'}
10
-
11
- This model card has been generated automatically and should be completed by the model author. See [Model Cards documentation](https://huggingface.co/docs/hub/model-cards) for more information.
1
  ---
2
  library_name: keras-hub
3
  ---
4
+ ## Model Overview
5
+ SigLIP model pre-trained on WebLI at resolution 512x512. It was introduced in the paper [Sigmoid Loss for Language Image Pre-Training](https://arxiv.org/abs/2303.15343) by Zhai et al. and first released in this [repository](https://github.com/google-research/big_vision).
6
+ SigLIP is a multimodal model that follows [CLIP](https://huggingface.co/docs/transformers/model_doc/clip) but replaces its softmax contrastive loss with a pairwise sigmoid loss. The sigmoid loss scores each image-text pair independently and does not require a global view of the pairwise similarities for normalization. This allows the batch size to be scaled up further, while also performing better at smaller batch sizes.
7
+ A TLDR of SigLIP by one of the authors can be found [here](https://twitter.com/giffmana/status/1692641733459267713).
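To make the pairwise nature of the sigmoid loss concrete, here is a minimal NumPy sketch of the loss as described in the SigLIP paper. It is an illustration only, not the KerasHub or big_vision implementation: the function name `sigmoid_pairwise_loss` is hypothetical, and the temperature (10) and bias (-10) are the initial values reported in the paper, treated here as fixed constants rather than learned parameters.

```python
import numpy as np


def sigmoid_pairwise_loss(image_emb, text_emb, temperature=10.0, bias=-10.0):
    """Sketch of the pairwise sigmoid loss for a batch of matching pairs.

    Row i of `image_emb` is assumed to match row i of `text_emb`.
    `temperature` and `bias` are fixed here (assumption); in training
    they would be learnable scalars.
    """
    # L2-normalize both sets of embeddings.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=-1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=-1, keepdims=True)

    # One logit per (image, text) pair in the batch.
    logits = temperature * image_emb @ text_emb.T + bias

    # Labels: +1 on the diagonal (matching pairs), -1 everywhere else.
    n = logits.shape[0]
    labels = 2.0 * np.eye(n) - 1.0

    # -log(sigmoid(x)) == log(1 + exp(-x)), computed stably with logaddexp.
    per_pair = np.logaddexp(0.0, -labels * logits)

    # Every pair is an independent binary problem; no softmax over the
    # batch is needed. The sum is normalized by the batch size.
    return per_pair.sum() / n


# Toy batch: three orthogonal unit vectors, perfectly aligned pairs.
aligned = sigmoid_pairwise_loss(np.eye(3), np.eye(3))
# Same embeddings, but every text is paired with the wrong image.
shuffled = sigmoid_pairwise_loss(np.eye(3), np.roll(np.eye(3), 1, axis=0))
print(aligned, shuffled)
```

Because each entry of `per_pair` depends only on its own logit, the loss can be computed in chunks across devices without gathering the full similarity matrix for normalization.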
8
+
9
+ Both the weights and the Keras model code are released under the [Apache 2 License](https://github.com/keras-team/keras-hub/blob/master/LICENSE).
10
+
11
+ ## Links
12
+
13
+ * [SigLIP Quickstart Notebook](https://www.kaggle.com/code/laxmareddypatlolla/siglip-quickstart-notebook-with-hub)
14
+ * SigLIP API Documentation (coming soon)
15
+ * [SigLIP Paper](https://arxiv.org/abs/2303.15343)
16
+ * [KerasHub Beginner Guide](https://keras.io/guides/keras_hub/getting_started/)
17
+ * [KerasHub Model Publishing Guide](https://keras.io/guides/keras_hub/upload/)
18
+
19
+ ## Installation
20
+
21
+ Keras and KerasHub can be installed with:
22
+
23
+ ```
+ pip install -U -q keras-hub
+ pip install -U -q keras
+ ```
27
+
28
+ JAX, TensorFlow, and PyTorch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the [Keras Getting Started](https://keras.io/getting_started/) page.
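Keras selects its backend when it is first imported, so the choice has to be made before that import. A minimal sketch, assuming the standard `KERAS_BACKEND` environment variable of Keras 3 and picking JAX arbitrarily as the default:

```python
import os

# Assumption: Keras 3 reads KERAS_BACKEND once, at import time.
# Supported values include "jax", "tensorflow", and "torch".
# setdefault keeps any backend that is already configured.
os.environ.setdefault("KERAS_BACKEND", "jax")

# `import keras` and `import keras_hub` must come after this point.
selected_backend = os.environ["KERAS_BACKEND"]
print(f"Keras will use the {selected_backend} backend")
```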
29
+
30
+ ## Presets
31
+
32
+ The following model checkpoints are provided by the Keras team. Full code examples for each are available below.
33
+
34
+ | Preset name | Parameters | Description |
+ |-------------|------------|-------------|
+ | `siglip_base_patch16_512` | | SigLIP base model with 16x16 patches and 512x512 input images, pre-trained on WebLI. |
37
+
38
+ ## Example Usage
39
+ ```Python
+ import keras
+ import numpy as np
+ from keras_hub.models import SigLIPBackbone, SigLIPTokenizer
+ from keras_hub.layers import SigLIPImageConverter
+
+ # Instantiate the model and preprocessing tools.
+ siglip = SigLIPBackbone.from_preset("siglip_base_patch16_512")
+ tokenizer = SigLIPTokenizer.from_preset(
+     "siglip_base_patch16_512", sequence_length=64
+ )
+ image_converter = SigLIPImageConverter.from_preset("siglip_base_patch16_512")
+
+ # Tokenize the candidate captions.
+ tokens = tokenizer.tokenize(["mountains", "cat on tortoise", "house"])
+
+ # Load the image, add a batch dimension, and preprocess it for the model.
+ image = keras.utils.img_to_array(keras.utils.load_img("cat.jpg"))
+ image = image_converter(np.expand_dims(image, axis=0))
+
+ # Query the model for image-text similarities.
+ outputs = siglip({
+     "images": image,
+     "token_ids": tokens,
+ })
+ ```
65
+
66
+ ## Example Usage with Hugging Face URI
67
+
68
+ ```Python
+ import keras
+ import numpy as np
+ from keras_hub.models import SigLIPBackbone, SigLIPTokenizer
+ from keras_hub.layers import SigLIPImageConverter
+
+ # Instantiate the model and preprocessing tools from the Hugging Face Hub.
+ siglip = SigLIPBackbone.from_preset("hf://keras/siglip_base_patch16_512")
+ tokenizer = SigLIPTokenizer.from_preset(
+     "hf://keras/siglip_base_patch16_512", sequence_length=64
+ )
+ image_converter = SigLIPImageConverter.from_preset(
+     "hf://keras/siglip_base_patch16_512"
+ )
+
+ # Tokenize the candidate captions.
+ tokens = tokenizer.tokenize(["mountains", "cat on tortoise", "house"])
+
+ # Load the image, add a batch dimension, and preprocess it for the model.
+ image = keras.utils.img_to_array(keras.utils.load_img("cat.jpg"))
+ image = image_converter(np.expand_dims(image, axis=0))
+
+ # Query the model for image-text similarities.
+ outputs = siglip({
+     "images": image,
+     "token_ids": tokens,
+ })
+ ```
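The usage examples above stop at the forward pass. Because SigLIP is trained with a sigmoid loss, each image-text similarity score can be turned into an independent match probability with a sigmoid, rather than a softmax over all captions. The sketch below assumes the scores have already been extracted as an array of shape `(num_images, num_captions)`; the README does not show the structure of the backbone output, so the stand-in values and the helper names `logits_to_probabilities` and `best_captions` are hypothetical.

```python
import numpy as np


def logits_to_probabilities(logits):
    """Apply an element-wise sigmoid to image-text logits."""
    logits = np.asarray(logits, dtype="float64")
    # sigmoid(x) = exp(-log(1 + exp(-x))), written with logaddexp so that
    # large positive or negative logits do not overflow.
    return np.exp(-np.logaddexp(0.0, -logits))


def best_captions(logits, captions):
    """Return the highest-scoring caption for each image (row)."""
    probabilities = logits_to_probabilities(logits)
    return [captions[i] for i in np.argmax(probabilities, axis=-1)]


captions = ["mountains", "cat on tortoise", "house"]
# Stand-in logits for one image (assumption: not real model output).
example_logits = np.array([[-8.0, 2.0, -5.0]])

probabilities = logits_to_probabilities(example_logits)
for caption, p in zip(captions, probabilities[0]):
    print(f"{caption}: {p:.4f}")
print(best_captions(example_logits, captions))
```

Unlike softmax probabilities, these values need not sum to one across captions: several captions can score high for the same image, or none at all.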