Upload README.md with huggingface_hub
README.md CHANGED
@@ -15,7 +15,7 @@ A collection of GGUF and quantizations for [`jina-embeddings-v4`](https://huggin
 
 ## Text-Only Task-Specific Models
 
-
+We removed the visual components of `qwen2.5-vl` and merged all LoRA adapters back into the base language model. This results in three task-specific v4 models with 3.09B parameters, downsized from the original jina-embeddings-v4's 3.75B parameters:
 
 | HuggingFace Repo | Task |
 |---|---|
@@ -108,3 +108,7 @@ To some users, ⚠️ indicates a somewhat surprising behavior where `prompt_nam
 ### Matryoshka embeddings
 
 Note that v4 is trained with Matryoshka embeddings, and converting to GGUF doesn't break this feature. If you get embeddings with shape `NxD`, you can simply use `embeddings[:, :truncate_dim]` to get smaller truncated embeddings. Not every dimension is trained, though: for v4, you can set `truncate_dim` to any of `[128, 256, 512, 1024, 2048]`.
+
+### Quantizations
+
+We use [`llama-quantize`](./quantize.sh) with an `imatrix` to quantize the models from float16. The `imatrix` is generated with `llama-imatrix -m jina-embeddings-v4-text-retrieval-F16.gguf -f calibration_data_v5_rc.txt -ngl 99 --no-ppl -o imatrix-retrieval-512.dat`; `calibration_data_v5_rc.txt` can be found [here](https://gist.github.com/tristandruyen/9e207a95c7d75ddf37525d353e00659c/) and is the calibration set recommended by the Unsloth docs.
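As a side note on the Matryoshka section in the diff above: the truncation is just a slice along the embedding dimension. Below is a minimal NumPy sketch, assuming `embeddings` is the `NxD` float array returned by the model and that the full-size embeddings are L2-normalized; the helper name and the re-normalization step are illustrative, not part of the README or the model's API.

```python
import numpy as np

def truncate_embeddings(embeddings: np.ndarray, truncate_dim: int) -> np.ndarray:
    """Slice NxD Matryoshka embeddings down to their first `truncate_dim` dimensions.

    For v4, `truncate_dim` should be one of [128, 256, 512, 1024, 2048].
    """
    truncated = embeddings[:, :truncate_dim]
    # Re-normalize so cosine similarity on the truncated vectors behaves as expected
    # (assumes the full-size embeddings were L2-normalized to begin with).
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)

# Example: four 2048-dim embeddings truncated to 128 dims.
embeddings = np.random.rand(4, 2048).astype(np.float32)
print(truncate_embeddings(embeddings, truncate_dim=128).shape)  # (4, 128)
```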