hanxiao committed · 649c74e · verified · Parent(s): 8864c0c

Upload README.md with huggingface_hub

Files changed (1): README.md (+5 -1)
README.md CHANGED
@@ -15,7 +15,7 @@ A collection of GGUF and quantizations for [`jina-embeddings-v4`](https://huggin
 
 ## Text-Only Task-Specific Models
 
-Here, we removed the visual components of qwen2.5-vl and merged all LoRA adapters back into the base language model. This results in three task-specific v4 models with 3.09B parameters, downsized from the original jina-embeddings-v4 3.75B parameters:
+We removed the visual components of `qwen2.5-vl` and merged all LoRA adapters back into the base language model. This results in three task-specific v4 models with 3.09B parameters, down from the original jina-embeddings-v4's 3.75B:
 
 | HuggingFace Repo | Task |
 |---|---|
@@ -108,3 +108,7 @@ To some users, ⚠️ indicates a somewhat surprising behavior where `prompt_nam
 ### Matryoshka embeddings
 
 Note that v4 is trained with Matryoshka embeddings, and converting to GGUF doesn't break this feature. If you get embeddings of shape `NxD`, you can simply take `embeddings[:, :truncate_dim]` to get smaller truncated embeddings. Note that not every dimension is trained, though: for v4, `truncate_dim` can be any of `[128, 256, 512, 1024, 2048]`.
+
+### Quantizations
+
+We use [`llama-quantize`](./quantize.sh) with an importance matrix (`imatrix`) to quantize the models from float16. The `imatrix` is generated by `llama-imatrix -m jina-embeddings-v4-text-retrieval-F16.gguf -f calibration_data_v5_rc.txt -ngl 99 --no-ppl -o imatrix-retrieval-512.dat`; `calibration_data_v5_rc.txt` can be found [here](https://gist.github.com/tristandruyen/9e207a95c7d75ddf37525d353e00659c/) and is the calibration set recommended by the Unsloth docs.
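
For readers unfamiliar with the LoRA-merging step referenced in the first hunk, below is a minimal sketch of folding an adapter back into its base model with `peft`'s `merge_and_unload()`. This is not the authors' actual pipeline: the repo ids are hypothetical placeholders, and stripping the Qwen2.5-VL vision tower is a separate surgery not shown here.

```python
# Minimal sketch of merging a LoRA adapter into its base model (not the
# authors' actual pipeline; repo ids below are hypothetical placeholders).
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "some-org/base-language-model",  # placeholder for the extracted LM tower
    torch_dtype=torch.float16,
)
model = PeftModel.from_pretrained(base, "some-org/task-lora")  # placeholder adapter
model = model.merge_and_unload()  # fold the LoRA deltas into the base weights
model.save_pretrained("merged-task-model")  # ready for GGUF conversion
```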
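To make the Matryoshka note concrete, here is a small Python sketch of the truncation. The random array stands in for real v4 embeddings, and the re-normalization step is an assumption reflecting common practice when scoring with cosine similarity, not something the README prescribes.

```python
import numpy as np

# Suppose `embeddings` came from any v4 GGUF runner as an N x 2048 float array;
# here we fake one for demonstration.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(4, 2048)).astype(np.float32)

truncate_dim = 512  # must be one of the trained dims: 128, 256, 512, 1024, 2048

# Matryoshka truncation: keep only the leading dimensions.
truncated = embeddings[:, :truncate_dim]

# Re-normalize so cosine similarity / dot product remain comparable
# (typical usage, assumed here rather than mandated by the README).
truncated /= np.linalg.norm(truncated, axis=1, keepdims=True)

print(truncated.shape)  # (4, 512)
```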
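And a sketch of how the two quantization commands fit together. The first command is the `llama-imatrix` invocation quoted above; the `llama-quantize` call uses stock llama.cpp flags, but the `Q4_K_M` type and the output filename are illustrative guesses, not the exact contents of `quantize.sh`.

```bash
# Step 1: build the importance matrix from the calibration set (as in the README).
llama-imatrix -m jina-embeddings-v4-text-retrieval-F16.gguf \
  -f calibration_data_v5_rc.txt -ngl 99 --no-ppl \
  -o imatrix-retrieval-512.dat

# Step 2: quantize float16 -> Q4_K_M, guided by the imatrix.
# (Quant type and output name are illustrative; quantize.sh defines the real set.)
llama-quantize --imatrix imatrix-retrieval-512.dat \
  jina-embeddings-v4-text-retrieval-F16.gguf \
  jina-embeddings-v4-text-retrieval-Q4_K_M.gguf \
  Q4_K_M
```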