---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- rag
base_model: sentence-transformers/all-MiniLM-L6-v2
pipeline_tag: sentence-similarity
library_name: sentence-transformers
license: apache-2.0
language:
- en
---
# The Fastest Text Embedding Model: tabularisai/all-MiniLM-L2-v2
This model is distilled from [sentence-transformers/all-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2) and delivers **almost 2 times faster inference** than all-MiniLM-L6-v2, the smallest of the original all-MiniLM models, while maintaining strong accuracy on both CPU and GPU.
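To check the speed claim on your own hardware, a small timing helper along these lines may be useful. This is a minimal sketch, not the benchmark used by the authors: the function name `sentences_per_second` and the `repeats` parameter are hypothetical, and it accepts any callable that encodes a list of strings.

```python
import time
from typing import Callable, Sequence


def sentences_per_second(encode: Callable[[Sequence[str]], object],
                         sentences: Sequence[str],
                         repeats: int = 3) -> float:
    """Return the best observed throughput of `encode` in sentences per second."""
    encode(sentences)  # warm-up call so one-time setup cost is not timed
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        encode(sentences)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero elapsed time on very fast callables
    return len(sentences) / max(best, 1e-12)
```

Passing `SentenceTransformer(name).encode` for each model name, with the same sentences and device, would give comparable numbers. Results depend heavily on batch size, sequence length, and hardware.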
## Usage
### Retrieval-Augmented Generation (RAG) Example
Use this model as a retriever in a RAG pipeline:
```python
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np
# Load embedding model
model = SentenceTransformer("tabularisai/all-MiniLM-L2-v2")
# Your 5 simple documents
documents = [
    "Renewable energy comes from natural sources.",
    "Solar panels convert sunlight into electricity.",
    "Wind turbines harness wind power.",
    "Fossil fuels are non-renewable sources of energy.",
    "Hydropower uses water to generate electricity.",
]
# Embed documents
doc_embeddings = model.encode(documents, convert_to_numpy=True)
# Create FAISS index
dim = doc_embeddings.shape[1]
index = faiss.IndexFlatL2(dim)
index.add(doc_embeddings)
# Query
query = "What are the benefits of renewable energy?"
query_embedding = model.encode([query], convert_to_numpy=True)
# Search top 3 similar docs (IndexFlatL2 returns squared L2 distances; lower is closer)
D, I = index.search(query_embedding, k=3)
# Print results
print("Query:", query)
print("\nTop 3 similar documents:")
for rank, idx in enumerate(I[0]):
    print(f"{rank+1}. {documents[idx]} (distance: {D[0][rank]:.4f})")
```
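`faiss.IndexFlatL2` reports squared Euclidean distances, so smaller values mean closer matches. If the embeddings are unit-normalized (an assumption here; passing `normalize_embeddings=True` to `model.encode` would ensure it), those distances can be mapped to cosine similarities. The helper name `l2_squared_to_cosine` below is hypothetical, and the vectors are hand-made for illustration, not model output.

```python
import numpy as np


def l2_squared_to_cosine(squared_distances) -> np.ndarray:
    """Convert squared L2 distances between unit vectors to cosine similarities."""
    # For unit vectors a and b: ||a - b||^2 = 2 - 2 * cos(a, b)
    return 1.0 - np.asarray(squared_distances, dtype=np.float64) / 2.0


# Toy check with two orthogonal unit vectors
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
squared = np.sum((a - b) ** 2)          # squared L2 distance between a and b
cosine = l2_squared_to_cosine(squared)  # cosine similarity recovered from it
```

In the RAG example, this could be applied to the `D` array returned by `index.search`, provided both document and query embeddings were normalized.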
### Sentence Embedding Example
Install the library:
```bash
pip install -U sentence-transformers
```
Load the model and encode sentences:
```python
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("tabularisai/all-MiniLM-L2-v2")
sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 384)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # torch.Size([3, 3])
```
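The similarity matrix holds one score per sentence pair. One way to read it is to rank the distinct pairs by score, as in the sketch below. The function name `rank_pairs` is hypothetical, and the matrix is hand-made for illustration, not real model output.

```python
import numpy as np


def rank_pairs(similarities: np.ndarray) -> list[tuple[int, int, float]]:
    """Return (i, j, score) for every pair i < j, most similar first."""
    sims = np.asarray(similarities, dtype=np.float64)
    # Upper triangle without the diagonal: each pair once, no self-matches
    rows, cols = np.triu_indices(sims.shape[0], k=1)
    order = np.argsort(-sims[rows, cols])
    return [(int(rows[n]), int(cols[n]), float(sims[rows[n], cols[n]])) for n in order]


# Hand-made symmetric similarity matrix for three sentences
toy = np.array([
    [1.0, 0.7, 0.1],
    [0.7, 1.0, 0.2],
    [0.1, 0.2, 1.0],
])
for i, j, score in rank_pairs(toy):
    print(f"{i} <-> {j}: {score:.2f}")
```

With the real model, `model.similarity` returns a tensor, so converting it first (for example with `similarities.cpu().numpy()`) should make it usable here.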
<table align="center">
<tr>
<td align="center">
<a href="https://www.linkedin.com/company/tabularis-ai/">
<img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/linkedin.svg" alt="LinkedIn" width="30" height="30">
</a>
</td>
<td align="center">
<a href="https://x.com/tabularis_ai">
<img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/x.svg" alt="X" width="30" height="30">
</a>
</td>
<td align="center">
<a href="https://github.com/tabularis-ai">
<img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/github.svg" alt="GitHub" width="30" height="30">
</a>
</td>
<td align="center">
<a href="https://tabularis.ai">
<img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/internetarchive.svg" alt="Website" width="30" height="30">
</a>
</td>
</tr>
</table>