---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- rag
base_model: sentence-transformers/all-MiniLM-L6-v2
pipeline_tag: sentence-similarity
library_name: sentence-transformers
license: apache-2.0
language:
- en
---

# The Fastest Text Embedding Model: tabularisai/all-MiniLM-L2-v2

This model is distilled from [sentence-transformers/all-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2). It delivers **almost 2× faster inference** on both CPU and GPU than all-MiniLM-L6-v2, the smallest model in the all-MiniLM series, while maintaining strong accuracy.

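
Actual speedups depend on hardware, batch size, and sequence length, so it can be worth measuring throughput locally. The following is a minimal sketch, not an official benchmark: `sentences_per_second` is a hypothetical helper that times any encoding callable, and the `repeats` parameter and warm-up run are assumptions chosen for illustration.

```python
import time


def sentences_per_second(encode_fn, sentences, repeats=3):
    """Return the best observed throughput of encode_fn over several runs."""
    encode_fn(sentences)  # warm-up run, excluded from timing
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        encode_fn(sentences)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero elapsed time on very coarse clocks
    return len(sentences) / max(best, 1e-9)
```

Passing `model.encode` from two loaded `SentenceTransformer` models, together with the same list of sentences, gives a rough side-by-side comparison on your own machine.
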
## Usage

### Retrieval-Augmented Generation (RAG) Example

Use this model as a retriever in a RAG pipeline:

```python
# Requires: pip install -U sentence-transformers faiss-cpu
from sentence_transformers import SentenceTransformer
import faiss

# Load the embedding model
model = SentenceTransformer("tabularisai/all-MiniLM-L2-v2")

# Five short example documents
documents = [
    "Renewable energy comes from natural sources.",
    "Solar panels convert sunlight into electricity.",
    "Wind turbines harness wind power.",
    "Fossil fuels are non-renewable sources of energy.",
    "Hydropower uses water to generate electricity."
]

# Embed the documents as a NumPy array
doc_embeddings = model.encode(documents, convert_to_numpy=True)

# Build an exact (brute-force) L2 index over the document embeddings
dim = doc_embeddings.shape[1]
index = faiss.IndexFlatL2(dim)
index.add(doc_embeddings)

# Embed the query as a batch of one
query = "What are the benefits of renewable energy?"
query_embedding = model.encode([query], convert_to_numpy=True)

# Retrieve the 3 nearest documents:
# D holds squared L2 distances, I holds the matching document indices
D, I = index.search(query_embedding, k=3)

# Print results (a smaller distance means a closer match)
print("Query:", query)
print("\nTop 3 similar documents:")
for rank, idx in enumerate(I[0]):
    print(f"{rank+1}. {documents[idx]} (squared L2 distance: {D[0][rank]:.4f})")
```
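
The retrieval step covers only the first half of a RAG pipeline; the retrieved passages still have to be handed to a generator. The sketch below shows one plausible way to do that. `build_rag_prompt` and its prompt wording are hypothetical and not part of this model or of sentence-transformers, and the `retrieved` list stands in for the documents selected by the index search.

```python
def build_rag_prompt(query, retrieved_docs):
    """Format retrieved passages and the user query into one prompt string."""
    # Number each passage so the generator can refer to it
    context = "\n".join(
        f"[{i}] {doc}" for i, doc in enumerate(retrieved_docs, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Stand-in for the passages returned by the retriever (assumed for illustration)
retrieved = [
    "Renewable energy comes from natural sources.",
    "Solar panels convert sunlight into electricity.",
]
prompt = build_rag_prompt("What are the benefits of renewable energy?", retrieved)
print(prompt)
```

The resulting string can be sent to whichever text-generation model you use; the embedding model itself does not generate answers.
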

### Sentence Embedding Example

Install the library:

```bash
pip install -U sentence-transformers
```

Load the model and encode sentences:

```python
from sentence_transformers import SentenceTransformer

# Load the embedding model
model = SentenceTransformer("tabularisai/all-MiniLM-L2-v2")

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]

# Encode each sentence into a 384-dimensional vector
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 384)

# Pairwise similarity between all sentences
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # torch.Size([3, 3])
```
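
To make the similarity matrix less of a black box, here is a minimal NumPy sketch of pairwise cosine similarity. It assumes `model.similarity` uses cosine similarity, which is the default in recent sentence-transformers releases unless the model configuration specifies another function. `cosine_similarity_matrix` is a hypothetical helper, and the `toy` vectors are made up for illustration.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    # Scale every row to unit length so the dot product equals the cosine
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


# Toy 2-dimensional vectors standing in for real sentence embeddings
toy = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
sims = cosine_similarity_matrix(toy, toy)
print(sims.shape)  # (3, 3)
```

Each diagonal entry is 1 because every vector is compared with itself, and orthogonal vectors score 0.
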

<table align="center">
  <tr>
    <td align="center">
      <a href="https://www.linkedin.com/company/tabularis-ai/">
        <img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/linkedin.svg" alt="LinkedIn" width="30" height="30">
      </a>
    </td>
    <td align="center">
      <a href="https://x.com/tabularis_ai">
        <img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/x.svg" alt="X" width="30" height="30">
      </a>
    </td>
    <td align="center">
      <a href="https://github.com/tabularis-ai">
        <img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/github.svg" alt="GitHub" width="30" height="30">
      </a>
    </td>
    <td align="center">
      <a href="https://tabularis.ai">
        <img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/internetarchive.svg" alt="Website" width="30" height="30">
      </a>
    </td>
  </tr>
</table>