---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- rag
base_model: sentence-transformers/all-MiniLM-L6-v2
pipeline_tag: sentence-similarity
library_name: sentence-transformers
license: apache-2.0
language:
- en
---
# The Fastest Text Embedding Model: tabularisai/all-MiniLM-L2-v2
This model is distilled from [sentence-transformers/all-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2), delivering **almost 2x faster inference** than the smallest original model, all-MiniLM-L6-v2, while maintaining strong accuracy on CPU and GPU.
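The speedup is easy to check on your own hardware. Below is a minimal, illustrative benchmark sketch (the corpus and batch size are arbitrary; absolute numbers depend on your CPU/GPU, batch size, and sequence length):

```python
import time
from sentence_transformers import SentenceTransformer

# Synthetic corpus for a rough throughput comparison (illustrative only)
sentences = ["Renewable energy comes from natural sources."] * 2000

for name in ["tabularisai/all-MiniLM-L2-v2", "sentence-transformers/all-MiniLM-L6-v2"]:
    model = SentenceTransformer(name)
    model.encode(sentences[:100])  # warm-up
    start = time.perf_counter()
    model.encode(sentences, batch_size=64)
    elapsed = time.perf_counter() - start
    print(f"{name}: {len(sentences) / elapsed:.0f} sentences/sec")
```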
## Usage
### Retrieval-Augmented Generation (RAG) Example
Use this model as a retriever in a RAG pipeline:
```python
from sentence_transformers import SentenceTransformer
import faiss

# Load the embedding model
model = SentenceTransformer("tabularisai/all-MiniLM-L2-v2")

# Five simple documents to index
documents = [
    "Renewable energy comes from natural sources.",
    "Solar panels convert sunlight into electricity.",
    "Wind turbines harness wind power.",
    "Fossil fuels are non-renewable sources of energy.",
    "Hydropower uses water to generate electricity.",
]

# Embed the documents
doc_embeddings = model.encode(documents, convert_to_numpy=True)

# Build a FAISS index (L2 distance)
dim = doc_embeddings.shape[1]
index = faiss.IndexFlatL2(dim)
index.add(doc_embeddings)

# Embed the query
query = "What are the benefits of renewable energy?"
query_embedding = model.encode([query], convert_to_numpy=True)

# Retrieve the top 3 most similar documents
D, I = index.search(query_embedding, k=3)

# Print results (D holds L2 distances: lower means more similar)
print("Query:", query)
print("\nTop 3 similar documents:")
for rank, idx in enumerate(I[0]):
    print(f"{rank+1}. {documents[idx]} (distance: {D[0][rank]:.4f})")
```
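Note that `IndexFlatL2` ranks documents by Euclidean distance (lower is closer). If you prefer cosine similarity, one common option is to normalize the embeddings and use an inner-product index instead; the sketch below uses the standard `normalize_embeddings` flag of `encode` together with FAISS's `IndexFlatIP`, with the same illustrative documents as above:

```python
import faiss
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("tabularisai/all-MiniLM-L2-v2")

documents = [
    "Renewable energy comes from natural sources.",
    "Solar panels convert sunlight into electricity.",
]

# Unit-normalize so that inner product equals cosine similarity
doc_embeddings = model.encode(documents, convert_to_numpy=True, normalize_embeddings=True)

index = faiss.IndexFlatIP(doc_embeddings.shape[1])  # inner-product index
index.add(doc_embeddings)

query_embedding = model.encode(
    ["What are the benefits of renewable energy?"],
    convert_to_numpy=True,
    normalize_embeddings=True,
)
scores, ids = index.search(query_embedding, k=2)
print(documents[ids[0][0]], scores[0][0])  # higher score = more similar
```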
### Sentence Embedding Example
Install the library:
```bash
pip install -U sentence-transformers
```
Load the model and encode sentences:
```python
from sentence_transformers import SentenceTransformer

# Load the model
model = SentenceTransformer("tabularisai/all-MiniLM-L2-v2")

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]

# Encode sentences into 384-dimensional embeddings
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 384)

# Compute pairwise similarity scores (cosine by default)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # (3, 3)
```
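If you don't need an external vector index, `sentence_transformers.util.semantic_search` can handle small-scale retrieval directly; the snippet below is a small illustrative example with this model:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("tabularisai/all-MiniLM-L2-v2")

corpus = [
    "Solar panels convert sunlight into electricity.",
    "Wind turbines harness wind power.",
    "He drove to the stadium.",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query_embedding = model.encode("How is solar power generated?", convert_to_tensor=True)

# Returns one result list per query: [{"corpus_id": ..., "score": ...}, ...]
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(f"{corpus[hit['corpus_id']]} (score: {hit['score']:.4f})")
```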
<table align="center">
<tr>
<td align="center">
<a href="https://www.linkedin.com/company/tabularis-ai/">
<img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/linkedin.svg" alt="LinkedIn" width="30" height="30">
</a>
</td>
<td align="center">
<a href="https://x.com/tabularis_ai">
<img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/x.svg" alt="X" width="30" height="30">
</a>
</td>
<td align="center">
<a href="https://github.com/tabularis-ai">
<img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/github.svg" alt="GitHub" width="30" height="30">
</a>
</td>
<td align="center">
<a href="https://tabularis.ai">
<img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/internetarchive.svg" alt="Website" width="30" height="30">
</a>
</td>
</tr>
</table>