---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- rag
base_model: sentence-transformers/all-MiniLM-L6-v2
pipeline_tag: sentence-similarity
library_name: sentence-transformers
license: apache-2.0
language:
- en
---

# The Fastest Text Embedding Model: tabularisai/all-MiniLM-L2-v2

This model is distilled from [sentence-transformers/all-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2), delivering **almost 2 times faster inference** than the smallest all-MiniLM-L6-v2 model, while maintaining strong accuracy on CPU and GPU.

## Usage

### Retrieval-Augmented Generation (RAG) Example

Use this model as a retriever in a RAG pipeline:

```python
# Requires FAISS in addition to sentence-transformers, e.g. `pip install faiss-cpu`
from sentence_transformers import SentenceTransformer
import faiss

# Load embedding model
model = SentenceTransformer("tabularisai/all-MiniLM-L2-v2")

# Your 5 simple documents
documents = [
    "Renewable energy comes from natural sources.",
    "Solar panels convert sunlight into electricity.",
    "Wind turbines harness wind power.",
    "Fossil fuels are non-renewable sources of energy.",
    "Hydropower uses water to generate electricity."
]

# Embed documents
doc_embeddings = model.encode(documents, convert_to_numpy=True)

# Create FAISS index (exact search over L2 distance)
dim = doc_embeddings.shape[1]
index = faiss.IndexFlatL2(dim)
index.add(doc_embeddings)

# Query
query = "What are the benefits of renewable energy?"
query_embedding = model.encode([query], convert_to_numpy=True)

# Search top 3 similar docs
# D holds squared L2 distances (lower means more similar), I holds document indices
D, I = index.search(query_embedding, k=3)

# Print results
print("Query:", query)
print("\nTop 3 similar documents:")
for rank, idx in enumerate(I[0]):
    print(f"{rank+1}. {documents[idx]} (squared L2 distance: {D[0][rank]:.4f})")
```

### Sentence Embedding Example

Install the library:

```bash
pip install -U sentence-transformers
```

Load the model and encode sentences:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("tabularisai/all-MiniLM-L2-v2")

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]

embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
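
### Interpreting FAISS Distances

The values in `D` from the RAG example come from `faiss.IndexFlatL2`, which returns squared L2 distances, so a smaller value means a closer match. If the embeddings are unit-length (an assumption here; this card does not state whether the model normalizes its output), a squared distance `d2` maps to cosine similarity as `1 - d2 / 2`. The sketch below illustrates that relationship with plain Python and toy 2-D vectors rather than real model output; the helper names `l2_normalize`, `squared_l2`, and `cosine_from_squared_l2` are hypothetical and not part of any library.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (hypothetical helper for illustration)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def squared_l2(a, b):
    """Squared Euclidean distance, the quantity IndexFlatL2 reports."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_from_squared_l2(d2):
    """Convert a squared L2 distance between unit vectors to cosine similarity.

    Valid only when both vectors have length 1 (assumption).
    """
    return 1.0 - d2 / 2.0


# Toy vectors standing in for embeddings
a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])

d2 = squared_l2(a, b)
print(round(cosine_from_squared_l2(d2), 4))  # prints 0.96
```

If the embeddings turn out not to be unit-length, one option is to normalize them before indexing, for example by passing `normalize_embeddings=True` to `model.encode`.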