Khasi-English Semantic Search Model

The first production-ready semantic search model for the Khasi-English language pair.

Overview

This model enables semantic search between English and Khasi, supporting the linguistic diversity of Northeast India. It was trained on 66,794 English-Khasi translation pairs.

Use Cases

  • Cross-lingual semantic search (English ↔ Khasi)
  • Document similarity in bilingual contexts
  • Cultural content discovery for Northeast India
  • Educational language learning tools
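As a rough illustration of the cross-lingual search use case, the sketch below ranks documents by cosine similarity to a query vector. It is not the model's own retrieval code: the function names `cosine_similarity` and `search` are hypothetical, and the 3-dimensional vectors are toy stand-ins for the 384-dimensional embeddings that `model.encode` would return.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, corpus, top_k=3):
    """Rank (text, vector) entries by similarity to query_vec, best first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in corpus]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy vectors (assumption): real entries would be embeddings of Khasi or
# English documents produced by the model.
corpus = [
    ("document A", [0.9, 0.1, 0.0]),
    ("document B", [0.0, 1.0, 0.0]),
    ("document C", [0.7, 0.7, 0.1]),
]
query = [1.0, 0.0, 0.0]

results = search(query, corpus, top_k=2)
for text, score in results:
    print(f"{text}: {score:.3f}")
```

Because queries and documents share one embedding space, the same ranking step applies whether the query is in English and the documents are in Khasi, or the other way around.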

Performance

  • English-Khasi similarity: 0.69-0.74
  • Model size: ~90MB (lightweight deployment)
  • 384-dimensional embeddings
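The size figures can be sanity-checked with simple arithmetic. The sketch below assumes 4 bytes per value (32-bit floats) and uses the 22.7M parameter count reported for this model; the 100,000-document index is a hypothetical example, and the helper names are illustrative only.

```python
BYTES_PER_FLOAT32 = 4


def weights_size_mb(num_params, bytes_per_param=BYTES_PER_FLOAT32):
    """Approximate size of the model weights in megabytes (10^6 bytes)."""
    return num_params * bytes_per_param / 1e6


def index_size_mb(num_vectors, dim=384, bytes_per_value=BYTES_PER_FLOAT32):
    """Approximate size of a dense embedding index in megabytes."""
    return num_vectors * dim * bytes_per_value / 1e6


# 22.7M parameters stored as 32-bit floats: consistent with the ~90MB figure.
print(weights_size_mb(22_700_000))

# Hypothetical corpus of 100,000 documents, one 384-dimensional vector each.
print(index_size_mb(100_000))
```

Each embedding occupies 384 × 4 = 1,536 bytes, so storage for an uncompressed index grows linearly with the number of documents.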

Quick Start

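from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub (downloaded on first use)
model = SentenceTransformer('MWirelabs/khasi-english-semantic-search')
sentences = ['Hello', 'hangne', 'Good morning']
# Encode each sentence into a 384-dimensional embedding
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 384)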
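To turn embeddings like those in the Quick Start snippet into similarity scores, one option is cosine similarity via `sentence_transformers.util.cos_sim`. The following is a minimal sketch, not an official usage pattern for this model: the helper names `format_pairs` and `demo` are hypothetical, and the library import is deferred into `demo()` so the pairing helper can be used without downloading the model. No scores are shown because they depend on the model weights.

```python
from itertools import combinations


def format_pairs(sentences, score_matrix):
    """Return (sentence_a, sentence_b, score) for every unordered pair,
    sorted from highest to lowest score."""
    pairs = [
        (sentences[i], sentences[j], float(score_matrix[i][j]))
        for i, j in combinations(range(len(sentences)), 2)
    ]
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


def demo():
    # Imported here so format_pairs works without the library installed.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer('MWirelabs/khasi-english-semantic-search')
    sentences = ['Hello', 'hangne', 'Good morning']
    embeddings = model.encode(sentences, convert_to_tensor=True)

    # 3 x 3 matrix of pairwise cosine similarities
    scores = util.cos_sim(embeddings, embeddings)
    for a, b, score in format_pairs(sentences, scores):
        print(f"{a!r} vs {b!r}: {score:.2f}")
```

Calling `demo()` downloads the model on first use and prints each sentence pair with its score, highest first.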

Developed by MWirelabs for Northeast India AI innovation.

Model format: Safetensors, 22.7M parameters, F32 tensors.