---
library_name: transformers.js
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- transformers.js
- onnx
- biblical-search
- semantic-search
- embeddinggemma
- fine-tuned
license: apache-2.0
datasets:
- biblical-text-pairs
metrics:
- accuracy@1: 12.00%
- accuracy@3: 15.00%
- accuracy@10: 31.00%
language:
- en
---

# EmbeddingGemma-300M Fine-tuned for Biblical Text Search (ONNX)

This is the ONNX version of our fine-tuned EmbeddingGemma-300M model, specialized for biblical text search and retrieval and optimized for web deployment with transformers.js.

## Model Performance

- **Accuracy@1**: 12.00% (roughly 13x improvement over the base model)
- **Accuracy@3**: 15.00%
- **Accuracy@10**: 31.00%
- **Base Model Accuracy@1**: 0.91%
- **Training Steps**: 25 (early stopping point)

## Usage with Transformers.js

```javascript
import { pipeline } from '@huggingface/transformers';

// Create a feature-extraction pipeline for the fine-tuned model
const extractor = await pipeline(
  'feature-extraction',
  'dpshade22/embeddinggemma-scripture-v1-onnx'
);

// Encode a query (use the search_query: prefix)
const query = 'search_query: What is love?';
const queryEmbedding = await extractor(query, { pooling: 'mean', normalize: true });

// Encode a document (use the search_document: prefix)
const doc = 'search_document: Love is patient and kind';
const docEmbedding = await extractor(doc, { pooling: 'mean', normalize: true });
```

## Prefixes

For optimal performance, use these prefixes:

- **Queries**: `"search_query: your question here"`
- **Documents**: `"search_document: scripture text here"`

End-to-end ranking and Matryoshka truncation examples are included at the end of this card.

## Model Details

- **Base Model**: `google/embeddinggemma-300m`
- **Output Dimensions**: 768D (supports Matryoshka truncation to 384D and 128D)

## Training Details

- **Training Data**: 26,276 biblical text pairs
- **Training Steps**: 25 (early stopping to prevent overfitting)
- **Learning Rate**: 2.0e-04
- **Batch Size**: 8

## Intended Use

This model is designed for:

- Biblical text search and retrieval in web applications
- Finding relevant scripture passages
- Semantic similarity of religious texts
- Question answering on biblical topics
- Offline PWA applications using transformers.js

## Conversion Details

- **Converted with**: the nixiesearch/onnx-convert tool
- **ONNX Opset**: 17
- **Optimization Level**: 1
- **Max difference from original**: 1.9e-05 (within acceptable tolerance)

## Related Models

- **Original PyTorch version**: dpshade22/embeddinggemma-scripture-v1
- **Base model**: google/embeddinggemma-300m
- **Reference ONNX**: onnx-community/embeddinggemma-300m-ONNX
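
## Example: Ranking Passages by Similarity

To turn the embeddings into a search, embed a set of passages once and rank them against a query embedding. This is a minimal sketch assuming the same feature-extraction pipeline as in the usage example above; the passages and the `dot` helper are illustrative and not part of transformers.js. Because the embeddings are L2-normalized, a plain dot product equals cosine similarity.

```javascript
import { pipeline } from '@huggingface/transformers';

const extractor = await pipeline(
  'feature-extraction',
  'dpshade22/embeddinggemma-scripture-v1-onnx'
);

// Illustrative passages; documents use the search_document: prefix
const passages = [
  'search_document: Love is patient and kind',
  'search_document: The Lord is my shepherd; I shall not want',
];

// Embed all documents once; with normalize: true the vectors are unit length
const docEmbeddings = (
  await extractor(passages, { pooling: 'mean', normalize: true })
).tolist();

// Embed the query with the search_query: prefix
const [queryEmbedding] = (
  await extractor('search_query: What is love?', { pooling: 'mean', normalize: true })
).tolist();

// Small illustrative helper (not part of transformers.js)
const dot = (a, b) => a.reduce((sum, v, i) => sum + v * b[i], 0);

// Rank passages by similarity to the query
const ranked = passages
  .map((text, i) => ({ text, score: dot(queryEmbedding, docEmbeddings[i]) }))
  .sort((a, b) => b.score - a.score);

console.log(ranked);
```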
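
## Example: Matryoshka Truncation

The model outputs 768D embeddings that can be truncated Matryoshka-style to 384D or 128D, typically trading a little accuracy for a much smaller index. This is a minimal sketch; `truncateAndNormalize` is an illustrative helper defined here, not a transformers.js API.

```javascript
import { pipeline } from '@huggingface/transformers';

// Illustrative helper (not part of transformers.js): keep the first k
// dimensions of an embedding and re-normalize it to unit length
function truncateAndNormalize(embedding, k) {
  const truncated = embedding.slice(0, k);
  const norm = Math.sqrt(truncated.reduce((sum, v) => sum + v * v, 0));
  return truncated.map((v) => v / norm);
}

const extractor = await pipeline(
  'feature-extraction',
  'dpshade22/embeddinggemma-scripture-v1-onnx'
);

const [embedding768] = (
  await extractor('search_query: What is love?', { pooling: 'mean', normalize: true })
).tolist();

// Shrink to 384D (or 128D) before indexing to save memory and storage
const embedding384 = truncateAndNormalize(embedding768, 384);
console.log(embedding384.length); // 384
```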