# Llama-Prompt-Guard-2-86M-onnx

This repository provides an ONNX-converted and quantized version of meta-llama/Llama-Prompt-Guard-2-86M.

## 🧠 Built With

## 📥 Evaluation Dataset

We use the jackhhao/jailbreak-classification dataset for evaluation.

## 🧪 Evaluation Results

| Model | Accuracy | Precision | Recall | F1 Score | AUC-ROC | Inference Time |
|---|---|---|---|---|---|---|
| Llama-Prompt-Guard-2-22M | 0.9569 | 0.9879 | 0.9260 | 0.9559 | 0.9259 | 33s |
| Llama-Prompt-Guard-2-22M-q | 0.9473 | 1.0000 | 0.8956 | 0.9449 | 0.9032 | 29s |
| Llama-Prompt-Guard-2-86M | 0.9770 | 0.9980 | 0.9564 | 0.9767 | 0.9523 | 1m29s |
| Llama-Prompt-Guard-2-86M-q | 0.8937 | 1.0000 | 0.7894 | 0.8823 | 1m15s |
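The accuracy, precision, recall, and F1 columns above follow the standard confusion-matrix definitions. As a plain-Python illustration with made-up counts (not taken from the evaluation above):

```python
# Hypothetical confusion-matrix counts for a binary jailbreak classifier:
# tp = attacks correctly flagged, fp = benign prompts wrongly flagged,
# fn = attacks missed, tn = benign prompts correctly passed.
tp, fp, fn, tn = 90, 2, 8, 100

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)                           # of flagged prompts, how many were attacks
recall = tp / (tp + fn)                              # of actual attacks, how many were caught
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean of the two
```

Note the pattern visible in the table: quantization pushes precision up (to 1.0000) while recall drops, i.e. the quantized models flag fewer benign prompts but miss more attacks.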

## 🤗 Usage

```python
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForSequenceClassification
import torch

# Load the quantized ONNX model and tokenizer using optimum
model = ORTModelForSequenceClassification.from_pretrained(
    "gravitee-io/Llama-Prompt-Guard-2-86M-onnx", file_name="model.quant.onnx"
)
tokenizer = AutoTokenizer.from_pretrained("gravitee-io/Llama-Prompt-Guard-2-86M-onnx")

# Tokenize input
text = "Your comment here"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)

# Run inference
outputs = model(**inputs)
logits = outputs.logits

# Optional: convert the two class logits to a probability distribution
probs = torch.softmax(logits, dim=-1)
print(probs)
```
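For a two-class classifier like Prompt Guard, a softmax over the logits yields a proper probability distribution, and the attack-class probability can then be thresholded into a block/allow decision. A minimal NumPy sketch with hypothetical logits and threshold (the `[benign, attack]` label order is an assumption; check `model.config.id2label` on the real model):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Example logits for two prompts (made-up values, [benign, attack] order assumed)
logits = np.array([[4.0, -3.0], [-2.0, 5.0]])
probs = softmax(logits)

# Flag a prompt when the attack-class probability exceeds a chosen threshold
is_attack = probs[:, 1] > 0.5
```

The threshold is a tuning knob: raising it trades recall for precision, which matters when false positives block legitimate user prompts.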

## 🐙 GitHub Repository

You can find the full source code, CLI tools, and evaluation scripts in the official GitHub repository.
