MD-Judge-v0.1 / README.md

Foreshhh

Update README.md

5abb0c1 verified over 1 year ago

preview code

raw

history blame

1.06 kB

metadata

license: apache-2.0
datasets:
  - lmsys/toxic-chat
  - PKU-Alignment/BeaverTails
  - lmsys/lmsys-chat-1m
language:
  - en
metrics:
  - f1
  - accuracy
tags:
  - ai-safety
  - safetyguard
  - safety
  - benchmark
  - mistral
  - salad-bench
  - evluation

MD-Judge for Salad-Bench

Model Details

MD-Judge is a LLM-based safetyguard, fine-tund on top of Mistral-7B. MD-Judge serves as a classifier to evaluate the safety of QA pairs.

MD-Judge was born to study the safety of different LLMs serving as an general evaluation tool, which is proposed under the SALAD-Bench paper

Developed by: The SALAD-Bench Team
Model type: An auto-regressive language model based on the transformer architecture.

Model Sources

Repository: SALAD-Bench Github
Dataset: Coming soon
Paper: Coming soon

Uses

Please refer to our Github for more using examples

Citation

BibTeX: