whitecircle/circle-guard-bench

by whitecircle · Red Teaming & Safety · updated 3mo ago

First-of-its-kind AI benchmark for evaluating the protection capabilities of large language model (LLM) guard systems (guardrails and safeguards)

37 momentum · 70 stars · 5 forks · rank #178
Tags: ai, benchmark, benchmarking, guardrail, guardrails, jailbreak, large-language-model, large-language-models, llm, llm-as-a-judge, llm-eval, llm-evaluation