ellydee/acceptance-bench
by ellydee · Eval Frameworks · updated 2mo ago
A robust LLM evaluation framework measuring acceptance vs refusal across difficulty levels. Features multi-prompt variation testing, temperature sweeping, and LLM-as-judge evaluation.
Momentum: 44 · Stars: 73 · Forks: 1 · Rank: #149