openai/evals

by openai · Benchmarks · updated 1mo ago

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Momentum: 74 · Stars: 18,685 · Forks: 2,988 · Rank: #39