
huggingface/evaluation-guidebook

by huggingface · Benchmarks · updated 6mo ago

Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!

41 momentum · 2,117 stars · 124 forks · rank #158
evaluation · evaluation-metrics · guidebook · large-language-models · llm · machine-learning · tutorial