ByteDance-Seed/EvaLearn
by ByteDance-Seed · Benchmarks · updated 1mo ago
EvaLearn is a pioneering benchmark designed to evaluate large language models (LLMs) on their learning capability and efficiency in challenging tasks.
Momentum: 59 · Stars: 430 · Forks: 12 · Rank: #103
Tags: dynamic-evaluation · evaluation · llms