ByteDance-Seed/EvaLearn

by ByteDance-Seed · Benchmarks · updated 1mo ago

EvaLearn is a benchmark that evaluates large language models (LLMs) on their learning capability and learning efficiency in challenging tasks.

Momentum: 59 · Stars: 430 · Forks: 12 · Rank: #103
Tags: dynamic-evaluation, evaluation, llms