
AGI-Eval-Official/CATArena

by AGI-Eval-Official · Agent Eval · updated 5mo ago

CATArena is an engineering-level tournament evaluation platform for LLM-driven code agents, built on an iterative competitive peer-learning framework.

24 momentum · 67 stars · 10 forks · rank #235