AGI-Eval-Official/CATArena
by AGI-Eval-Official · Agent Eval · updated 5mo ago
CATArena is an engineering-level tournament evaluation platform for Large Language Model (LLM)-driven code agents, based on an iterative competitive peer-learning framework.
24 momentum · 67 stars · 10 forks · rank #235