OpenGenerativeAI/llm-colosseum

by OpenGenerativeAI · Benchmarks · updated 1y ago

Benchmark LLMs by having them fight in Street Fighter 3! The new way to evaluate the quality of an LLM.

momentum

1,483 stars · 180 forks · rank #164

benchmark · genai · llm · streetfighter · ai
