harbor-framework/terminal-bench
by harbor-framework · Benchmarks · updated 4mo ago
A benchmark for LLMs on complicated tasks in the terminal
momentum: 48 · stars: 2,353 · forks: 541 · rank: #136