pinchbench/skill
by pinchbench · Agent Eval · updated 10d ago
PinchBench is a benchmarking system for evaluating LLMs as OpenClaw coding agents. Made with 🦀 by the humans at https://kilo.ai
68 momentum · 1,229 stars · 138 forks · rank #68