
pinchbench/skill

by pinchbench · Agent Eval · updated 10d ago

PinchBench is a benchmarking system for evaluating LLMs as OpenClaw coding agents. Made with 🦀 by the humans at https://kilo.ai

Momentum 68 · Stars 1,229 · Forks 138 · Rank #68