suyoumo/ClawProBench

by suyoumo · Agent Eval · updated 5d ago

ClawProBench is a live-first benchmark harness for evaluating LLM agents in the OpenClaw runtime with deterministic grading and repeated-trial reliability.
