HumphreySun98/repoagentbench

by HumphreySun98 · Agent Eval · updated 1mo ago

SWE-bench for your codebase: mine your merged PRs into local, contamination-free coding-agent benchmarks. Adapters: claude-code, aider (Opus 4.7 / GPT-5.5 / Sonnet 4.6 / Gemini 3.1 Pro).

Rank: #133

agent-evals, ai-agents, aider, benchmark, claude-opus-4-7, coding-agents, developer-tools, gemini-3-1-pro, gpt-5-5, llm-eval, swe-bench

