openai/SWELancer-Benchmark
by openai · Coding Eval · updated 11mo ago
This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?"
39 momentum · 1,439 stars · 137 forks · rank #165