openai/SWELancer-Benchmark

by openai · Coding Eval · updated 11mo ago

This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?"

39 momentum · 1,439 stars · 137 forks · rank #165