openai/SWELancer-Benchmark

by openai · Coding Eval · updated 11mo ago

This repo contains the dataset and code for the paper "SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?"

1,439 stars · 137 forks · rank #165
