android-bench/android-bench

by android-bench · Benchmarks · updated today

Android Bench is a framework for benchmarking Large Language Models (LLMs) on Android development tasks. It evaluates an AI model's ability to understand mobile codebases, generate accurate patches, and solve Android-specific engineering problems.

Momentum: 64 · Stars: 283 · Forks: 61 · Rank: #82