PaperGuru-AI/PaperGuru-Benchmark
by PaperGuru-AI · Agent Eval · updated 4d ago
Lifecycle-Aware Memory for long-horizon LLM agents: 66.05% on PaperBench, 94.66% on SurveyBench, and 10 peer-reviewed acceptances at FSE/ICML/TOSEM/AEI/ICoGB
77 momentum · 970 stars · 139 forks · rank #29