ai-twinkle/Eval

by ai-twinkle · Reasoning · updated 2mo ago

High-performance LLM evaluation framework with parallel API calls — up to 17× faster than sequential tools. Supports box, math, and logit-based evaluation.

momentum: 45 · stars: 96 · forks: 17 · rank: #146
eval · evaluation · llm