multinear/multinear
by multinear · Benchmarks · updated 9mo ago
Develop reliable AI apps
Momentum: 20 · Stars: 45 · Forks: 1 · Rank: #249
Tags: evaluation · llm · llm-eval · llm-evaluation · llm-evaluation-framework · llms · llms-benchmarking · reliability