khoj-ai/llm-coup
by khoj-ai · Benchmarks · updated 9mo ago
Let LLMs play Coup against each other and see which is best at deception and strategy
14 momentum · 13 stars · 4 forks · rank #263
Tags: ai · artificial-intelligence · coup · deception · environment · games · llm-benchmarking · llm-eval · llm-evaluation · llms · simulation