A unified framework for evaluating generative language models across numerous academic benchmarks,...

Tokens: 167,438
Snippets: 1,993
Trust Score: 9.6
License: MIT
Update: 1 month ago