PlanBench is an extensible benchmark for evaluating large language models on planning and reasoning...

Tokens: 88,574
Snippets: 1,313
Trust Score: 8.8
License: MIT
Update: 1 month ago