PawBench is a model and harness co-evaluation benchmark for agentic AI, providing 150 agent tasks, 9...

Tokens: 605,116
Snippets: 5,201
Trust Score: 6.6
License: Apache-2.0
Update: 2 weeks ago