An open-source benchmark for evaluating the safety of LLM-based agents operating in realistic,...

Tokens: 68,960
Snippets: 1,308
Trust Score: 5.3
License: MIT
Updated: 1 month ago