AutoAWQ is an easy-to-use package for 4-bit quantized models that speeds up LLM inference by 3x and...

Tokens: 18,476
Snippets: 95
Trust Score: 8.7
License: MIT
Update: 1 month ago