casper-hansen/AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.

Languages: C++, Python, Cuda, Other
As of 09 May, 2024 the casper-hansen/AutoAWQ repository has 256 stars and 25 forks.

AutoAWQ | Roadmap | Examples | Issues: Help Wanted

AutoAWQ is an easy-to-use package for 4-bit quantized models. It speeds up models by 2x while reducing memory requirements by 3x compared to FP16. AutoAWQ implements the Activation-aware Weight Quantization (AWQ) algorithm for quantizing LLMs, created and improved upon from the original work from MIT.

Latest News 🔥
[2023/10] Mistral (Fused Modules), Bigcode, Turing support, ...
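To illustrate the kind of 4-bit weight quantization described above, here is a minimal, library-free sketch of plain per-group round-to-nearest quantization. This is an assumption-laden simplification, not AutoAWQ's implementation: AWQ additionally rescales salient weight channels using activation statistics before quantizing, which is omitted here.

```python
def quantize_group(weights, bits=4):
    """Asymmetric round-to-nearest quantization of one weight group.

    Maps floats to integers in [0, 2**bits - 1] plus a per-group
    scale and zero point. Illustrative sketch only; AWQ itself also
    applies activation-aware per-channel scaling before this step.
    """
    qmax = (1 << bits) - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero = round(-lo / scale)  # integer offset so `lo` maps near code 0
    q = [max(0, min(qmax, round(w / scale) + zero)) for w in weights]
    return q, scale, zero


def dequantize_group(q, scale, zero):
    """Recover approximate float weights from the 4-bit codes."""
    return [(v - zero) * scale for v in q]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, scale, zero = quantize_group(weights)
approx = dequantize_group(q, scale, zero)
```

Each weight is stored in 4 bits instead of 16, which is where the roughly 3x end-to-end memory reduction cited above comes from once the per-group scale and zero-point overhead is included.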