SqueezeAILab/SqueezeLLM

SqueezeLLM: Dense-and-Sparse Quantization

Languages: Python, Cuda, C++
Topics: natural-language-processing, text-generation, transformer, llama, quantization, model-compression, efficient-inference, post-training-quantization, large-language-models, llm, small-models, localllm
As of 3 May 2024, the SqueezeAILab/SqueezeLLM repository has 397 stars and 25 forks.

[Paper]

SqueezeLLM is a post-training quantization framework that introduces a new method, Dense-and-Sparse Quantization, to enable efficient LLM serving.

TL;DR: Deploying LLMs is difficult because of their large memory footprint. This can be addressed with reduced-precision quantization, but naive quantization hurts model performance. SqueezeLLM addresses this with Dense-and-Sparse Quantization, which splits each weight matrix into two components: a dense component that can be heavily quantized without affecting model performance, and a sparse component that preserves the sensitive and outlier values of the weight matrix.
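To make the decomposition concrete, here is a minimal NumPy sketch of the dense-and-sparse idea: a small fraction of large-magnitude outlier weights is kept in full precision as a sparse matrix, while the remaining dense component is quantized. Note this is an illustrative simplification, not the repository's implementation: the function name, the percentile-based outlier threshold, and the uniform quantizer are assumptions (SqueezeLLM itself uses sensitivity-based non-uniform quantization with k-means centroids).

```python
import numpy as np

def dense_sparse_split(W, outlier_pct=0.5, bits=4):
    """Illustrative sketch: split W into a quantized dense part plus a
    full-precision sparse part holding the top `outlier_pct`% of |W|."""
    # Threshold at the (100 - outlier_pct) percentile of absolute values.
    thresh = np.percentile(np.abs(W), 100 - outlier_pct)
    sparse = np.where(np.abs(W) >= thresh, W, 0.0)  # outliers, full precision
    dense = W - sparse                               # remainder to quantize

    # Simple uniform quantization of the dense part (stand-in for the
    # paper's non-uniform, sensitivity-weighted k-means quantizer).
    lo, hi = dense.min(), dense.max()
    scale = (hi - lo) / (2**bits - 1)
    codes = np.round((dense - lo) / scale)           # integer codebook indices
    dense_deq = codes * scale + lo                   # dequantized dense part
    return dense_deq, sparse

# Reconstruction is the quantized dense part plus the exact sparse outliers:
# W_hat = dense_deq + sparse
```

Because the outliers are removed before quantization, the dense part has a much narrower value range, so the same bit budget yields finer quantization steps and lower reconstruction error than quantizing the full matrix naively.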