vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Languages: Python, Cuda, Other
Topics: inference, pytorch, transformer, gpt, model-serving, mlops, llm, llmops, llm-serving

Stars and forks stats for the vllm-project/vllm repository. As of 28 Apr 2024, it has 8,165 stars and 904 forks.

Easy, fast, and cheap LLM serving for everyone | Documentation | Blog | Paper | Discord

Latest News 🔥

- [2023/10] We hosted the first vLLM meetup in SF! Please find the meetup slides here.
- [2023/09] We created our Discord server! Join us to discuss vLLM and LLM serving! We will also post the latest announcements and updates there.
- [2023/09] We released our PagedAttention paper on arXiv!
- [2023/08] We would like to express our sincere gratitude to Andreessen Horowitz (a16z) for providing a generous...
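To make the "easy, fast, and cheap LLM serving" claim concrete, below is a minimal sketch of offline batched inference with vLLM's Python API, following the pattern from its quickstart documentation. The model name `facebook/opt-125m` and the sampling values are illustrative placeholders; any model supported by vLLM can be substituted.

```python
# Minimal vLLM offline inference sketch (assumes `pip install vllm` and a CUDA GPU).
from vllm import LLM, SamplingParams

# A small batch of prompts to generate completions for.
prompts = [
    "Hello, my name is",
    "The capital of France is",
]

# Illustrative sampling settings; tune for your use case.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Load a model; facebook/opt-125m is just a small example model.
llm = LLM(model="facebook/opt-125m")

# Generate completions for the whole batch in one call.
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(f"Prompt: {output.prompt!r}")
    print(f"Completion: {output.outputs[0].text!r}")
```

vLLM batches these requests internally and manages the KV cache with PagedAttention (see the paper linked above), which is where its throughput and memory-efficiency gains come from.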
| Repo | Techs | Stars | Stars (weekly) | Forks | Forks (weekly) |
| --- | --- | --- | --- | --- | --- |
| ur-whitelab/chemcrow-public | Python | 258 | 0 | 30 | 0 |
| zjunlp/DeepKE | Python, Jupyter Notebook, Other | 2.4k | 0 | 565 | 0 |
| dorant/RoboStuff | Other | 0 | 0 | 0 | 0 |
| tauri-apps/tauri-mobile | Rust, Handlebars, Other | 1.2k | +11 | 52 | +2 |
| sudoswap/lssvm2 | Solidity, Python | 34 | 0 | 10 | 0 |
| GalxeHQ/galxe-contracts | Solidity, TypeScript, Other | 92 | 0 | 87 | 0 |
| tokenInsight/coin-tools | Solidity, JavaScript, Ruby | 15 | 0 | 7 | 0 |
| Cute-Dress/Dress | Standard ML, Ruby, Python | 190 | +15 | 50 | +4 |
| pulp-platform/snitch_cluster | SystemVerilog, C, Python | 90 | 0 | 9 | 0 |
| slaclab/axi-soc-ultra-plus-core | Tcl, VHDL, Python | 20 | 0 | 0 | 0 |