turboderp/exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs

Languages: Python, Cuda, C++, C
Stars and forks stats for the turboderp/exllamav2 repository: as of 7 May 2024, it has 1,350 stars and 74 forks.

ExLlamaV2

This is a very initial release of ExLlamaV2, an inference library for running local LLMs on modern consumer GPUs. It still needs a lot of testing and tuning, and a few key features are not yet implemented. Don't be surprised if things are a bit broken to start with, as almost all of this code is completely new and only tested on a few setups so far.

Overview of differences compared to V1:
- Faster, better kernels
- Cleaner and more versatile codebase
- Support for a new quant format (see below)

Performance
Some...
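The excerpt above is cut off before any usage details, so as a minimal sketch, the snippet below shows how a quantized model might be loaded and queried through ExLlamaV2's Python classes. The names used (ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2BaseGenerator, generate_simple, etc.) follow the library's later documented examples and may not match this very initial release exactly; the model directory and sampling values are placeholders.

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Point the config at a directory containing a quantized model
# (hypothetical path -- replace with a real model directory).
config = ExLlamaV2Config()
config.model_dir = "/path/to/model"
config.prepare()

# Build the model and a KV cache, then load the weights,
# splitting across available GPUs as needed.
model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

# Basic sampling settings; the values here are illustrative.
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.85
settings.top_p = 0.8

print(generator.generate_simple("Once upon a time,", settings, num_tokens=100))
```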