PotatoSpudowski/fastLLaMa

fastLLaMa: An experimental high-performance framework for running Decoder-only LLMs with 4-bit quantization in Python using a C/C++ backend.

Stars and forks stats for the PotatoSpudowski/fastLLaMa repository: as of 26 Apr 2024 it has 389 stars and 24 forks.

fastLLaMa is an experimental high-performance framework designed to tackle the challenges of deploying large language models (LLMs) in production environments. It offers a user-friendly Python interface to the C++ library llama.cpp, enabling developers to create custom workflows, implement adaptable logging, and seamlessly switch contexts between sessions. The framework aims to improve the efficiency of operating LLMs at scale, with ongoing development focused on...
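The 4-bit quantization mentioned above can be illustrated with a minimal sketch of block-wise symmetric quantization, similar in spirit to the Q4_0 scheme used by llama.cpp. This is not fastLLaMa's actual code; the function names and the block size of 32 are assumptions for illustration.

```python
# Illustrative sketch of block-wise symmetric 4-bit quantization
# (hypothetical helper names; not fastLLaMa's real API).

def quantize_q4(values, block_size=32):
    """Quantize floats to signed 4-bit ints (-8..7), one scale per block."""
    blocks = []
    for i in range(0, len(values), block_size):
        block = values[i:i + block_size]
        amax = max(abs(v) for v in block)        # largest magnitude in block
        scale = amax / 7.0 if amax else 1.0      # map amax to the 4-bit max
        q = [max(-8, min(7, round(v / scale))) for v in block]
        blocks.append((scale, q))
    return blocks

def dequantize_q4(blocks):
    """Reconstruct approximate floats from (scale, quantized) blocks."""
    out = []
    for scale, q in blocks:
        out.extend(v * scale for v in q)
    return out
```

Storing one float scale plus 32 4-bit values per block is what shrinks weights to roughly a quarter of their fp16 size, at the cost of a bounded per-weight rounding error of about half a scale step.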