Noeda/rllama

Rust+OpenCL+AVX2 implementation of LLaMA inference code

Languages: Rust, Dockerfile

This page shows stars and forks stats for the Noeda/rllama repository. As of 26 Apr 2024, the repository has 464 stars and 25 forks.

RLLaMA

RLLaMA is a pure Rust implementation of LLaMA large language model inference.

Supported features:
- Uses either f16 or f32 weights.
- LLaMA-7B, LLaMA-13B, LLaMA-30B, and LLaMA-65B all confirmed working.
- Hand-optimized AVX2 implementation.
- OpenCL support for GPU inference.
- Load the model only partially to the GPU with the --percentage-to-gpu command line switch to run hybrid GPU-CPU inference.
- Simple HTTP API support, with the possibility of doing token sampling on the client side (see the sketch after this list).
- Can load the Vicuna-13B instruct-finetuned model.
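To illustrate what client-side token sampling could look like, here is a minimal Rust sketch that applies temperature scaling to a logits vector and samples a token id from the resulting distribution. This is not rllama's actual HTTP API or sampler; the logit values, the temperature, and the tiny `Lcg` PRNG (used so the example has no external dependencies) are all illustrative assumptions.

```rust
// Toy linear congruential generator, standing in for a real RNG crate.
struct Lcg(u64);

impl Lcg {
    // Returns a pseudo-random f32 in [0, 1).
    fn next_f32(&mut self) -> f32 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Use the top 24 bits of the state to build a float in [0, 1).
        ((self.0 >> 40) as f32) / ((1u64 << 24) as f32)
    }
}

// Temperature-scaled softmax sampling over a logits vector.
// Assumes the server has returned raw logits and temperature > 0.
fn sample_token(logits: &[f32], temperature: f32, rng: &mut Lcg) -> usize {
    // Subtract the max logit for numerical stability before exponentiating.
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits
        .iter()
        .map(|&l| ((l - max) / temperature).exp())
        .collect();
    let sum: f32 = exps.iter().sum();

    // Draw a point in [0, sum) and walk the cumulative distribution.
    let mut r = rng.next_f32() * sum;
    for (i, &e) in exps.iter().enumerate() {
        r -= e;
        if r <= 0.0 {
            return i;
        }
    }
    exps.len() - 1
}

fn main() {
    // Hypothetical logits for a 4-token vocabulary.
    let logits = vec![1.0, 2.5, 0.3, 2.4];
    let mut rng = Lcg(42);
    let token = sample_token(&logits, 0.8, &mut rng);
    println!("sampled token id: {}", token);
}
```

Lower temperatures sharpen the distribution toward the highest-logit token, while higher temperatures flatten it; doing this step on the client is what makes server-agnostic sampling strategies possible.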
| Repo | Techs | Stars | Weekly | Forks | Weekly |
|---|---|---|---|---|---|
| facebook/buck2 | Rust, Starlark, Python | 2.9k | 0 | 151 | 0 |
| tui-rs-revival/ratatui | Rust, Shell | 3.6k | +59 | 128 | +3 |
| devmentors/Pacco | Shell, PowerShell, Dockerfile | 715 | 0 | 187 | 0 |
| logspace-ai/langflow | Python, TypeScript, JavaScript | 12.7k | 0 | 1.8k | 0 |
| ministryofjustice/hmpps-workload | PLpgSQL, Kotlin, Dockerfile | 0 | 0 | 0 | 0 |
| jina-ai/agentchain | Python, Shell, Dockerfile | 519 | 0 | 43 | 0 |
| gfreezy/seeker | Rust, Shell | 593 | 0 | 47 | 0 |
| epilys/gerb | Rust, CSS, SCSS | 287 | 0 | 7 | 0 |
| filecoin-project/ref-fvm | Rust, Solidity, Shell | 329 | 0 | 125 | 0 |
| KTruong008/aichatbestie | Svelte, TypeScript, JavaScript | 59 | 0 | 15 | 0 |