tairov/llama2.mojo

Inference Llama 2 in one file of pure 🔥

Topics: Python, Dockerfile, performance, modular, mojo, inference, simd, llama, tensor, vectorization, parallelize, transformer-architecture, llama2
As of 02 May 2024, this repository has 1,230 stars and 75 forks.

llama2.🔥

Have you ever wanted to inference a baby Llama 2 model in pure Mojo? No? Well, now you can!

Supported version: Mojo 0.4.0

With the release of Mojo, I was inspired to take my Python port of llama2.py and transition it to Mojo. The result? A version that leverages Mojo's SIMD & vectorization primitives, boosting the Python performance by nearly 250x. Impressively, the Mojo version now outperforms the original llama2.c compiled in runfast mode out of the box by 15-20%. This showcases...
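To give a feel for where that speedup comes from: the hot loop in llama2-style inference is dominated by matrix-vector products, and this is exactly the kind of loop that Mojo's SIMD and vectorize primitives accelerate. The sketch below is a hypothetical illustration in Python/NumPy (not code from this repository), contrasting a scalar multiply-add loop with a vectorized equivalent; the function names are made up for the example.

```python
import numpy as np

def matvec_naive(w, x):
    # Scalar loop: one multiply-add per iteration, the shape of the
    # hot loop a pure-Python llama2.py port spends most of its time in.
    n, d = w.shape
    out = [0.0] * n
    for i in range(n):
        acc = 0.0
        for j in range(d):
            acc += w[i][j] * x[j]
        out[i] = acc
    return out

def matvec_vectorized(w, x):
    # Whole rows processed at once; NumPy dispatches to SIMD-capable
    # BLAS kernels, analogous in spirit to Mojo's vectorized version.
    return w @ x

w = np.arange(6, dtype=np.float64).reshape(2, 3)
x = np.array([1.0, 2.0, 3.0])
assert np.allclose(matvec_naive(w, x), matvec_vectorized(w, x))
```

The two functions compute the same result; the difference is that the vectorized form processes many elements per instruction instead of one, which is where most of the reported speedup over the scalar Python loop comes from.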