BlinkDL/RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it combines the best of RNN and transformer: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embedding.

Topics: Python, Cuda, C++, deep-learning, transformers, pytorch, transformer, lstm, rnn, gpt, language-model, attention-mechanism, gpt-2, gpt-3, linear-attention, rwkv, chatgpt
This is the stars and forks stats page for the BlinkDL/RWKV-LM repository. As of 27 Apr 2024, the repository has 9969 stars and 685 forks.

The RWKV Language Model (and my LM tricks)

If you are new to RWKV, it is best to find out more via the wiki first: https://wiki.rwkv.com/

RWKV: Parallelizable RNN with Transformer-level LLM Performance (pronounced "RwaKuv", from its 4 major params: R W K V). RWKV is an RNN with Transformer-level LLM performance, which can also be directly trained like a GPT transformer (parallelizable). And it's 100% attention-free: you only need the hidden state at position t to compute the state at position t+1.
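To make the "state at position t is enough for position t+1" claim concrete, here is a minimal NumPy sketch of an RWKV-style recurrent inference step. It is a simplified toy illustration of the linear-attention/WKV idea (randomly initialized weights, no token-shift, no bonus term, no numerical-stability tricks), not the repository's actual implementation; all names such as W_r, W_k, W_v, W_o and the decay vector w are hypothetical stand-ins for trained parameters.

```python
# Toy sketch of RWKV-style recurrent inference: a fixed-size state is carried
# from token to token, so no attention over the full history is needed.
# Simplified and randomly initialized -- NOT the repo's real WKV kernel.
import numpy as np

rng = np.random.default_rng(0)
d = 16                                   # toy embedding size

# Hypothetical projections standing in for trained R/K/V/output weights.
W_r, W_k, W_v, W_o = (rng.standard_normal((d, d)) * 0.1 for _ in range(4))
w = np.full(d, 0.5)                      # per-channel time decay

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def step(x_t, state):
    """One recurrent step: consume embedding x_t, update the (num, den) state."""
    num, den = state                     # running weighted sums over tokens < t
    r = sigmoid(W_r @ x_t)               # receptance gate
    k = np.exp(W_k @ x_t)                # exponentiated key
    v = W_v @ x_t                        # value
    num = np.exp(-w) * num + k * v       # decay old contributions, add the new one
    den = np.exp(-w) * den + k
    wkv = num / (den + 1e-8)             # linear-attention style weighted average
    y_t = W_o @ (r * wkv)                # gated output for this position
    return y_t, (num, den)

# "Infinite" context in principle: only the fixed-size state moves forward.
state = (np.zeros(d), np.zeros(d))
for t in range(5):                       # pretend stream of 5 token embeddings
    x_t = rng.standard_normal(d)
    y_t, state = step(x_t, state)
    print(t, float(y_t[0]))
```

The same time-mixing can be written as a parallel scan over all positions at once, which is why RWKV trains like a GPT while running inference like an RNN with constant memory per token.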