BlinkDL/RWKV-CUDA - stats on ReviewGithub

Cuda Python C++

These are the star and fork stats for the BlinkDL/RWKV-CUDA repository. As of 03 May 2024, this repository has 135 stars and 27 forks.

RWKV-CUDA

The CUDA version of the RWKV language model ( https://github.com/BlinkDL/RWKV-LM )

Towards RWKV-4 (see the wkv folder)

I have a basic RWKV-4 kernel in the wkv folder. Let's optimize it.

Experiment 1 - depthwise_conv1d - 20x faster than pytorch

The formula:

w.shape = (C, T)
k.shape = (B, C, T)
out.shape = (B, C, T)
out[b][c][t] = sum_u{ w[c][(T-1)-(t-u)] * k[b][c][u] }

pytorch = fwd 94ms bwd 529ms
CUDA kernel v0 = fwd 45ms bwd 84ms (simple)
CUDA kernel v1 = fwd 17ms bwd 43ms (shared memory)
CUDA...
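To make the depthwise_conv1d formula above concrete, here is a minimal NumPy sketch of a naive reference implementation. It is not the repository's kernel or its PyTorch baseline. The function name `depthwise_conv1d_reference` is hypothetical, and the range of `u` (0 through `t`) is an assumption inferred from the requirement that the index `(T-1)-(t-u)` stay inside `w`.

```python
import numpy as np

def depthwise_conv1d_reference(w, k):
    """Naive reference for out[b][c][t] = sum_u w[c][(T-1)-(t-u)] * k[b][c][u].

    w has shape (C, T); k and the result have shape (B, C, T).
    """
    B, C, T = k.shape
    assert w.shape == (C, T), "w must have shape (C, T)"
    out = np.zeros((B, C, T), dtype=np.float64)
    for b in range(B):
        for c in range(C):
            for t in range(T):
                # Assumption: u runs over 0..t, which keeps the index
                # (T-1)-(t-u) within [0, T-1] (a causal sum).
                for u in range(t + 1):
                    out[b, c, t] += w[c, (T - 1) - (t - u)] * k[b, c, u]
    return out

# Tiny example: B = 1, C = 1, T = 3
w = np.array([[1.0, 2.0, 3.0]])
k = np.array([[[1.0, 1.0, 1.0]]])
out = depthwise_conv1d_reference(w, k)
```

A quadruple loop like this is only useful for checking the output of a faster kernel on small inputs. Under the same assumption, each channel is an ordinary causal convolution of `k[b][c]` with `w[c]` reversed along the time axis.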


repo	techs	stars	stars (weekly)	forks	forks (weekly)
dbsystel/jl23-rp2040	Groovy, Shell, CSS	4	0	1	0
michigan-traffic-lab/Dense-Deep-Reinforcement-Learning	Jupyter Notebook, Python	237	0	37	0
hikariming/alpaca_chinese_dataset	Jupyter Notebook, Python	970	0	78	0
Mitek-Systems/MiSnap-iOS	C, Objective-C, Swift	9	0	6	0
openai/chatgpt-retrieval-plugin	Python, Other	19.8k	0	3.6k	0
cisagov/untitledgoosetool	Python, PowerShell	839	0	69	0
gururise/AlpacaDataCleaned	Python, HTML, JavaScript	1.3k	0	133	0
binary-husky/chatgpt_academic	Python, CSS, Other	42.9k	+437	5.6k	+41
sahil280114/codealpaca	Python	1.3k	0	96	0
feizc/MLE-LLaMA	Python	292	0	19	0