NVIDIA/cutlass

CUDA Templates for Linear Algebra Subroutines

Languages: C++, Cuda, Python, HTML, CMake, C
Topics: deep-learning, cpp, gpu, cuda, nvidia, deep-learning-library

These are the star and fork stats for the NVIDIA/cutlass repository. As of 24 Apr 2024, this repository has 3375 stars and 643 forks.

CUTLASS 3.2 - August 2023

CUTLASS is a collection of CUDA C++ template abstractions for implementing high-performance matrix-matrix multiplication (GEMM) and related computations at all levels and scales within CUDA. It incorporates strategies for hierarchical decomposition and data movement similar to those used to implement cuBLAS and cuDNN. CUTLASS decomposes these "moving parts" into reusable, modular software components abstracted by C++ template classes. Primitives for different...
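
To make the idea of hierarchical decomposition concrete, here is a minimal CPU-side sketch in plain C++. It is not CUTLASS code and uses none of the CUTLASS API: the names `tiled_gemm` and `naive_gemm` and the template parameters `TileM`, `TileN`, `TileK` are hypothetical, chosen only for illustration. It shows the general pattern of splitting a GEMM into tiles whose sizes are fixed at compile time through template parameters, assuming row-major storage.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration, not CUTLASS code.
// Computes C = A * B for row-major A (M x K), B (K x N), C (M x N),
// walking the problem tile by tile. Tile sizes are template parameters,
// loosely mirroring how tile shapes are compile-time choices in
// template-based GEMM libraries.
template <std::size_t TileM, std::size_t TileN, std::size_t TileK>
void tiled_gemm(std::size_t M, std::size_t N, std::size_t K,
                const std::vector<float>& A,
                const std::vector<float>& B,
                std::vector<float>& C) {
  assert(A.size() == M * K && B.size() == K * N && C.size() == M * N);
  std::fill(C.begin(), C.end(), 0.0f);

  // Outer level: iterate over tiles of the output and of the K dimension.
  for (std::size_t m0 = 0; m0 < M; m0 += TileM) {
    for (std::size_t n0 = 0; n0 < N; n0 += TileN) {
      for (std::size_t k0 = 0; k0 < K; k0 += TileK) {
        // Clamp tile bounds so partial tiles at the edges are handled.
        const std::size_t m1 = std::min(m0 + TileM, M);
        const std::size_t n1 = std::min(n0 + TileN, N);
        const std::size_t k1 = std::min(k0 + TileK, K);

        // Inner level: multiply-accumulate within one tile.
        for (std::size_t m = m0; m < m1; ++m) {
          for (std::size_t k = k0; k < k1; ++k) {
            const float a = A[m * K + k];
            for (std::size_t n = n0; n < n1; ++n) {
              C[m * N + n] += a * B[k * N + n];
            }
          }
        }
      }
    }
  }
}

// Untiled reference implementation used to check the tiled version.
inline void naive_gemm(std::size_t M, std::size_t N, std::size_t K,
                       const std::vector<float>& A,
                       const std::vector<float>& B,
                       std::vector<float>& C) {
  assert(A.size() == M * K && B.size() == K * N && C.size() == M * N);
  for (std::size_t m = 0; m < M; ++m) {
    for (std::size_t n = 0; n < N; ++n) {
      float acc = 0.0f;
      for (std::size_t k = 0; k < K; ++k) {
        acc += A[m * K + k] * B[k * N + n];
      }
      C[m * N + n] = acc;
    }
  }
}
```

A call such as `tiled_gemm<2, 4, 2>(M, N, K, A, B, C)` selects the tile shape at compile time. On a GPU the same decomposition would typically be repeated at several levels (thread block, warp, thread) with explicit staging of data through faster memory, which this sketch does not attempt to model.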
repo                          techs                     stars  weekly  forks  weekly
gaoxiang12/slambook2          C++, CMake, TeX           4.5k   0       1.8k   0
opencv/opencv_contrib         C++, Cuda, Python         8.7k   0       5.7k   0
0vercl0k/CVE-2022-28281       HTML                      75     0       13     0
hak5/usbrubberducky-payloads  PowerShell, Java, Python  2.8k   +22     1.1k   +1
apache/avro                   Java, C#, C               2.6k   0       1.5k   0
ChmaraX/forensix              JavaScript, CSS, HTML     92     0       21     0
CesiumGS/cesium               JavaScript, HTML, GLSL    11.1k  0       3.3k   0
google/docsy                  HTML, JavaScript, SCSS    2.3k   0       820    0
adam-p/markdown-here          JavaScript, CSS, HTML     59.2k  0       11.4k  0
filp/whoops                   PHP, CSS, JavaScript      13.1k  0       646    +2