Star and fork statistics for the wangsiping97/FastGEMV repository. As of 11 May 2024, this repository has 26 stars and 0 forks.
FastGEMV

This repository provides a collection of kernel functions for high-speed computation of GEMV (general matrix-vector multiplication). The following scenarios are implemented and benchmarked: matrix: fp16, vector: fp16; matrix: int8 (quantized with fp16 scale/zero point), vector: fp16; matrix: int4 (quantized with fp16 scale/zero point), vector: fp16. The matrix and vector sizes range from 512 to 16384. On P100 GPUs, the kernels achieved a maximum speedup of 2.7x over the PyTorch baseline....
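To make the quantized scenario concrete, here is a minimal CPU reference sketch in plain Python. It is not the repository's kernel code. It assumes the common affine dequantization rule `(q - zero_point) * scale` and one scale/zero-point pair per matrix row; both the rule and the per-row granularity are assumptions, and the function names `dequantize` and `gemv_int8` are hypothetical.

```python
def dequantize(q, scale, zero_point):
    # Assumed affine scheme: real value = (quantized - zero_point) * scale.
    return (q - zero_point) * scale


def gemv_int8(q_matrix, scales, zero_points, vector):
    """Reference GEMV y = W x, where W is stored as int8 values per row.

    q_matrix:    list of rows, each a list of ints in [-128, 127]
    scales:      one float per row (assumed granularity)
    zero_points: one number per row (assumed granularity)
    vector:      list of floats, same length as each row
    """
    result = []
    for row, scale, zero_point in zip(q_matrix, scales, zero_points):
        acc = 0.0
        for q, x in zip(row, vector):
            # Dequantize each weight on the fly, then multiply-accumulate.
            acc += dequantize(q, scale, zero_point) * x
        result.append(acc)
    return result
```

For example, `gemv_int8([[1, 2], [3, 4]], [0.5, 2.0], [0, 1], [1.0, 2.0])` dequantizes the rows to `[0.5, 1.0]` and `[4.0, 6.0]` before taking each dot product with the vector. A GPU kernel would typically split each row across threads and reduce the partial sums, but that part is not shown here.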
repo | techs | stars | stars (weekly) | forks | forks (weekly) |
---|---|---|---|---|---|
openwebf/webf | Dart, C++, JavaScript | 1k | 0 | 80 | 0 |
SWMFsoftware/GITM2 | Fortran, Python, IDL | 0 | 0 | 0 | 0 |
letianzj/quanttrader | HTML, Python, JavaScript | 358 | 0 | 90 | 0 |
Mehdi-H/WeeklyCuration | Makefile | 20 | 0 | 0 | 0 |
sylefeb/a5k | Shell, Python, Makefile | 252 | 0 | 7 | 0 |
weaviate/weaviate-io | MDX, Python, JavaScript | 36 | 0 | 94 | 0 |
facebookresearch/playtorch | MDX, TypeScript, C++ | 806 | 0 | 105 | 0 |
nf-core/marsseq | Nextflow, Perl, Python | 4 | 0 | 1 | 0 |
baichuan-inc/Baichuan-13B | Python | 2.7k | 0 | 195 | 0 |
guoyww/AnimateDiff | Python, Shell | 4.5k | +365 | 353 | +23 |