NVIDIA/Megatron-LM

Ongoing research training transformer models at scale

Languages: Python, C++, Other
These are the stars and forks stats for the NVIDIA/Megatron-LM repository. As of 29 Apr 2024, this repository has 6,500 stars and 1,432 forks.

Megatron (1, 2, and 3) is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA. This repository is for ongoing research on training large transformer language models at scale. We developed efficient, model-parallel (tensor, sequence, and pipeline), and multi-node pre-training of transformer-based models such as GPT, BERT, and T5 using mixed precision. Below are some of the projects where we have directly used Megatron: BERT and GPT Studies Using Megatron, BioMegatron:...
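To give a feel for the tensor-parallel idea mentioned above (splitting a single layer's weight matrix across GPUs), here is a minimal, hypothetical sketch in plain PyTorch. The class name `NaiveColumnParallelLinear` and its structure are illustrative assumptions, not Megatron-LM's actual implementation, which lives in its tensor-parallel layers and adds fused kernels, sequence parallelism, and careful initialization.

```python
# Illustrative sketch only: column-parallel split of a linear layer across a
# tensor-parallel process group. Assumes torch.distributed is initialized.
import torch
import torch.nn as nn
import torch.distributed as dist


class NaiveColumnParallelLinear(nn.Module):
    """Shards the output dimension of a linear layer across the
    tensor-parallel group; each rank holds one slice of the weight."""

    def __init__(self, in_features: int, out_features: int, tp_group=None):
        super().__init__()
        self.tp_group = tp_group
        world = dist.get_world_size(tp_group)
        assert out_features % world == 0, "output dim must divide the group size"
        # Each rank owns out_features / world columns of the full weight matrix.
        self.shard = nn.Linear(in_features, out_features // world, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Every rank computes its slice of the output with no communication...
        local_out = self.shard(x)
        # ...and an all-gather reconstructs the full activation when needed.
        world = dist.get_world_size(self.tp_group)
        gathered = [torch.empty_like(local_out) for _ in range(world)]
        dist.all_gather(gathered, local_out, group=self.tp_group)
        return torch.cat(gathered, dim=-1)
```

In practice, column-parallel and row-parallel layers are paired so that the intermediate activation never needs to be gathered at all, keeping communication down to a single all-reduce per MLP or attention block in each of the forward and backward passes.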