JonasGeiping/cramming

Cramming the training of a (BERT-type) language model into limited compute.

Python, Shell, machine-learning, language-model, english-language
Star and fork statistics for the JonasGeiping/cramming repository: as of 07 May 2024, it has 1144 stars and 87 forks.

Cramming Language Model (Pretraining)

This repository contains code to replicate our research described in "Cramming: Training a Language Model on a Single GPU in One Day". We experiment with pretraining a BERT-type language model with limited compute, wondering "how bad can it really be?"

You can find our paper here: https://arxiv.org/abs/2212.14034, and the abstract below:

Recent trends in language modeling have focused on increasing performance through scaling, and have resulted in an environment...