DAMO-NLP-SG/Video-LLaMA

Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Python · Tags: llama, large-language-models, video-language-pretraining, vision-language-pretraining, cross-modal-pretraining, blip2, minigpt4, multi-modal-chatgpt
Stars and forks stats for the DAMO-NLP-SG/Video-LLaMA repository: as of 5 May 2024, it has 1,674 stars and 139 forks.

This is the repo for the Video-LLaMA project, which works on empowering large language models with video and audio understanding capabilities.

News

- [08.03] 🚀🚀 Released Video-LLaMA-2 with Llama-2-7B/13B-Chat as the language decoder. No delta weights or separate Q-Former weights anymore; the full weights needed to run Video-LLaMA are all here 👉 [7B] [13B]. Further customization is supported starting from our pre-trained checkpoints...