tatsu-lab/alpaca_eval

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Languages: Jupyter Notebook, Python. Topics: nlp, deep-learning, leaderboard, evaluation, instruction-following, foundation-models, large-language-models, rlhf.
Stars and forks stats for the tatsu-lab/alpaca_eval repository. As of 05 May 2024, this repository has 658 stars and 90 forks.

AlpacaEval: An Automatic Evaluator for Instruction-following Language Models

Evaluation of instruction-following models (e.g., ChatGPT) typically requires human interaction, which is time-consuming, expensive, and hard to replicate. AlpacaEval is an LLM-based automatic evaluation that is fast, cheap, replicable, and validated against 20K human annotations. It is particularly useful for model development. Although we improved over prior automatic evaluation pipelines, there are still fundamental...
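At its core, this kind of evaluator has an LLM judge compare each model output against a reference output and then reports an aggregate win rate. A minimal sketch of that aggregation step, assuming per-example judgments are already available (the `win_rate` helper and the tie-counts-as-half-a-win convention are illustrative, not the library's actual API):

```python
from statistics import mean

def win_rate(preferences):
    """Aggregate per-example judge preferences into a win rate (%).

    `preferences` holds one entry per instruction: "win" if the
    model's output was preferred over the reference's, "loss" if
    the reference's was preferred, "tie" otherwise. A tie counts
    as half a win.
    """
    scores = {"win": 1.0, "tie": 0.5, "loss": 0.0}
    return 100 * mean(scores[p] for p in preferences)

# 3 wins, 1 tie, 1 loss over 5 instructions -> 70.0% win rate
print(win_rate(["win", "win", "win", "tie", "loss"]))
```

Because the judge is an LLM rather than a human annotator, the whole pipeline is replicable: re-running it on the same outputs yields the same win rate (up to sampling temperature).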
| Repo | Languages | Stars | Weekly | Forks | Weekly |
| --- | --- | --- | --- | --- | --- |
| cszn/FFDNet | MATLAB, Other | 412 | 0 | 124 | 0 |
| facebookresearch/ijepa | Python | 2.3k | 0 | 369 | 0 |
| Victorwz/LongMem | Python, Shell, Cuda | 654 | +4 | 105 | 0 |
| uzh-rpg/RVT | Python | 226 | 0 | 29 | 0 |
| sinsinology/CVE-2023-20887 | Ruby, Python | 220 | 0 | 44 | 0 |
| spyglass-search/spyglass | Rust, HTML, JavaScript | 2.2k | 0 | 45 | 0 |
| aorumbayev/autogpt4all | Python, Shell | 350 | +4 | 50 | 0 |
| openSIL/openSIL | C, Python, Assembly | 250 | 0 | 16 | 0 |
| iburzynski/jambhala | Haskell, Python, Shell | 18 | +1 | 33 | +1 |
| shade-econ/nber-workshop-2023 | HTML, Jupyter Notebook, Python | 57 | 0 | 28 | 0 |