Hacker News

LLaMA-13B quantized to 4-bit is around 7 GB and outperforms GPT-3 (175B) on most benchmarks.

It runs at acceptable speeds on a MacBook Air M1 or a $100 consumer video card.
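The ~7 GB figure checks out as back-of-the-envelope arithmetic: 13B parameters at 4 bits each, plus per-group quantization metadata. A quick sketch (the group size of 128 and fp16 scale/zero-point per group are assumptions about a typical GPTQ setup, not stated in the thread):

```python
# Rough size estimate for a 4-bit quantized 13B-parameter model.
params = 13e9
bits_per_weight = 4
raw_gb = params * bits_per_weight / 8 / 1e9  # packed weights: 6.5 GB

# Assumed quantization overhead: one fp16 scale and one fp16 zero-point
# per group of 128 weights (typical GPTQ-style grouping).
group_size = 128
overhead_gb = (params / group_size) * 2 * 2 / 1e9  # 2 values * 2 bytes each

total_gb = raw_gb + overhead_gb
print(f"{raw_gb:.1f} GB weights + {overhead_gb:.2f} GB metadata ~= {total_gb:.1f} GB")
```

That lands right around 7 GB, which is why the model fits in the memory of an 8 GB laptop or a budget GPU.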



Quantization does cost some accuracy, so benchmarks of the full-precision model don't carry over directly. Benchmarks for the quantized models are here: https://github.com/qwopqwop200/GPTQ-for-LLaMa
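For intuition on where that accuracy drop comes from: squeezing weights onto a 16-level grid introduces reconstruction error. A minimal round-trip sketch with naive round-to-nearest symmetric quantization (GPTQ itself is smarter, compensating for error layer by layer; the weight scale of 0.02 is an assumed, typical value):

```python
import numpy as np

# Round-trip a synthetic weight row through 4-bit quantization and
# measure how much of the signal is lost.
rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=4096).astype(np.float32)  # assumed typical weight scale

# Symmetric 4-bit: integer levels -8..7, one scale for the whole row.
scale = np.abs(w).max() / 7
q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)  # what gets stored
w_hat = q * scale                                        # dequantized weights

rel_err = np.linalg.norm(w - w_hat) / np.linalg.norm(w)
print(f"relative reconstruction error: {rel_err:.3f}")
```

The per-weight error is small but nonzero, and it accumulates across dozens of layers, which is why the quantized model needs its own benchmark numbers rather than inheriting the fp16 ones.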



