int_19h on March 12, 2023 | on: ChatGPT's API is so good and cheap, it makes most ...

You can run llama-30b right now on high-end consumer hardware (RTX 3090+) using int4 quantization. With two GPUs, llama-65b is within reach. And even 30b is surprisingly good, although it's clearly not as well trained as ChatGPT for dialog-style tasks specifically.
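For anyone who wants to sanity-check the memory claim, here is a rough back-of-the-envelope sketch. It assumes the nominal parameter counts implied by the model names (30e9 and 65e9) and counts weights only; activations, the KV cache, and the per-group scales that real int4 schemes store are ignored, so actual usage will be somewhat higher. The helper name `weight_gib` is made up for illustration.

```python
# Rough weight-memory estimate: parameters * bits per weight.
# Ignores activations, KV cache, and quantization metadata (scales / zero points).

def weight_gib(n_params: float, bits_per_weight: int) -> float:
    """Memory needed for the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30

# Nominal parameter counts read off the model names (an assumption).
MODELS = {"llama-30b": 30e9, "llama-65b": 65e9}
RTX_3090_GIB = 24  # VRAM on a single RTX 3090

for name, n_params in MODELS.items():
    for bits in (16, 4):
        need = weight_gib(n_params, bits)
        # Smallest number of 24 GiB cards whose combined VRAM covers the weights.
        cards = int(-(-need // RTX_3090_GIB))
        print(f"{name} @ {bits}-bit: {need:.1f} GiB of weights, >= {cards} x 24 GiB card(s)")
```

Under these assumptions the 4-bit 30b weights fit on one 24 GiB card with some headroom, while the 4-bit 65b weights need two, which matches the claim above. At 16 bits neither model fits on a single card.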