
A couple of months might be too soon, imho. But I hope that in 2-3 years there will be a model with similar performance but a much smaller size, small enough to run incredibly fast inference and training on my laptop. OpenAI might need to rethink their moat in case that happens.

Think about all the smart ML researchers in academia. They can't afford to train large models on large datasets, and decades of their work are being made obsolete by OpenAI's brute-force approach. They've got all the motivation in the world to work on smaller models.



I actually don't think that we will make significant advancements in reducing model size before we make significant advances in increasing available power and compute.

One reason is that the pressure is still on for models to be bigger and more power-hungry, as many believe compute will continue to be the deciding factor in model performance for some time. It's not a coincidence that OpenAI's CEO, Sam Altman, also backs a fusion energy R&D company.


But processing hardware has been seeing diminishing returns for years. My CPU from 2013 is still doing what I need; a 1993 processor in 2003 would have been useless.

Where do you see hardware improvements coming from?


Specifically in AI there is huge room for improvement with things like optical computing. AI processing doesn't need to be completely deterministic, as shown by the fact that we are quantizing LLaMA down to 4-bit without too much of a drop in performance. Once you drop that requirement, you open the door to using much, much more efficient analog circuits. How do I invest in optical computing...
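To make the "doesn't need to be completely deterministic" point concrete, here's a toy sketch of symmetric 4-bit quantization in plain Python. This is not GPTQ (which does a more careful, error-compensating layer-wise quantization); it just shows that mapping float weights onto 16 integer levels keeps every value within half a quantization step of the original:

```python
# Toy symmetric 4-bit quantization: each weight maps to an integer in -8..7,
# scaled so the largest-magnitude weight lands on +/-7.
def quantize_4bit(weights):
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.12, -0.53, 0.97, -0.08, 0.44]
q, scale = quantize_4bit(weights)
restored = dequantize(q, scale)

# Round-to-nearest guarantees the reconstruction error is at most scale/2.
assert all(abs(w - r) <= scale / 2 + 1e-9 for w, r in zip(weights, restored))
```

The same tolerance-for-imprecision is what analog and optical hardware would exploit: if the network survives losing most of its weight precision, it can survive analog noise too.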


Training these nets mostly occurs on GPUs. CPUs are often hamstrung by their serial performance (e.g. operations that depend on the output of a previous operation end up stalling instruction pipelines.) GPUs still have a decent amount of room to parallelize and maximize compute.


Can’t ChatGPT design better hardware ?


Only Deep Thought can do that. And the answer is still 42.


Good to know!


No but maybe GPT-5 will be able to


LLaMA-13B 4bit is around 7GB and beats GPT-3.5 175B in benchmarks.

It runs at acceptable speeds on a MacBook Air M1 or a $100 consumer video card.
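The ~7GB figure checks out with back-of-envelope math (the extra beyond the raw weights is an assumption on my part: per-group scales and other metadata that 4-bit formats carry):

```python
# 13B parameters at 4 bits each = 0.5 bytes per weight.
params = 13e9
weight_bytes = params * 4 / 8
gib = weight_bytes / 2**30
print(round(gib, 1))  # ~6.1 GiB for the weights alone; ~7GB on disk with metadata
```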


There is a drop in performance after quantization, so benchmarks of the full-precision model aren't directly relevant. Here are some benchmarks for the quantized models: https://github.com/qwopqwop200/GPTQ-for-LLaMa



