
A couple of months might be too soon, imho. But I hope that in 2-3 years there will be a model with similar performance but a much smaller size, small enough to run incredibly fast inference and training on my laptop. OpenAI might need to rethink their moat in case that happens.

Think about all the smart ML researchers in academia. They can't afford to train large models on large datasets, and decades of their work are being made obsolete by OpenAI's brute-force approach. They've got all the motivation in the world to work on smaller models.



I actually don't think that we will make significant advancements in reducing model size before we make significant advances in increasing available power and compute.

One reason is that the pressure is still on for models to be bigger and more power-hungry, as many believe compute will continue to be the deciding factor in model performance for some time. It's not a coincidence that OpenAI's CEO, Sam Altman, also backs a fusion energy R&D company.


But processing hardware has been seeing diminishing returns for years. My CPU from 2013 is still doing what I need; a 1993 processor in 2003 would have been useless.

Where do you see hardware improvements coming from?


Specifically in AI there is huge room for improvement with things like optical computing. AI processing doesn't need to be completely deterministic, as shown by the fact that we are quantizing LLaMA down to 4-bit without too much of a drop in performance. Once you drop that requirement, you open the door to using much, much more efficient analog circuits. How do I invest in optical computing...
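To make the "doesn't need to be completely deterministic" point concrete, here's a toy sketch of symmetric 4-bit quantization in plain Python. This is not GPTQ (which does a more careful, error-compensating layer-wise quantization); it just shows that mapping float weights onto 16 integer levels keeps every value within half a quantization step of the original:

```python
# Toy symmetric 4-bit quantization: each weight maps to an integer in -8..7,
# scaled so the largest-magnitude weight lands on +/-7.
def quantize_4bit(weights):
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.12, -0.53, 0.97, -0.08, 0.44]
q, scale = quantize_4bit(weights)
restored = dequantize(q, scale)

# Round-to-nearest guarantees the reconstruction error is at most scale/2.
assert all(abs(w - r) <= scale / 2 + 1e-9 for w, r in zip(weights, restored))
```

The same tolerance-for-imprecision is what analog and optical hardware would exploit: if the network survives losing most of its weight precision, it can survive analog noise too.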


Training these nets mostly occurs on GPUs. CPUs are often hamstrung by their serial performance (e.g. operations that depend on the output of a previous operation end up stalling instruction pipelines.) GPUs still have a decent amount of room to parallelize and maximize compute.


Can’t ChatGPT design better hardware ?


Only Deep Thought can do that. And the answer is still 42.


Good to know!


No but maybe GPT-5 will be able to


LLaMA-13B 4bit is around 7GB and beats GPT-3.5 175B in benchmarks.

It runs at acceptable speeds on a MacBook Air M1 or a $100 consumer video card.
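The ~7GB figure checks out with back-of-envelope math (the extra beyond the raw weights is an assumption on my part: per-group scales and other metadata that 4-bit formats carry):

```python
# 13B parameters at 4 bits each = 0.5 bytes per weight.
params = 13e9
weight_bytes = params * 4 / 8
gib = weight_bytes / 2**30
print(round(gib, 1))  # ~6.1 GiB for the weights alone; ~7GB on disk with metadata
```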


There is a drop in performance after quantization, so benchmarks of the full-precision model aren't directly relevant. Here are some benchmarks for the quantized models: https://github.com/qwopqwop200/GPTQ-for-LLaMa



