Also, next year, there will be GPT 5. I find it fascinating how much attention s...

svnt · on Sept 21, 2024

I’ll flip this around a bit:

If I’ve raised $1B to buy GPUs and train a “bigger model”, a major part of my competitive advantage is having $1B to spend on sufficient GPUs to train a bigger model.

If, after having raised that money, it becomes apparent that consumer hardware can run smaller models that are optimized and perform as well without all that money going into training them, how am I going to pivot my business to something that works, given these smaller models are released this way on purpose to undermine my efforts?

It seems there are two major possibilities: one, people raising billions find a new and expensive intelligence step function that at least time-locally separates them from the pack, or two (and significantly more likely in my view) they don’t, and the improvements come from layering on different systems that do not require acres of GPUs, while the “more data more GPUs” crowd is found to have hit a nonlinearity that in practical terms means they are generations of technology away from the next tier.

rvnx · on Sept 21, 2024

Mining cryptos, some "AI" companies already do that (knowingly or not... and not necessarily telling investors)

svnt · on Sept 21, 2024

Is it still even worth the electricity to do this on a GPU? It wouldn’t surprise me if some startups were renting them out, but is anyone still mining any volume of crypto on GPUs?

edit: I guess to your point if it is not knowingly then the electricity costs are not a factor either.

ComputerGuru · on Sept 21, 2024

> Is it still even worth the electricity to do this on a GPU?

Only with memecoins.

jstummbillig · on Sept 21, 2024

What you suggest is not impossible but simply flies in the face of all currently available evidence and what all leading labs say and do. We know they are actively looking for ways to do things more efficiently. OpenAI alone did a couple of releases to that effect. Because of how easy it is to switch providers, if only one lab found a way to run a small model that competed with the big ones, it would simply win the entire space, so everyone has to be looking for that (and clearly they are, given that all of them do have smaller versions of their models)

Scepticism is fine, if it's plausible. If not it's conspiratorial.

svnt · on Sept 21, 2024

There are at least two different optimizations happening:

1) optimizing the model training

2) optimizing the model operation

The $1B-spend holy grail is that it costs a lot of money to train, and almost nothing to operate, a proprietary model that benchmarks and chats better than anyone else’s.

OpenAI’s optimizations fall into the latter category. The risk to the business model is in the former — if someone can train a world-beating model without lots of money, it’s a tough day for the big players.

ComputerGuru · on Sept 21, 2024

I disagree. Not axiomatically because you’re kind of right, but enough to comment. OpenAI doesn’t believe in optimizing the training costs of AI but believes in optimizing (read: maxing) the training period. Their billions go to collecting, collating, and transforming as much training data as they can get their hands on.

To see what optimizing model operation looks like, groq is a good example. OpenAI isn’t (yet) obviously in that kind of optimization, though I’m sure they’re working on it internally.

svnt · on Sept 22, 2024

My argument wasn’t that the well-funded entities were optimizing to reduce training costs, but the opposite: they need creative ways to spend $1B that provide some tangible advantage. But they need operating costs to be low or they lose money and try to somehow make it up on volume.

I would roll data acquisition/cleaning processes into training costs for purposes of this because what else is the data for if not training?

If 4o wasn’t an optimization for model operation costs what was it?

Larrikin · on Sept 21, 2024

Why would anyone buy a Raspberry Pi when they can get a fully decked out Mac Pro?

There are different use cases and computers are already pretty powerful. Maybe your local model won't be able to produce tests that check all the corner cases of the class you just wrote for work in your massive code base.

But the small model is perfectly capable of summarizing the weather from an API call and maybe tacking on a joke that can be read out to you on your speakers in the morning.

talldayo · on Sept 21, 2024

> Why would anyone buy a Raspberry Pi when they can get a fully decked out Mac Pro?

They want compliant Linux drivers?

MrDrMcCoy · on Sept 21, 2024

Since when did Broadcom provide those?

talldayo · on Sept 21, 2024

Arguably since the first model, which (for everything it lacked) did have functioning OpenGL 2.0-compliant drivers.

MrDrMcCoy · on Sept 22, 2024

My memory is fuzzy, but I recall that some models had very limited hardware acceleration support in the driver stack for things like video codecs, OpenCL, and Vulkan, unless you used the official kernel with the Broadcom blob. I never liked running that due to bloat and the age of the kernel/Debian they ship. All that combined with the performance of the SoC compared to its peers from Rockchip/Mediatek/Samsung and lack of eMMC support pretty much drove me away from Raspberry Pi devices in favor of Radxa and ODROID boards.

archagon · on Sept 21, 2024

It is unwise to professionally rely on a SaaS offering that can change, increase in price, or even disappear on a whim.

jabroni_salad · on Sept 21, 2024

One of the reasons I run local is that the models are completely uncensored and unfiltered. If you're doing anything slightly 'risky' the only thing APIs are good for is a slew of very politely written apology letters, and the definition of 'risky' will change randomly without notice or fail to accommodate novel situations.

It is also evident in the moderation that your usage is subject to human review and I don't think that should even be possible.

stevenhuang · on Sept 21, 2024

As small models get more capable there will be a growing number of use cases that they'll be able to do competently. Is that so hard to believe?

Leave the problems that require competent reasoning ability to the larger models.

Tempest1981 · on Sept 21, 2024

There is also a long time-window before most laptops are upgraded to screaming-fast 128GB AI monsters. Either way, it will be fun to watch the battle.