There's no moat on the technology, especially with China around. So the only moat is distribution. We're still in the crazy phase of serious changes and if you're betting too deep on one architecture, ah well... https://howlin-wang.github.io/svg/
Its from Character AI but they used Wan, and MMaudio. I'm not sure their licenses disallow creating a closed model from their work for commercial purposed but either way they've done nothing with a true moat, they were merely first to the table for something this all-inclusive. Even apart from their efforts, assorted tools, all open, can be used to achieve these effects , but requires more techinical knowledge to setup, and each new gen would require a fair amount of reconfiguration of modules. But this is still significantly easier than similarly available tools 9-12 months ago. As an approach it also trades turnkey from tons of control and flexibility such that competent use will still often be simpler or get to a more refined result than Sora and others.
I think the moat here will ened up being value adds for convenience, tooling, IP licensing, integration into the rest of the pipeline used for content production, etc.
Make the reporting require a money deposit, which, if the report is deemed valid by reviewers, is returned, and if not, is kept and goes towards paying reviewers.
You're asking people to risk losing their own money for the chance to... Improve someone else's LLM?
I think this could possibly work with other things of (minor) value to people, but probably not plain old money. With money, if you tried to fix the incentives by offering a potential monetary gain in the case where reviewers agree, I think there's a high risk of people setting up kickback arrangements with reviewers to scam the system.
... You want users to risk their money to make your product better? Might as well just remove the report button, so we're back at the model being poisoned.
Your solutions become more and more unfeasable. People would report less or anything at all if it costs money to do so, defeating the whole purpose of a report function.
And if you think you're being smart by gifting them money or (more likely) your "in-game" currency for "good" reports, it's even worse! They will game the system when there's money to be made, who stops a bad actor from reporting their own poison? Also who's going to review the reports and even if they finance people or AI systems to do that, isn't that bottlenecking new models if they don't want the poison training data to grow faster than it can be fixed? Let me make a claim here: nothing beats fact checking humans to this day or probably ever.
You got to understand that there comes a point when you can't beat entropy! Unless of course you live on someone else's money. ;)
Also R&D, for tax purposes, likely includes everyone at the company who touches code so there's probably a lot of operational cost being hidden in that number.
It's pretty well accepted now that for pre-training LLMs the curve is S not an exponential, right? Maybe it's all in RL post-training now, but my understanding(?) is that it's not nearly as expensive as pre-training. I don't think 3-6 months is the time to 10X improvement anymore (however that's measured), it seems closer to a year and growing assuming the plateau is real. I'd love to know if there are solid estimates on "doubling times" these days.
With the marginal gains diminishing, do we really think they're (all of them) are going to continue spending that much more for each generation? Even the big guys with the money like google can't justify increasing spending forever given this. The models are good enough for a lot of useful tasks for a lot of people. With all due respect to the amazing science and engineering, OpenAI (and probably the rest) have arrived at their performance with at least half of the credit going to brute-force compute, hence the cost. I don't think they'll continue that in the face of diminishing returns. Someone will ramp down and get much closer to making money, focusing on maximizing token cost efficiency to serve and utility to users with a fixed model(s). GPT-5 with it's auto-routing between different performance models seems like a clear move in this direction. I bet their cost to serve the same performance as say gemini 2.5 is much lower.
Naively, my view is that there's some threshold raw performance that's good enough for 80% of users, and we're near it. There's always going to be demand for bleeding edge, but money is in mass market. So if you hit that threshold, you ramp down training costs and focus on tooling + ease of use and token generation efficiency to match 80% of use cases. Those 80% of users will be happy with slowly increasing performance past the threshold, like iphone updates. Except they probably won't charge that much more since the competition is still there. But anyway, now they're spending way less on R&D and training, and the cost to serve tokens @ the same performance continues to drop.
All of this is to say, I don't think they're in that dreadful of a position. I can't even remember why I chose you to reply to, I think the "10x cheaper models in 3-6 months" caught me. I'm not saying they can drop R&D/training to 0. You wouldn't want to miss out on the efficiency of distillation, or whatever the latest innovations I don't know about are. Oh and also, I am confident that whatever the real number N is for NX cheaper in 3-6 months, a large fraction of that will come from hardware gains that are common to all of the labs.
The incremental compute costs will scale with the incremental data added, therefore training costs will grow at a much slower rate compared to when training was GPU limited.
reply