
I would guess the “secret sauce” here is distillation: pretraining on an extremely high-quality synthetic dataset built from the prompted output of their state-of-the-art models like o3, rather than on generic internet text. A number of research results have shown that highly curated technical problem-solving data is unreasonably effective at boosting smaller models’ performance.

This would be much more efficient than relying purely on RL post-training on a small model; with low baseline capabilities, the useful learning signal would be very sparse and the training very inefficient.
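
To make the idea concrete, here is a minimal sketch of that kind of synthetic-data distillation, not anything from OpenAI's actual pipeline: prompt a strong teacher on curated problems, save the transcripts, and fine-tune the student on them with ordinary supervised learning. `query_teacher` is a hypothetical placeholder for a frontier-model API call; the fine-tuning step itself is assumed to be a standard next-token-prediction loop.

    # Sketch: build a distillation dataset from a teacher model's outputs.
    # `query_teacher` is a hypothetical stand-in for a real model API call.
    import json
    from typing import Callable

    def build_distillation_set(problems: list[str],
                               query_teacher: Callable[[str], str],
                               out_path: str) -> None:
        """Prompt the teacher on curated problems and write (prompt, completion)
        pairs as JSONL for supervised fine-tuning of a small student model."""
        with open(out_path, "w", encoding="utf-8") as f:
            for problem in problems:
                solution = query_teacher(
                    "Solve step by step, showing your reasoning:\n" + problem
                )
                f.write(json.dumps({"prompt": problem,
                                    "completion": solution}) + "\n")

    if __name__ == "__main__":
        # Stand-in teacher so the sketch runs without any external API.
        fake_teacher = lambda p: "worked solution for: " + p
        build_distillation_set(["Prove that sqrt(2) is irrational."],
                               fake_teacher, "distill.jsonl")

The point of the contrast with RL is that every example here carries a dense supervised signal, whereas a weak student exploring on its own rarely stumbles onto rewarding trajectories.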



> research results have shown that highly curated technical problem solving data is unreasonably effective at boosting smaller models’ performance.

The same seems to be true for humans.


Yes, if I understand correctly, what it means is "a very smart teacher can do wonders for their pupils' education".


Wish they gave us access to learn from those grandmother models instead of distilled slop.


It behooves them to keep the best stuff internal, or at least greatly limit any API usage to avoid giving the goods away to other labs they are racing with.


Which, presumably, is the reason they removed 4.5 from the API: the only people willing to pay that much for that model were mostly their competitors. (I mean, I would pay even more than they were charging, but even if I scaled out my use cases, which for just me are mostly satisfied by being trapped in their UI, I imagine it would be a pittance compared to the simpler stuff people keep using.)



