
My impression is that Sonnet and Haiku 4.5 are the same "base models" as Sonnet and Haiku 4; the improvements come from fine-tuning on data generated by Opus.

I'm a user who follows the space but doesn't develop or work on these models, so I don't actually know anything, but this seems like standard practice (using the biggest model to fine-tune smaller models).

Certainly, GPT-4 Turbo was a smaller model than GPT-4; there's not really any other good explanation for why it was so much faster and cheaper.

The explicit reason that OpenAI obfuscates reasoning tokens is to prevent competitors from training their own models on them.



These frontier model companies are bootstrapping their work by using models to improve models. It's a mechanism for generating synthetic training data. The rationale is that the teacher model is already vetted and aligned, so it can reliably "mock" data. A little human data gets amplified.
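In outline, that loop looks something like the sketch below. This is a minimal illustration only: teacher_generate() and accept() are hypothetical stand-ins for whatever internal APIs a lab would actually use, and nothing here reflects Anthropic's or OpenAI's real pipelines.

    # Sketch of synthetic-data distillation. teacher_generate() and
    # accept() are hypothetical; they stand in for a lab's internal APIs.
    from typing import Callable

    def build_synthetic_dataset(
        prompts: list[str],
        teacher_generate: Callable[[str], str],   # e.g. sample from Opus
        accept: Callable[[str, str], bool],       # reward model / filters
    ) -> list[tuple[str, str]]:
        """Keep only teacher completions that pass vetting."""
        dataset = []
        for prompt in prompts:
            completion = teacher_generate(prompt)
            if accept(prompt, completion):
                dataset.append((prompt, completion))
        return dataset

    # The student (e.g. Sonnet) is then fine-tuned on these
    # (prompt, completion) pairs with a standard supervised loss.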


Which is all to say that I think the reason they went straight from Opus 3 to Opus 4 is that there was no bigger model to fine-tune an Opus 3.5 with.

And I would expect Opus 4 to be much the same.


But Sonnet 4.5 outperforms Opus 4 on most benchmarks and tasks; that can't be all there is to it.


That's not all there is to it, but I think "the rest of it" is just additional fine-tuning.

Benchmarks are good fixed targets for fine-tuning, and I think Sonnet gets significantly more fine-tuning than Opus. Sonnet has more users, which is a strategic reason to focus on it, and it's less expensive to fine-tune, if the two models' API prices are any indication.
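For concreteness, that "additional fine-tuning" step is just supervised training on (prompt, completion) pairs, benchmark-formatted or otherwise. A minimal sketch, assuming Hugging Face transformers conventions (a causal LM whose forward pass accepts labels and returns a loss, with -100 as the ignore index); any lab's actual setup is of course unknown.

    def sft_step(student, tokenizer, prompt, completion, optimizer):
        """One supervised fine-tuning step; loss only on completion tokens."""
        ids = tokenizer(prompt + completion, return_tensors="pt").input_ids
        prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
        labels = ids.clone()
        labels[:, :prompt_len] = -100   # mask prompt tokens (HF ignore index)
        loss = student(input_ids=ids, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        return loss.item()

Pointing a loop like this at benchmark-shaped prompts is what makes benchmarks "fixed targets" in the sense above.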



