
I tried this (gpt-oss-120b on Cerebras) with Roo Code. It repeatedly failed to use the tools correctly, and then I got a 429 Too Many Requests error. So much for the "as fast as I can think" idea!

I'll have to try again later, but it was a bit underwhelming.

The latency also seemed pretty high, not sure why. With that much latency, I think the throughput ends up not making much difference.
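
To put rough numbers on that, here is a back-of-the-envelope sketch. The latency and response length are made-up assumptions for illustration, not anything I measured, and `response_time` is just a hypothetical helper:

```python
def response_time(latency_s, tokens, tps):
    """Wall-clock seconds for one request: fixed latency plus generation time."""
    return latency_s + tokens / tps

# Hypothetical numbers for illustration only, not measured values.
latency_s = 2.0   # assumed delay before the first token arrives
tokens = 500      # assumed response length in tokens

for tps in (200, 1000, 4000):
    total = response_time(latency_s, tokens, tps)
    print(f"{tps:>5} TPS -> {total:.2f} s total")
```

Under these assumptions, going from 1000 to 4000 TPS saves well under half a second per request, while the fixed 2 s of latency stays the same.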

Btw Groq has the 20b model at 4000 TPS but I haven't tried that one.


