Hacker News

I like Kimi too, but they definitely have some benchmark contamination: the blog post shows a substantial comparative drop in SWE-bench Verified vs. open tests. I throw no shade - releasing these open weights is a service to humanity; really amazing.

