I suspect this emerged organically from user-feedback RLHF, i.e. thumbs-up/down voting in the apps. People LIKE being treated this way, so the model converges in that direction.

It's the same as social media converging on rage bait: the user base subconsciously LIKES it. Nobody at the companies explicitly added that to the training of the content recommendation models. I know this for the latter, as I was there.


