
Layman question: Does the ability to "understand" such orders arise from one of the many layers that transform the tokens, is it hidden in the parameters/weights, or is it due to some special type of attention head?

