Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Even large hosted models fail at that task regularly. It's a silly anecdotal example, but I asked the Gemini assistant on my Pixel whether [something] had seen a new release to match the release of [upstream thing].

It correctly chose to search, and pulled in the release page itself as well as a community page on reddit, and cited both to give me the incorrect answer that a release had been pushed 3 hours ago. Later on when I got around to it, I discovered that no release existed, no mention of a release existed on either cited source, and a new release wasn't made for several more days.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: