
LLMs do not model "certainty"; that's illogical. They model the language corpus you feed them.


Essentially all modern machine learning techniques have internal mechanisms that are closely aligned with certainty. For example, the output of a binary classifier is typically a floating-point number in the range [0, 1], with 0 representing one class and 1 the other. A value of 0.5 essentially means "I don't know," and values in between give both an answer (round to the nearest integer) and a sense of certainty (how close the output is to that integer). LLMs offer an analogous set of statistics.
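
As a rough illustration of the kind of statistics being described, here is a minimal PyTorch sketch. The tensors are placeholders rather than real model outputs, and the confidence measures (distance from 0.5, top-token probability, entropy) are common proxies, not anything specific to a particular model:

    import torch
    import torch.nn.functional as F

    # Binary classifier: a sigmoid maps a logit to [0, 1].
    logit = torch.tensor(0.3)              # placeholder logit
    p = torch.sigmoid(logit)
    prediction = int(p.round())            # round to the nearest class
    confidence = abs(p.item() - 0.5) * 2   # 0 = "I don't know", 1 = fully certain

    # LLM analogue: a softmax over the vocabulary for the next token.
    vocab_logits = torch.randn(50_000)     # placeholder next-token logits
    probs = F.softmax(vocab_logits, dim=-1)
    top_p, top_id = probs.max(dim=-1)      # probability mass on the chosen token
    entropy = -(probs * probs.log()).sum() # low entropy roughly means high certainty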

Speaking more abstractly or philosophically, why couldn't a model internalize something read between the lines? Humans do, and we're part of the same physical system: we're already our own kind of computer, taking away more from a text than what is explicitly there. It's possible.


You don't have to teach a transformer model with a language corpus, even if that's how it was pretrained. You can, for example, write algorithms directly and merge them into the model (see the sketch after the links below).

https://github.com/yashbonde/rasp

https://github.com/arcee-ai/mergekit
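
To make the merging idea concrete, here is a naive sketch of linear parameter interpolation between two models with identical architectures. This is not mergekit's actual API (mergekit is config-driven and also implements SLERP, TIES, and other schemes); the function and variable names below are hypothetical:

    import torch

    def linear_merge(state_dict_a, state_dict_b, alpha=0.5):
        """Interpolate two models' parameters tensor-by-tensor.

        Simplest flavor of model merging ("linear" / model soup);
        assumes both state dicts come from the same architecture.
        """
        merged = {}
        for name, tensor_a in state_dict_a.items():
            tensor_b = state_dict_b[name]
            merged[name] = alpha * tensor_a + (1 - alpha) * tensor_b
        return merged

    # Hypothetical usage with two fine-tunes of the same base model:
    # model.load_state_dict(linear_merge(ft1.state_dict(), ft2.state_dict(), alpha=0.3))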


Recent research using SAEs suggests that some neurons regulate confidence/certainty: https://arxiv.org/abs/2406.16254
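
For context, a sparse autoencoder (SAE) of the kind used in this line of interpretability work looks roughly like the sketch below. It is trained to reconstruct a model's residual-stream activations through an overcomplete, sparsity-penalized bottleneck; the dimensions, loss weighting, and names here are assumptions, not the paper's setup:

    import torch
    import torch.nn as nn

    class SparseAutoencoder(nn.Module):
        """Minimal SAE for finding interpretable directions in LLM activations."""

        def __init__(self, d_model: int, d_hidden: int):
            super().__init__()
            self.encoder = nn.Linear(d_model, d_hidden)
            self.decoder = nn.Linear(d_hidden, d_model)

        def forward(self, activations: torch.Tensor):
            features = torch.relu(self.encoder(activations))  # sparse feature codes
            reconstruction = self.decoder(features)
            return reconstruction, features

    def sae_loss(reconstruction, activations, features, l1_coeff=1e-3):
        # Reconstruction error plus an L1 penalty that encourages sparse features.
        mse = (reconstruction - activations).pow(2).mean()
        return mse + l1_coeff * features.abs().sum(dim=-1).mean()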



