It seems like he's setting temperature=0, which also means it's deterministic. Anecdotally, I've been playing with it since he posted an earlier link & it does shockingly well on 3.5 and nearly perfectly on 4 for my use cases.
(to be clear: I submitted this, but I'm not the author of the library myself)
Setting temperature to 0 does not make it completely deterministic, from their documentation:
> OpenAI models are non-deterministic, meaning that identical inputs can yield different outputs. Setting temperature to 0 will make the outputs mostly deterministic, but a small amount of variability may remain.
My understanding of LLMs is sub-par at best; could someone explain where the randomness comes from when the model temperature is 0?
I guess I was imagining that if temperature was 0, and the model was not being continuously trained, the weights wouldn’t change, and the output would be deterministic.
Is this a feature of LLMs more generally or has OpenAI more specifically introduced some other degree of randomness in their models?
It's not the LLM, but the hardware. GPU operations generally involve concurrency that makes them non-deterministic, unless you give up some speed to make them deterministic.
Specifically, as I understand it, the accumulation of rounding errors differs with the order in which floating-point operations complete and intermediate aggregates are calculated. You can avoid this by adding wait conditions so that the aggregation order is fixed even when the completion order varies, but that reduces efficient use of the available compute cores in exchange for determinism.
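The root cause is easy to demonstrate on a CPU, no GPU needed: floating-point addition is not associative, so summing the same numbers in a different order can give a slightly different result. A minimal sketch:

```python
# Floating-point addition is not associative, so the order in which
# partial sums are combined changes the rounding error in the result.
# On a GPU, thread completion order effectively reorders these additions.
a = 0.1 + (0.2 + 0.3)   # 0.2 + 0.3 rounds to exactly 0.5, then 0.1 + 0.5
b = (0.1 + 0.2) + 0.3   # 0.1 + 0.2 rounds to 0.30000000000000004
print(a == b)           # False: a is 0.6, b is 0.6000000000000001
```

Scale that up to the billions of multiply-accumulates in a forward pass, and runs whose reductions happen in different orders can land on slightly different logits. When two candidate tokens have nearly identical probabilities, that last-bit difference can flip which one wins even at temperature 0.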
Can you elaborate on the temperature parameter? Is this something you can configure in the standard ChatGPT web interface or does it require API access?
GPT basically reads the text you have input, and generates a set of 'likely' next words (technically 'tokens').
So for example, the input:
Bears like to eat ________
GPT may effectively respond with Honey (33% likelihood that honey is the word that follows the statement) and Humans (30% likelihood that humans is the word that follows this statement). GPT is just estimating what word follows next in the sequence based on all its training data.
With temperature = 0, GPT will always choose "Honey" in the above example.
With temperature != 0, GPT will add some randomness and would occasionally say "Bears like to eat Humans" in the above example.
Strangely a bit of randomness seems to be like adding salt to dinner - just a little bit makes the output taste better for some reason.
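A toy sketch of what the temperature knob does at sampling time. The token names and scores here are made up for the "Bears like to eat ___" example above, not GPT's actual vocabulary or numbers, and real decoders add tricks like top-p on top of this:

```python
import math
import random

def sample(logits, temperature):
    """Pick a next token from {token: logit} scores, softened by temperature."""
    if temperature == 0:
        # Greedy decoding: always take the single most likely token.
        return max(logits, key=logits.get)
    # Divide logits by temperature, then softmax. Low temperature sharpens
    # the distribution toward the top token; high temperature flattens it.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract max for numerical stability
    exps = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(exps.values())
    # Sample one token according to the resulting probabilities.
    r = random.random()
    cumulative = 0.0
    for tok, e in exps.items():
        cumulative += e / total
        if r < cumulative:
            return tok
    return tok  # guard against float round-off at the boundary

# Hypothetical scores for the example in this thread.
logits = {"honey": 2.0, "humans": 1.9, "rocks": -1.0}
print(sample(logits, 0))    # always "honey" (greedy)
print(sample(logits, 1.0))  # usually "honey", sometimes "humans"
```

At temperature 0 this always returns "honey"; as temperature rises, "humans" (and eventually even "rocks") gets picked more often, which is where the extra "creativity" and the occasional nonsense both come from.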
It requires API access, but once you have access you can easily play around with it in the openai playground.
Setting temperature to 0 makes the output deterministic, though in my experiments it's still highly sensitive to the inputs. What I mean is that while yes, for the exact same input you get the exact same output, changing one or two words (even ones that don't change the meaning in any way) can produce a different output.
It requires API access. temperature=0 means effectively deterministic results but possibly worse performance; higher temperature increases "creativity", for lack of a better word, but with it come hallucination & gibberish.