My conversation: https://imgur.com/a/KzLKdQF
All the answers in the picture are true.
I like to think that I understand LLMs pretty well, which is why I was so underwhelmed by most of the mainstream "AI" news. But this threw me for a loop.
As a predictor, how can it model base64? It surely can't just be "pretending" like it does with all other stuff.
The precision is what feels most wrong to me: it encodes long random strings perfectly.
Why does it then fail at simple arithmetic?
It's not perfect though. I tested it on a few sentences of text and it made a few mistakes. Due to the way that GPT tokenizes the input text, it can't really generalize the pattern, since the mapping of text to tokens is somewhat arbitrary. It effectively has to learn how to map every unique combination of 3 characters to 4 base64 digits, of which there are up to 2^24 = 16,777,216 distinct mappings. On top of that, the number of characters in each token varies, which shifts the 3-character alignment and can also lead to mistakes.
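To make the 3-to-4 mapping concrete, here's a short sketch using Python's standard `base64` module (the example strings are my own, not from the conversation):

```python
import base64

# Base64 encodes input in 3-byte groups, and each group becomes 4 output
# characters, so there are at most 256**3 = 2**24 distinct group-to-quad
# mappings to learn.
for text in ["abc", "abd", "xabc"]:
    encoded = base64.b64encode(text.encode("ascii")).decode("ascii")
    print(f"{text!r} -> {encoded!r}")

# 'abc'  -> 'YWJj'
# 'abd'  -> 'YWJk'
# 'xabc' -> 'eGFiYw=='   (the 'abc' substring no longer encodes to 'YWJj',
#                         because the single leading byte shifts the 3-byte
#                         alignment of everything after it)
```

The last case is the key point: a one-character prefix changes the encoding of every character that follows, so the model can't just memorize per-character translations.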
You can use this tool to see how GPT3 maps text to tokens and token IDs: https://platform.openai.com/tokenizer
As an example, the alphabet "abcdefghijklmnopqrstuvwxyz" maps to [39305, 4299, 456, 2926, 41582, 10295, 404, 80, 81, 301, 14795, 86, 5431, 89]. This is what I mean by the mapping being fairly arbitrary.
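The alignment problem is easy to demonstrate with the standard library: encoding a string piece by piece and concatenating the results does not reproduce the encoding of the whole string, because base64 groups bytes in threes. A minimal sketch (the split of "hello world" into two pieces stands in for a hypothetical token boundary):

```python
import base64

def b64(s: str) -> str:
    return base64.b64encode(s.encode("ascii")).decode("ascii")

# Pretend the tokenizer splits "hello world" into "hello" and " world".
whole = b64("hello world")
piecewise = b64("hello") + b64(" world")

print(whole)      # 'aGVsbG8gd29ybGQ='
print(piecewise)  # 'aGVsbG8=IHdvcmxk' -- not the same string
```

So the model can't encode each token independently; it has to track how every token boundary lands relative to the 3-byte grouping.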