Possibility: The human brain and GPT-3 are doing radically different things and aren't even comparable. GPT-3 is merely memorizing enough language to pass a Turing test, whereas human brains are actually learning how to use language to communicate with other humans.
Evidence: Have GPT-3 write your work emails for a day, and then live with the consequences for the next week. It's going to produce text that makes sense, but only in a very particular way. You will end up with email threads where an outsider says "yeah, looks like work emails; I believe this is two humans." And that's very impressive! But your actual interlocutor, who understands the full context of the conversation and relationship, will legitimately worry for your mental health and maybe even contact your manager.
Conclusions:
1. Any time you invent a test, people will find ways to pass it with flying colors while completely missing its point. The Turing test is no different.
2. Being able to imitate a human well enough in a five-minute general English conversation is only so useful. There's a reason we don't pay people living wages to write internet comments. This isn't to say that GPT-3 is useless, though. There is certainly demand for very specialized five-minute conversationalists that come at zero marginal cost. I'm worried.
3. We still have no clue how to even begin to approach AGI.
I think that you make an interesting point: these transformer models can produce reasonable text only when freed from context, whereas the genuinely useful thing is to produce reasonable text that takes context into account.
But how far are we really from going beyond the five-minute conversation you mention? What if you combine a transformer text model with a chess engine to produce a chess tutor? It doesn't seem like we are too far from a computer program that could teach chess from first principles (i.e., learn chess itself, and teach it in a meaningful way).
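To make that concrete, here's a minimal sketch of what the combination might look like, assuming python-chess plus a local Stockfish binary for the engine side; `explain_position` is a hypothetical stand-in for whatever text model would generate the actual teaching commentary, and the prompt wording is made up for the example:

```python
import chess
import chess.engine

def explain_position(fen: str, best_move: str, score: str) -> str:
    # Hypothetical: here you would prompt a transformer model, e.g.
    # "Position: {fen}. The engine recommends {best_move} ({score}).
    #  Explain why to a beginner."
    return f"In this position the engine prefers {best_move} ({score})."

def tutor_comment(board: chess.Board, engine: chess.engine.SimpleEngine) -> str:
    # The chess engine supplies the ground truth...
    info = engine.analyse(board, chess.engine.Limit(depth=15))
    best_move = board.san(info["pv"][0])      # keys depend on what the engine reports
    score = str(info["score"].white())
    # ...and the text model turns it into an explanation.
    return explain_position(board.fen(), best_move, score)

if __name__ == "__main__":
    board = chess.Board()
    board.push_san("e4")
    board.push_san("e5")
    engine = chess.engine.SimpleEngine.popen_uci("stockfish")
    print(tutor_comment(board, engine))
    engine.quit()
```

The engine keeps the tutor honest about what the good moves actually are; the text model is only responsible for saying it in a human way.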
Maybe in combination with economic models, it could teach economics?
What else? Perhaps in combination with Wikipedia it could teach subjects such as physics or biology? But in that case it would just be regurgitating, not "understanding", what is in Wikipedia. We need an existing model to tie to the text-generating capability. So what if you take an AlphaGo Zero approach to model generation: instead of a human creating the model, it is developed through adversarial self-play? Theoretically, anything that can be formulated as a contest or game could then be "learned" and modeled, and combined with text generation to be taught. That seems pretty powerful, and not so far out of reach.
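As a toy illustration of just the self-play part (nothing like AlphaGo Zero's neural network plus tree search, only the bare idea that a playable model can be learned from games against itself), here's a tabular value function for tic-tac-toe trained purely from self-play; every name here is invented for the example:

```python
import random
from collections import defaultdict

WINS = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
        (0, 3, 6), (1, 4, 7), (2, 5, 8),
        (0, 4, 8), (2, 4, 6)]

def winner(board):
    # Returns "X", "O", "draw", or None if the game isn't over yet.
    for a, b, c in WINS:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return "draw" if " " not in board else None

# state -> estimated value for the player who just moved into that state
values = defaultdict(float)

def choose(board, player, epsilon=0.1):
    # Epsilon-greedy: usually pick the move whose resulting state looks best
    # for us, occasionally explore a random move.
    moves = [i for i, s in enumerate(board) if s == " "]
    if random.random() < epsilon:
        return random.choice(moves)
    return max(moves, key=lambda m: values[board[:m] + player + board[m + 1:]])

def self_play(episodes=20000, lr=0.2):
    for _ in range(episodes):
        board, player, seen = " " * 9, "X", []
        while winner(board) is None:
            m = choose(board, player)
            board = board[:m] + player + board[m + 1:]
            seen.append((board, player))
            player = "O" if player == "X" else "X"
        result = winner(board)
        # Nudge every visited state toward the final outcome for its mover.
        for state, mover in seen:
            target = 0.0 if result == "draw" else (1.0 if result == mover else -1.0)
            values[state] += lr * (target - values[state])

self_play()
print(f"learned value estimates for {len(values)} positions")
```

The point is that the "model" (here just a table of position values) comes entirely from the agent playing itself; in principle a text generator could then be pointed at that model rather than at a human-written description of the game.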
(Also, I love the idea of letting GPT reply to work emails for a day. Someone should set up some experiments to actually do that, and find out what happens. I bet that even though it would be a disaster, we would learn a ton.)
Yeah, I think that sort of stuff is exactly the future of computing. Just because GPT-{N+1} isn't an AGI doesn't mean that GPT-{N+1} won't cause an exciting step-change in what's possible with computing.
> Someone should set up some experiments to actually do that, and find out what happens. I bet that even though it would be a disaster, we would learn a ton.
Ha! Agreed! Unfortunately I do too much soft external-facing stuff these days to do this myself :(. Someone who interacts mostly internally with engineers/techy types who might appreciate the experiment should totally do this.
It depends on how data-hungry those models get. It might be the case that training GPT-10 would actually require more data than exists in the universe, in which case you have no chance of training it.