This exact scenario is what I described to a friend of mine who is an AI researcher.
He was convinced that if we trained the AI on enough data, GPT-x would become sentient.
My opinion was similar to yours. I felt that the hallucinating the AI does falls short of true extrapolative thought.
I said this because humans don’t truly have access to infinite knowledge, and even when knowledge is available, they can’t process all of it. Adding endless information for the AI to feed on doesn’t seem like the path to true intelligence. It’s just more of the same hallucinating.
Yet despite lacking knowledge, we humans still come up with consistently original thoughts and expressions of our intelligence daily. With limited information, our minds create new representations of understanding. This seems to be impossible for ChatGPT.
I could be completely wrong, but that discussion solidified for me that my role as a dev still has at least a couple more decades of shelf life left.
It’s nice to hear that others are reaching similar conclusions.
Current LLMs decode greedily, token by token. In some cases this is good enough, namely for continuous tasks, but in other cases the model would need to backtrack and try another approach, or edit its response. That doesn't work with the way we are using LLMs now, but it could be fixed. Then you'd get a model that can do discontinuous tasks as well.
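For concreteness, here is a minimal sketch of what "greedy, token by token" means. The toy model and its transition table are invented stand-ins for a real LLM's next-token distribution:

```python
# Toy illustration of greedy (token-by-token) decoding. "toy_model"
# stands in for an LLM's next-token distribution; the transition
# table is invented for this sketch.
def toy_model(prefix):
    table = {
        (): {"the": 0.6, "a": 0.4},
        ("the",): {"cat": 0.7, "dog": 0.3},
        ("the", "cat"): {"sat": 0.9, "<eos>": 0.1},
        ("the", "cat", "sat"): {"<eos>": 1.0},
    }
    return table.get(tuple(prefix), {"<eos>": 1.0})

def greedy_decode(model, max_len=10):
    out = []
    for _ in range(max_len):
        scores = model(out)
        tok = max(scores, key=scores.get)  # commit to the best token now
        if tok == "<eos>":
            break
        out.append(tok)  # no backtracking: earlier choices are final
    return out

print(greedy_decode(toy_model))  # ['the', 'cat', 'sat']
```

The key property is in the loop: once a token is appended, it is never revisited, which is exactly why a task that requires revising an earlier choice trips the model up.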
>> Write a response that includes the number of words in your response.
> This response contains exactly sixteen words, including the number of words in the sentence itself.
It contains 15 words.
The model would have to plan the entire response before outputting the first token if it were to solve the task correctly. It works if you follow up with "Explicitly count the words", let it reply, and then say "Rewrite the answer".
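The off-by-one in the quoted response is easy to check mechanically; the sentence below is copied verbatim from the response above:

```python
# Count the words in the model's response by splitting on whitespace.
response = ("This response contains exactly sixteen words, including "
            "the number of words in the sentence itself.")
n = len(response.split())
print(n)  # 15, so the claim of "sixteen" is off by one
```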
How? The problem has been known for a while; for example, this article [0] mentions it (as Chain of Thought reasoning). You could think that just having a scratchpad of tokens is enough, since you can arguably plan, backtrack and rewrite there [1], right? But this doesn't really work, at least yet, maybe because the models weren't trained for that, and maybe ChatGPT's massive logs (probably available only to OpenAI) can help. But the Microsoft report [2] suggests we need a different architecture and/or algorithms: they mention the lack of planning and retrospective thinking as a huge problem for GPT-4. Maybe you know some articles on ideas for how to fix this? Backtracking and trying again seem to be linked to human thought, and could very well give us AGI.
You may be shocked to hear this, but Dijkstra’s shortest-path algorithm is the technical answer to this question. We just don’t use it because it’s expensive.
Language chains, or tool use where the model can also call on itself to solve subproblems. If you aren't limited to a single round of LLM interaction, you can do complex stuff.
Backtracking to edit the response is theoretically easily solved by training on a masked language modeling objective instead of an autoregressive one. Using it to actually generate text is a bit expensive, though, because you can't just generate one token at a time and be done: you might have to re-evaluate every output token each time another token changes. So I expect autoregressive generation to remain the default until the recomputation effort can be significantly reduced or hardware advances make the cost bearable.
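To illustrate the re-evaluation cost, here is a toy sketch of iterative masked refinement. The fill-in rule is a deliberately silly stand-in for a real masked-LM head; the point is only that every position may need to be revisited on every step:

```python
# Toy sketch of iterative masked refinement. "predict" stands in for
# a masked-LM head; here it just copies the nearest non-mask token.
MASK = "_"

def predict(tokens, i):
    # Invented rule: nearest non-mask token to the left, else right.
    for j in range(i - 1, -1, -1):
        if tokens[j] != MASK:
            return tokens[j]
    for j in range(i + 1, len(tokens)):
        if tokens[j] != MASK:
            return tokens[j]
    return "?"

def refine(tokens, steps=5):
    tokens = list(tokens)
    for _ in range(steps):
        # Every position may change, so the whole sequence is
        # re-evaluated on each step (the expense described above).
        new = [predict(tokens, i) if t == MASK else t
               for i, t in enumerate(tokens)]
        if new == tokens:  # fixed point: nothing changed, stop early
            break
        tokens = new
    return tokens

print(refine(["a", MASK, MASK, "b", MASK]))
```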
>> Backtracking to edit the response is theoretically easily solved by training on a masked language modeling objective instead of an autoregressive one, but using it to actually generate text is a bit expensive because you can't just generate one token at a time and be done, you might have to reevaluate each output token every time another token is changed.
I can't imagine how training on masked tokens can "easily" solve backtracking, even in theory. Do you have some literature I could read on this?
Discrete diffusion with rewriting can work well. It feels loosely similar to backtracking, if you assume n_steps is large enough, though you need to be able to rewrite any non-provided position, I think (not all setups do this). The downside is that the noise in discrete diffusion (in the simplest case, randomizing over the whole vocabulary space) is pretty harsh and makes things very difficult in practice. I don't have an exact reference on the relationship, but it feels similar to backtracking-type mechanics in my experience. I found things tend to "lock in" quickly once a good path is found, which feels a lot like pathfinding to me.
Some early personal experiments with adding "prefix-style" context by a cross-attention (in the vein of PerceiverAR) seemed like it really helped things along, which would kind of point to search-like behavior as well.
Probably the closest theory I can think of is orderless NADE, which builds on the "all orders" training of https://arxiv.org/abs/1310.1757 , which in my opinion closely relates to BERT and all kinds of other masked language work. There's a lot of other NAR language work I'm skipping here that may be more relevant...
On discrete diffusion:
Continuous diffusion for categorical data shows some promise "walking the boundary" between discrete and continuous diffusion https://arxiv.org/abs/2211.15089 , personally like this direction a lot.
My own contribution, SUNMASK, worked reasonably well for symbolic music/small datasets (https://openreview.net/forum?id=GIZlheqznkT), but really struggled with anything text or moderately large vocabulary, maybe due to training/compute/arch issues. Personally think large vocabulary discrete diffusion (thinking of the huge vocabs in modern universal LM work) will continue to be a challenge.
Decoding strategies:
As a general aside, I still don't understand how many of the large generative tools aren't exposing more decoding strategies, or hooks to implement them. Beam search with stochastic/diverse group objectives, per-step temperature/top-k/top-p, hooks for things like COLD decoding https://arxiv.org/abs/2202.11705, minimum Bayes risk https://medium.com/mlearning-ai/mbr-decoding-get-better-resu..., check/correct systems during decode based on simple domain rules and previous outputs, etc.
These kinds of decoding tools have always been a huge boost to model performance for me, and having access to add in these hooks to "big API models" would be really nice... though I guess you would need to limit/lock compute use since a full backtracking search would pretty swiftly crash most systems. Maybe the new "plugins" access from OpenAI will allow some of this.
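As one small example of the kind of decoding hook being described, here is a minimal nucleus (top-p) sampler with per-step temperature. The function signature is hypothetical; a real API would expose something like it as a callback or a parameter set:

```python
import math
import random

def top_p_sample(logits, p=0.9, temperature=1.0, rng=random):
    # logits: dict of token -> raw score. A minimal nucleus (top-p)
    # sampler; the hook interface itself is hypothetical.
    probs = {t: math.exp(s / temperature) for t, s in logits.items()}
    z = sum(probs.values())
    probs = {t: q / z for t, q in probs.items()}
    ranked = sorted(probs.items(), key=lambda kv: -kv[1])
    nucleus, total = [], 0.0
    for tok, q in ranked:
        nucleus.append((tok, q))
        total += q
        if total >= p:
            break  # smallest top-ranked set covering probability mass p
    # Sample within the truncated, renormalized nucleus.
    r = rng.uniform(0, total)
    acc = 0.0
    for tok, q in nucleus:
        acc += q
        if r <= acc:
            return tok
    return nucleus[-1][0]
```

With a very small p the nucleus collapses to the single most likely token, so the sampler degenerates to greedy decoding; larger p trades determinism for diversity.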
Backtracking is easily solved with a shortest path algorithm. I don’t see any need for masking if you are simply maximizing likelihood of the entire sequence.
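For what it's worth, the "shortest path" framing can be made concrete: score each candidate sequence by the sum of negative log-probabilities, so the globally most likely sequence is the cheapest path to an end-of-sequence token. Below is a sketch using uniform-cost search (the Dijkstra special case with no heuristic) over an invented toy transition table; the frontier grows with the vocabulary, which is exactly why this is expensive in practice:

```python
import heapq
import math

# Decoding viewed as shortest-path search: edge cost is -log P(token),
# so the cheapest path to <eos> is the most likely whole sequence.
# The toy transition table is invented for illustration.
def next_probs(prefix):
    table = {
        (): {"the": 0.6, "a": 0.4},
        ("the",): {"end": 0.1, "<eos>": 0.9},
        ("a",): {"<eos>": 1.0},
        ("the", "end"): {"<eos>": 1.0},
    }
    return table.get(tuple(prefix), {"<eos>": 1.0})

def best_sequence(max_len=5):
    # Uniform-cost search over the tree of token prefixes.
    heap = [(0.0, ())]
    while heap:
        cost, prefix = heapq.heappop(heap)
        if prefix and prefix[-1] == "<eos>":
            return list(prefix[:-1]), cost  # first finished = cheapest
        if len(prefix) >= max_len:
            continue
        for tok, prob in next_probs(prefix).items():
            heapq.heappush(heap, (cost - math.log(prob), prefix + (tok,)))
    return [], float("inf")

seq, cost = best_sequence()
print(seq)  # ['the'], since 0.6 * 0.9 beats 0.4 for 'a'
```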
> This exact scenario is what I described to a friend of mine who is an AI researcher.
> He was convinced that if we trained the AI on enough data, GPT-x would become sentient.
> My opinion was similar to yours. I felt like the hallucinating the AI does was insufficient in performing true extrapolating thought.
It turns out it isn’t just AIs that hallucinate; AI researchers do as well.
> He was convinced that if we trained the AI on enough data, GPT-x would become sentient.
Is there enough data?
As I understand it, the latest large language models are trained on almost every piece of available text. GPT-4 is multimodal in part because there isn't an easy way to increase its dataset with more text. In the meantime, text is already quite information dense.
I'm not sure that future models will be able to train on an order of magnitude more information, even if the size of their training sets has a few more zeroes added to the end.
I don't think that when people commonly discuss sentience they mean to include goldfish. I don't think the legal definition (which probably exists due to external legal implications) has any bearing on the intellectual debate of AI sentience.
If I were talking about sentience I would definitely be including goldfish. What about them is so different to us that we would have sentience while they would not?
> He was convinced that if we trained the AI on enough data, GPT-x would become sentient.
Not saying your friend is right or wrong, but imagine if civilization gives more information, in real time, to an AI system through sensors: would it be at least as sentient as the civilization? Seems like a sci-fi story, a competitor to G-d.
Isaac Asimov wrote a story along those lines, “The Last Question”, which he described as “by far my favorite story of all those I have written.” Full text here:
Some versions of divinity (both from real-world beliefs and sci-fi/fantasy) have it being essentially a gestalt of either all the souls that have ever died, or all those alive now—a kind of "oversoul" or collective consciousness.
While that's an interesting thought experiment, I don't think it can meaningfully apply to any kind of AI we have the capability to make today, even if we could hook it up directly to all our knowledge. Information alone can't make something sentient; it requires a sufficiently complex and sophisticated information processing system, one that can reason about its knowledge and itself.
I’m not at all an expert on the topic, but from what I gathered LLMs are fundamentally limited in the kind of problems they can approximate. They can approximate any integrable function quite well, but we can only come up with limits on a case-by-case basis for non-integrable ones, and I believe most interesting problems are of this latter kind.
Correct me if I’m wrong, but doesn’t that mean they can’t recursively “think”, on a fundamental level? And sure, I know you can ask GPT to “show your thinking”, but that’s not general recursion, just “hard-coded to N iterations”, basically, isn’t it? And thus, no matter how much hardware we throw at it, it won’t be able to surpass this fundamental limit (and, without proof, I firmly believe that for AGI we do need the ability to follow through a train of thought).
It fundamentally can’t recurse into a thought process. Let’s say I give you a symbol table where each symbol means something and ask you to “evaluate” this list of symbols. You can do that just fine, but even in theory GPT-10384 won’t be able to, without changing the whole underlying model itself.
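To make the symbol-table task concrete, here is a toy version in Python. The symbols and their meanings are invented; the point is that evaluation threads state through an unbounded number of steps:

```python
# A toy version of the "symbol table" task: each symbol maps to an
# operation, and evaluating the list requires carrying state step by
# step, i.e. the kind of unbounded iteration a fixed forward pass
# struggles to perform. Symbols and semantics are invented.
SYMBOLS = {
    "+": lambda x: x + 1,
    "-": lambda x: x - 1,
    "*": lambda x: x * 2,
}

def evaluate(program, start=0):
    value = start
    for sym in program:
        value = SYMBOLS[sym](value)  # state threads through every step
    return value

print(evaluate(["+", "+", "*", "-"]))  # ((0 + 1 + 1) * 2) - 1 = 3
```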
Could you try writing even in this simple language a longer program? Just simply increase the input to 20x or something around that. I’m interested in whether it will break and if it does, at what length.
Interesting, it screwed up at step 160. I think it probably ran out of context, if I explicitly told it to output each step in a more compact way it might do better. Or if I had access to the 32k context length it would probably get 4x further.
Actually it might be worth trying to get it to output the original instructions again every 100 steps, so that the instructions are always available in the context. The ChatGPT UI still wouldn't let you output that much at once but the API would.
If they aren't already, AIs will be posting content on social media apps. These apps measure the amount of attention you pay to each thing presented to you. If it's more than a picture or a video, but something interactive, then the AI could also learn how we interact with things in more complex ways. It also gets feedback from us through the comments section. As with biological mutations, AIs will learn which of their (at first random) novel creations we find utility in. They will then better learn what drives us, and will learn to create and extrapolate at a much faster pace than we do.
> If they aren't already, AIs will be posting content on social media apps.
No, people will be posting content on social media apps that they asked LLMs to write.
It may be done through a script, or API calls, but it's 100% at the instigation, direct or indirect, of a human.
LLMs have no ability to decide independently to post to social media, even if you do write code to give them the technical capability to make such posts.
With the new ChatGPT Plugins, it seems they may actually be able to make POST requests to social media APIs soon. It is likely that an LLM could have "I should post a tweet about this" in its training data.
Granted... currently it is likely humans that have written the code that the new Plugins are allowed to call -- but they have given ChatGPT the ability to execute rudimentary Python scripts and even ffmpeg, so I think it is only a matter of time before one outputs a Tweet written by its own code.
> It is likely that an LLM could have "I should post a tweet about this" in its training data.
That only matters if a human has explicitly hooked it up so that when ChatGPT encounters that set of tokens, it executes the "post to Twitter" scripts.
ChatGPT doesn't comprehend the text it's producing, so without humans making specific links between particular bundles of text and the relevant plugin scripts, it will never "decide" to use them.
At a high level, all that would have to happen is a person gives GPT, or something like it, access to a social media page and tells it to post to it with the objective of getting the highest level of interaction and followers.
...which in no way grants GPT sapience, nor would it prove that it has it.
The human is still providing the capability to post, the timing script to trigger posting, and the specific heuristic to be used in determining how to choose what to post.
More data will only mean more inference. But at some unexpected moment, the newly created "senseBERT" breaks the barrier between intelligence and consciousness.
> He was convinced that if we trained the AI on enough data, GPT-x would become sentient.
It sounds like he doesn't even understand the basics of what GPT is, or what sentience is. GPT is an impressive manipulator/predictor of language, but we have evidence from all sorts of directions that there's more to sentience or consciousness than that.
I would like to propose a thought experiment concerning the realm of knowledge acquisition. Given that the scope of human imagination is inherently limited, it is inevitable that certain information will remain beyond our grasp; these are the so-called "unknown unknowns." In the event that an individual generates a piece of knowledge from this inaccessible domain, how might it manifest in our perception? It is likely that such knowledge would appear incomprehensible to us. Consequently, it is worth considering the possibility that the GPT model is not, in fact, experiencing hallucinations; rather, our human understanding is simply insufficient to fully grasp its output.
Yeah. Maybe when a baby says "gabadigoibygee", he is using an extremely efficient language that is too sophisticated for our adult brains to comprehend.
> In the event that an individual generates a piece of knowledge from this inaccessible domain, how might it manifest in our perception? It is likely that such knowledge would appear incomprehensible to us.
If what a person says cannot be comprehended by any other person, we usually have a special term for it.
This is ridiculously “meta”, but I’ve said the same thing: at some point GPT-x will be useless to us because it will be beyond our comprehension, that is, if it’s actually “smart”.
My honest opinion is that the hallucinations are just gibberish, but are they useful gibberish? Maybe we’re saying the same thing?