Choose-Your-Own-Adventure AI Dungeon Games (gwern.net)
93 points by eversowhatev on Jan 3, 2022 | 23 comments


In this article, Gwern proposes replacing AI Dungeon's free-text parser with a menu of CYOA options. I don't think it will work, at least not with the current generation GPT-3.

GPT-3 won't be able to generate a good CYOA until it can generate a good non-interactive novel. Today, GPT-3 can generate text, so you could try to use it to generate a whole novel. People have, but the novels it generates aren't worth reading.

"CYOAs" are interactive novels. I run a company, Choice of Games, that publishes hand-written interactive novels. Our novels use hidden state to provide an experience that's longer, richer, and deeper than traditional paper-based CYOA books. (The author of TFA links to one of our blog posts in the footer of this article. https://www.choiceofgames.com/2011/07/by-the-numbers-how-to-... )

Our approach is to pay professional authors to write these interactive novels. (And we have professional editors who read and edit their work.)

Could we use GPT-3 to avoid paying professional authors? No, of course not. Writing an interactive novel with interesting, dramatic choices is harder than writing a non-interactive novel, and GPT-3 can't even do that.

The feature that makes AI Dungeon interesting and unique is its ability to improvise in response to the player's actions. Presenting a pre-computed menu of options is what hand-written interactive novels already do; to succeed, it would have to compete with hand-written interactive novels on quality.

In other words, when GPT-3 computes an entire novel, it's indistinguishable to the reader from a (bad) hand-written novel. If AI Dungeon were to auto-generate an interactive novel, particularly in the way Gwern describes here, with the choices already pre-computed and crowdsourced in advance, the result would be indistinguishable from a (bad) hand-written interactive novel.

I think it's possible that if some future generation of GPT could generate a good-enough novel, then it could also generate a good-enough interactive novel, but this thing has gotta learn how to crawl before it can learn how to walk.


I think what Gwern is proposing is perhaps better seen by you as an optimization for the current AID rather than a "true" CYOA novel implementation. Because most of what you said applies to AID as well. I toyed with it a bit, but even short snippets of text are often incoherent and I have to give it a lot of grace to operate. In particular, while I don't necessarily mind the way it'll just introduce a new character, I don't like the way they disappear equally quickly and without fanfare.

While AID's nominal attraction is the ability to react to anything, I think based on the evidence only a small percentage of the users end up using it that way. (It's possible they're all the long term users, in which case they are important, but I still think it's a small proportion of the users.) The vast majority of the users the vast majority of the time will be selecting from a small handful of options that would constitute the vast majority of responses.

To the extent you'd find the resulting CYOA rather unappealing, I'd say the current AID is pretty much unappealing in exactly the same way, for exactly the same reasons, and given that AID hasn't exactly taken the world by storm I imagine this is the majority view. (Though it may have the sort of inner core rabid fanbase that you can still build on as a business.)


> Today, GPT-3 can generate text, so you could try to use it to generate a whole novel. People have, but the novels it generates aren't worth reading.

Novels are definitely out of reach for these language models. GPT-3 currently has a limit of 2048 tokens. This is short-story territory but definitely not "book" length. I agree in this regard.

> GPT-3 won't be able to generate a good CYOA until it can generate a good non-interactive [story].

I disagree here, however. One of the best things about AI-generated text is that it can generate a multitude of options for story progression. Sure, not all progressions will be good, but the most important thing in CYOA is the _choice_. As long as there are some good options, users can choose which progressions they like best, even if the other options suck.

For example, in this toy story [1], not all the options for the progression of the story are great. But if you pick and choose which progressions you like best, you can arrive at a pretty good ending, such as [2], though that is entirely subjective and depends on what each individual finds entertaining.

[1] https://toldby.ai/arK_3OpvpkG

[2] https://toldby.ai/aQAXlq3LNku/end

(the site in the links above is GPT-3 generated CYOA)


Amazing who you run into on this site. In High School (~2010 on) I played through your games as fast as you could make them. Thanks for the hours!


I'm an AI Dungeon subscriber, and what I like about it is that it reacts to whatever I can think of to say. If it were mostly predefined options, I'd probably not be interested in it: there are plenty of interactive fiction games already, and they're crafted by humans and generally better-written than what AID could ever come up with. Personally, I enjoy the looseness, how you have to kind of hold the AI's hand and steer the scene yourself. I doubt I'm representative of most people, but I think the number of people who would pay for a CYOA text adventure game is pretty low anyway. Latitude might as well stick with weirdos like me for a while, and pray that the overhead costs come down.


I didn't personally enjoy AI Dungeon too much because of how loose it felt - nothing was defined and anything goes - but I hear people describe it like you did, and it sounds similar to how I heard lucid dreaming described - you're simultaneously driving but also part of the experience.


Wow, I am saddened that I've only now found this post; I wish I had seen it sooner! This article perfectly describes my motivations for building [1]: exactly what is described here, although as a smaller toy example.

[1] https://toldby.ai

I launched this weekend project in August, but it looks like Gwern's post preceded it in June.

There's unfortunately a delicate balance between caching, ranking, and novelty. Caching absolutely helps keep costs low - no need to regenerate new text for every user. Ranking also helps bring the popular branches to the top. But with naive ranking, the site quickly becomes stale and the same branches always show up. My girlfriend complained that she's bored of the same branches that always appear on the home page. There needs to be some sort of novelty factor to keep the site fresh. But implementing this would increase costs!

Something like per-user ranking is needed to avoid showing the same branches to the same users over and over, but this is currently beyond my level of time commitment :)
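
To make the caching / ranking / novelty trade-off above concrete, here is a minimal sketch in Python. It is not how toldby.ai works; `BranchStore`, the `generate` callable, and the scoring weights are all hypothetical. It caches a small pool of branches per story node, ranks by popularity plus an exploration bonus that shrinks as a branch gets picked, and subtracts a penalty for branches a given user has already been shown.

```python
import math
from collections import defaultdict


class BranchStore:
    """Caches generated branches and ranks them per user (illustrative only)."""

    def __init__(self, generate, novelty_weight=1.0, seen_penalty=0.5):
        self.generate = generate              # hypothetical model call: (parent_id, k) -> text
        self.novelty_weight = novelty_weight  # assumed weight, would need tuning
        self.seen_penalty = seen_penalty      # assumed weight, would need tuning
        self.children = defaultdict(list)     # parent_id -> cached branch texts
        self.picks = defaultdict(int)         # (parent_id, index) -> times chosen
        self.seen = defaultdict(set)          # user -> {(parent_id, index)} already shown

    def options(self, parent_id, pool):
        """Return cached branches, calling the model only if the pool is short."""
        cached = self.children[parent_id]
        while len(cached) < pool:
            cached.append(self.generate(parent_id, len(cached)))
        return cached

    def score(self, user, parent_id, index):
        picks = self.picks[(parent_id, index)]
        siblings = range(len(self.children[parent_id]))
        total = sum(self.picks[(parent_id, i)] for i in siblings)
        popularity = picks / total if total else 0.0
        # Exploration bonus: largest for branches nobody has picked yet
        bonus = self.novelty_weight / math.sqrt(1 + picks)
        # Per-user novelty: demote branches this user has already been shown
        penalty = self.seen_penalty if (parent_id, index) in self.seen[user] else 0.0
        return popularity + bonus - penalty

    def ranked(self, user, parent_id, k=2, pool=4):
        """Show the top k of the cached pool and remember what was shown."""
        opts = self.options(parent_id, pool)
        order = sorted(range(len(opts)),
                       key=lambda i: self.score(user, parent_id, i),
                       reverse=True)
        top = order[:k]
        self.seen[user].update((parent_id, i) for i in top)
        return [(i, opts[i]) for i in top]

    def choose(self, parent_id, index):
        """Record that some user picked this branch."""
        self.picks[(parent_id, index)] += 1
```

The cost side shows up in `pool`: a larger pool gives the ranking more to rotate through but means more model calls per node, which is the tension the comment describes.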


You would have to retrain the model to incorporate a global context if you wanted to get a coherent adventure / DMing experience out of it. Once the text tokens (256-2048 depending on the model I believe) go out of scope, they are no longer used for future responses. Other than the work of actually designing and training such a model, there's no reason you couldn't build one around an encoding of a coherent world and then let the GPT part assemble the prose. If No Man's Sky can do procedural worlds, I see no reason why you couldn't do the same thing for text adventures.
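
As a rough illustration of tokens going out of scope, here is a small Python sketch. Note that it is a prompt-side workaround, not the retrained model proposed above: a few "world facts" are pinned at the top of every prompt, and only as many recent turns as fit the budget are kept. `build_prompt`, `count_tokens`, and the whitespace-based token count are assumptions made for the example; a real system would use the model's own tokenizer.

```python
def count_tokens(text):
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word
    return len(text.split())


def build_prompt(world_facts, history, budget=2048):
    """Pin world facts at the top, then keep the most recent turns that still fit."""
    header = "\n".join(world_facts)
    remaining = budget - count_tokens(header)
    if remaining < 0:
        raise ValueError("world state alone exceeds the context budget")

    kept = []
    # Walk backwards from the newest turn; older turns fall out of scope first
    for turn in reversed(history):
        cost = count_tokens(turn)
        if cost > remaining:
            break
        kept.append(turn)
        remaining -= cost
    kept.reverse()

    parts = ([header] if header else []) + kept
    return "\n".join(parts)
```

Anything not captured in `world_facts` is still forgotten once it scrolls out of the window, so this only moves the problem to deciding what counts as a fact worth pinning.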


I think people are in agreement with this conceptually, and the challenge is figuring out how to effectively implement it. I'd certainly love to see a hybrid like this with a coherent world model, but for example, how do you avoid the existing challenges with procedural worlds where you're basically limited to combinations of a small number of primitive features? If the free-form text can't affect more than toggling predetermined procedural state combinations, it's basically just an interface to a traditional world sim with more window dressing.


Maybe "an interface to a traditional world sim with more window dressing" is all you need?

Especially if the traditional world sim is fairly rich itself (not necessarily as rich as the richest ever made, but still decent), and the way the GPT part connects to it is good enough.


I would start with a rogue-like game and the hard part becomes generating the training data to describe in beautifully flowing prose what happens on a per move basis. Enter your patient friend who offers to do this and has run lots of campaigns previously.

To me, it seems like something that could be trained by starting with an existing GPT model and specializing it for this task, but I'm just brain farting here. And by all means steal this idea if you're so inclined.


It seems like the hardest part isn't world model -> prose (still hard tbc), but prose -> world model. There have been games for a long time that let you interact with a world model via simple verb + object statements (e.g. "get key", "search chest"), but the power of GPT-3-style AI dungeons isn't just that they handle longer, more natural text input, it's that there are no restrictions on the (short-term, incoherent) world model they use to generate responses. If you want to "throw key at the biggest goblin", you'll get a response that's coherent (in the short term) instead of an error saying that you can't do that with keys.

When you pair a better, prosier text input-response system with a procedural world model, if the procedural world model is built like existing ones you'll still have to deal with restrictions in the world's content and the valid interactions between constituents. At which point all that fancy AI makes for a more natural parser but doesn't add any richness to the game world, which is the thing we'd like to get at.

If anyone has ideas for how to solve the prose -> world model problem, that could be super fun and rewarding to work on, probably so much so that they wouldn't be sharing them here :(
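
For anyone who hasn't seen the restriction in action, here is a toy Python version of the verb + object style of parser over an explicit world model. The `WORLD` table, `parse`, and `act` are made up for illustration and don't correspond to any particular game. Anything the table doesn't list gets rejected, which is exactly the limitation a better parser alone wouldn't remove.

```python
# Hypothetical world model: each object lists the only verbs it supports
WORLD = {
    "key": {"location": "floor", "verbs": {"get", "drop"}},
    "chest": {"location": "room", "verbs": {"search", "open"}},
}


def parse(command):
    """Accept only two-word 'verb object' commands; return None otherwise."""
    words = command.lower().split()
    if len(words) != 2:
        return None
    return words[0], words[1]


def act(command, world=WORLD):
    parsed = parse(command)
    if parsed is None:
        return "I don't understand that."
    verb, obj = parsed
    if obj not in world:
        return f"There is no {obj} here."
    if verb not in world[obj]["verbs"]:
        # The world model has no rule for this interaction, so it is refused
        return f"You can't {verb} the {obj}."
    return f"You {verb} the {obj}."
```

Here `act("get key")` succeeds, `act("throw key")` is refused because "throw" isn't listed for the key, and `act("throw key at the biggest goblin")` doesn't even parse. Swapping `parse` for a language model would fix the last case but not the second.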


There's a really interesting cross-over here with literary structure. You need to teach a model to recognize tone, theme, and setting in addition to world "facts."

What makes a world like Westeros different than the Shire isn't just facts like "seasons are really long" but elements of tension, political unrest, and the outlook of the people who inhabit it. To do this effectively, you need to be building simulations of literary works. There's always going to be an art to this, but instead of the manuscript, it's going to be about the parts of the model.


I think in the context of natural language for game worlds, "prose" is targeting the much lower bar of something that sounds like a human might write it, vs like "you PLACED KEY in THE LEFT DOOR. Proceed to A STAIRCASE" from games of yore. Understanding the much richer space of literary prose, such as the difference between the Shire and Westeros, seems like a harder problem, but also one that might be well suited for existing NLP training pipelines, because you can label a large amount of text with only a few descriptors. It might be tough to come to a consensus amongst literary experts what those descriptors should be, but e.g. if LOTR book 1 is "pastoral" and ASOIAF book 1 is "gritty", you've now got a lot of text associated to each of those labels. I wonder if anyone is working on this?


SHRDLU let you manipulate a simple blocks world with English sentences back around 1970.

https://en.wikipedia.org/wiki/SHRDLU

Unless you feel you need flowery expressive input as well as output. But do we really need input more sophisticated than that of Zork here?


The last time I played AI Dungeon, I burned down a hotel, then checked in, went upstairs and connected my laptop to the wifi. When I summoned Cthulhu, a bottle of red wine arrived.


It is funny how dream-like deep learning produced content can get. Like how in https://thispersondoesnotexist.com/, you will sometimes find someone wearing a cap which looks like it has a logo but it's just a jumble of nonsense, like when you try to read text in a dream.


Many years ago I did some experimentation with the idea of slapping a very crude "parser" (no AI involved, just extremely basic string matching) on top of a Lone Wolf CYOA book which was made public via Project Aon [0]. I wrote about it here [1], and the result is still playable here [2].

[0] https://www.projectaon.org/en/Main/Books

[1] https://github.com/cjauvin/gamebook.js

[2] https://projectaon.org/staff/christian/gamebook.js/
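
To give an idea of what "extremely basic string matching" on top of a gamebook can look like, here is a small Python sketch. It is not taken from gamebook.js; `match_choice`, the stopword list, and the sample choices are invented for the example (the choice text is not from a Lone Wolf book). It picks the choice whose wording shares the most words with what the player typed.

```python
import re

# Small assumed stopword list; a real one would be longer
STOPWORDS = {"the", "a", "to", "i", "you", "if", "wish", "turn", "into"}


def words_of(text):
    """Lowercase the text and return its set of non-stopword words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def match_choice(user_input, choices):
    """choices maps a section number to its choice text.

    Returns the section whose text overlaps most with the input,
    or None when nothing overlaps at all.
    """
    typed = words_of(user_input)
    best, best_score = None, 0
    for section, text in choices.items():
        score = len(typed & words_of(text))
        if score > best_score:
            best, best_score = section, score
    return best


# Hypothetical choices for one section of a gamebook
choices = {
    12: "Enter the cave",
    47: "Follow the river downstream",
}
```

With these choices, `match_choice("I go into the cave", choices)` returns 12 and `match_choice("dance", choices)` returns None, at which point the game can fall back to listing the options.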


With a branching factor of 5 it won't take long before every person playing has made a unique sequence of choices, so I don't think caching would help you as much as you'd think. You could maybe cheaply serve the people who try a couple of decisions and then get bored.

It could make a good landing page though - play for 10 steps and then spring the "subscribe to continue your adventure".
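
The arithmetic behind the branching-factor point can be checked with a few lines of Python. This is a back-of-the-envelope sketch; the function names are just for illustration, and it assumes players spread evenly over the options, which real players probably don't.

```python
def paths_at_depth(branching, depth):
    """Number of distinct choice sequences of the given length."""
    return branching ** depth


def cached_nodes(branching, depth):
    """Passages to generate and store to fully cover the tree down to depth."""
    return sum(branching ** d for d in range(1, depth + 1))


def depth_until_unique(branching, players):
    """Smallest depth with at least as many distinct paths as players."""
    depth = 0
    while paths_at_depth(branching, depth) < players:
        depth += 1
    return depth


print(paths_at_depth(5, 10))             # 9765625 distinct 10-step paths
print(cached_nodes(5, 10))               # 12207030 passages to cover them all
print(depth_until_unique(5, 1_000_000))  # 9 steps for a million players
```

If choices are heavily skewed toward a few popular options, far fewer paths are actually visited and the cache covers much more of the traffic than these uniform numbers suggest.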


I'm wondering how everyone else felt browsing this website, and using links in particular?

For me, on mobile, they open (I assume) inside an iframe or such, which is counter to how standard links work. This made it really frustrating for me to use. The content seems interesting overall, but I gave up due to how links work. Ridiculous, I know.


The experience is quite good on desktop!


I'm reasonably confident I've seen a CYOA implemented like this get posted to HN, but I can't find the link offhand.


This is amazing!! I have to get back to work now, but this is very very cool!

Here's the first 5 minutes of my adventure:

Pick a setting... Apocalyptic

Select a character... Courier

You are Chris, a courier trying to survive in a post apocalyptic world by scavenging among the ruins of what is left. You have a parcel of letters and a small pistol. It's a long and dangerous road from Boston to Charleston, but you're one of the only people who knows the roads well enough to get your parcel of letters there. You set out in the morning and hope to have it in your hands by the time the sun goes down, or else you know you'll never get it there.

You walk down the road and realize that you must be close to the highway.

> You hold the parcel of letters in your hands and read them.

Dear Chris, It's time for you to get to Charleston. The old man was insistent on this, so you'll be making the trip.

> You search for a vehicle.

You walk onto the highway and look for a vehicle for yourself. After a time, you see one in the distance, a small hatchback.

> You check if the small hatchback is operational.

You walk over to the small hatchback and look inside. It looks like it has never been used, and the interior is very clean. You open the door and climb inside.

> You use the relative position of the rising sun to orient yourself and drive south.

You turn the key and start the small car. It sounds like a lawn mower and comes to life with a burst of smoke. It purrs to life and you back out of the crowded parking lot.

> You continue driving south.

You drive down the road, glancing at the fuel gauge as you go. It seems to indicate half a tank, which seems unlikely, but you cannot be sure.

> You continue driving south and look for fuel.

You continue driving south and look for fuel at every station. You come up empty at each one. It is a long and boring drive, and you find yourself staring at the ceiling of the car more and more often.



