The Game of Dialog: Simulating Conversation in Games

By Matt Barton

Although games have certainly come a long way since the days of Spacewar! and William Crowther's Adventure, the great bulk of these advancements have been in the realm of graphics. Games definitely look a lot more sophisticated than they ever have before. However, one area that is still painfully lacking in games is artificial intelligence, particularly regarding dialog between players and computer-controlled characters. What I intend to do here is discuss a few approaches game developers have taken to address this issue--and why sometimes less is more.

[Image: Joseph Weizenbaum: The Father of ELIZA]

Perhaps one of the most sought-after goals in all of computing is the achievement of artificial intelligence (AI). Although the nature of AI and whether such a goal is possible have long been hotly contested by both computer scientists and psychologists (John Searle springs to mind), game developers have long joined the fray. The idea there is that games must continue to challenge players and adapt; they must essentially "learn" from experience and behave in ways that imitate actual human behavior. Today, the term "AI," at least as it applies to gaming, usually refers to tactics in fighting or strategy games. There's nothing as disruptive as seeing some enemies in a first-person shooter continuously running into a wall--or just standing on the corner as someone shoots their elbow fifty times until they die. Obviously, it takes a great deal of good programming to make enemies behave more believably, but it always seems to be in the realm of the possible. At least, I don't doubt that one day we'll have first-person shooters in which the computer opponents behave just as cunningly as a human. Indeed, it will probably be impossible to tell which characters are being controlled by the computer and which are actual human opponents (particularly in the case of multiplayer online games).

However, no matter how well a character might duck, jump, and devise strategies to beat even the most manipulative human, there is one easy way to tell if there is silicon or gray matter at the helm: say hello. To put it simply, even in 2006, a five-year-old child has a much better language system in place than even the most sophisticated computer.

Chatterbots and Natural Language Processing
Although there certainly have been cleverly coded chatterbots, or programs that attempt to mimic humans in a chat room (usually a text-only environment), no one with any sense would confuse these parlor tricks with actual intelligence. They usually work by ferreting out keywords and responding in what is hopefully a believable manner. The famous program ELIZA, for instance, used the conceit of a Rogerian therapist. Since Rogerian therapists are thought to possess a very strange and contrived manner of conversation anyway, ELIZA's odd questions and responses seemed believable enough. Other chatterbots have relied on various other techniques, such as simulating bad typing, spelling and the like. The classic test these bots were subjected to is the "Turing Test." The idea there is that a human wouldn't be able to tell the difference between a bot and another human. Needless to say, no bot has yet managed to pass that test in a truly satisfactory manner.
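To make the keyword trick concrete, here is a minimal sketch of how such a bot might pick its replies. The rules and canned responses are my own inventions for illustration, not Weizenbaum's actual ELIZA script, and a real chatterbot would have many more of them.

```python
import re

# Hypothetical keyword rules, invented for illustration; they are not
# taken from the original ELIZA script.
RULES = [
    (re.compile(r"\bmother\b|\bfather\b", re.I), "Tell me more about your family."),
    (re.compile(r"\bi feel (.+)", re.I), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.I), "How long have you been {0}?"),
]
FALLBACK = "Please go on."


def respond(line: str) -> str:
    """Return the canned reply for the first rule whose keyword matches."""
    text = line.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Echo back whatever followed the keyword, if the rule captured it.
            return template.format(*match.groups())
    # No keyword found: fall back to a vague prompt that fits any input.
    return FALLBACK
```

Note that the bot never models what the player means. Anything outside its keyword list gets the same vague fallback, which is exactly why the therapist conceit works so well as a disguise.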

Perhaps one area where game developers have worked the hardest in achieving believable dialog is interactive fiction (or, "adventure games," or "interactive drama," or what have you). With the typical text adventure, the goal might be to simulate "natural language" to the point where the player and narrator could function like a dungeon master and player of a tabletop role-playing game. Rather than rely on strict, highly artificial syntax and a database of stock responses, the narrator could "talk" with the player to figure out what he wanted to try, then act accordingly.

Obviously, such a parsing system would be extremely difficult to program. The reasons are apparent to anyone who has really thought much about language. I really think that Searle's Chinese Room Argument puts the issue best. Does knowing all the appropriate symbolic responses to a string of Chinese characters really mean that you understand Chinese? Of course not. Readers familiar with Noam Chomsky might also cite the nature of a generative grammar as an impossible obstacle. To put it simply, even seriously limited native speakers of a language can produce a near-infinite variety of acceptable statements and responses. Imagine how many different ways you could really say "take lantern": "Make that lantern mine." "Lantern; oh, lantern, joinest my inventory thou now must do." "That brass lantern that was on the table? Now it's in my hand." You get the idea. And that's not even taking into consideration other possible responses, such as "How'd that lantern get there?" or "That's a really ugly lantern" or "Damn. That's so cliche. A brass lantern?" An actual human could respond to each of these statements with little difficulty, but a computer trying to parse them would probably retort something like, "I don't see a 'cliche' here."
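For the curious, a rough sketch of the kind of two-word parser I have in mind shows why the paraphrases above fall flat. The vocabulary and messages are hypothetical; no particular game's parser is being reproduced here.

```python
# Hypothetical vocabulary for a two-word parser; the words are illustrative.
VERBS = {"take", "get", "drop", "examine"}
NOUNS = {"lantern", "sword", "table"}


def parse(command: str) -> str:
    """Accept only 'VERB NOUN'; otherwise complain about the first unknown word."""
    words = command.lower().strip(".!?").split()
    if len(words) == 2 and words[0] in VERBS and words[1] in NOUNS:
        return f"OK: {words[0]} {words[1]}"
    # Mimic the classic failure: report the first word outside the vocabulary.
    for word in words:
        cleaned = word.strip(".,;!?'\"")
        if cleaned and cleaned not in VERBS and cleaned not in NOUNS:
            return f"I don't see a '{cleaned}' here."
    return "I don't understand that."
```

Feed it "take lantern" and it is happy; feed it "Make that lantern mine." and it trips on the very first word. Widening the vocabulary only postpones the problem, since the space of acceptable sentences grows far faster than any word list.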

A recent return to this earlier form of dialog is the free game Facade, which simulates natural language processing in order to fool players into thinking its parser is much more sophisticated than it actually is.

There are a few ways a game developer can approach this problem. As I mentioned before, several text adventure game engines have worked ever more diligently towards providing better natural language processing (NLP). What you'll find as you read more about these systems is that the creation of a "natural language processor" has resulted mostly in the creation of an "artificial language" (i.e., jargon) that these folks use to befuddle themselves and the outside world into thinking they know what they're talking about. To make a long story short, the whole process of language acquisition is about as poorly understood as the inner workings of the human brain. Suffice it to say, NLP is a very long way away, and certainly not a practical option for a modern game developer interested in better dialog.

Menu-Based Approaches
Another approach that became common in graphical adventure games is to present dialog as a series of menu options. For instance, a player encountering a troll might be shown a menu like the following:
(a) Die, troll! (attack)
(b) Excuse me, sir, do you mind if I go past?
(c) Hello, my name is Fred. What's yours?

Once the player has made a selection, the game can respond accordingly. Usually in these situations there is really only one "correct" response, which either must be chosen for the game to progress or is actually the inevitable choice regardless. In the above exchange, for example, it might be pre-ordained that the player must fight the troll. Selecting options B or C might result merely in a grunt from the troll and an attack. This technique presents the illusion of choice without having to bother with more dynamic gameplay.
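The troll exchange can be sketched as a simple lookup table. The replies and scene names below are assumptions of mine; the point is only that every option funnels into the same next scene.

```python
# Each option maps to (player line, troll's reply, next scene).
# The replies and the "fight" scene name are hypothetical.
TROLL_MENU = {
    "a": ("Die, troll! (attack)", "The troll roars.", "fight"),
    "b": ("Excuse me, sir, do you mind if I go past?", "The troll grunts.", "fight"),
    "c": ("Hello, my name is Fred. What's yours?", "The troll grunts.", "fight"),
}


def choose(option: str) -> tuple[str, str]:
    """Return the troll's reply and the scene the game moves to next."""
    _, reply, next_scene = TROLL_MENU[option]
    return reply, next_scene


# The set of reachable scenes shows how much choice the player really has.
reachable = {choose(option)[1] for option in TROLL_MENU}
```

Here `reachable` contains a single scene, so the three options differ only in flavor text.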

[Image: Cave of Time: 40 possible endings...Yeah.]

However, other possibilities exist. The game could follow a "branching tree structure." Maybe choices B or C could lead independently to a series of other choices, and so on, until at last there were millions of possible outcomes. The problem with this structure is its exponential growth: every additional round of choices multiplies the number of branches the author must write. Of course, everyone will remember those great "Choose Your Own Adventure" books, which generally solved this problem by quickly ending the narrative if the player made the "wrong" decision. It's really a form of cheating to tell a player to "choose his own adventure" when, in reality, we all know that it's a highly linear narrative with a single correct trajectory through a series of pre-defined decision-making moments.
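A little arithmetic shows how quickly the branching tree gets out of hand, and how much the "wrong choice ends the story" trick saves. This is a back-of-the-envelope sketch that assumes every decision offers the same number of options; real books and games are messier.

```python
def full_tree(branching: int, depth: int) -> tuple[int, int]:
    """Return (endings, total passages) when every choice leads somewhere new."""
    endings = branching ** depth
    passages = sum(branching ** level for level in range(depth + 1))
    return endings, passages


def pruned_tree(branching: int, depth: int) -> tuple[int, int]:
    """Same counts when only one option per decision continues the story
    and every other option ends it immediately."""
    endings = depth * (branching - 1) + 1
    passages = depth + endings  # decision passages plus ending passages
    return endings, passages
```

With three options per decision, ten rounds of a full tree already give `full_tree(3, 10) == (59049, 88573)`, and thirteen rounds pass 1.5 million endings. The pruned version of the same ten rounds needs only `pruned_tree(3, 10) == (21, 31)`.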

The number of games that take the menu-based approach is large indeed, and it seems to be the most common approach even today. Perhaps the most well-known adventure games of this type are LucasArts', particularly Ron Gilbert's Secret of Monkey Island series. These developers always seemed aware of the limitations of the menu-based system and made light of it, often to quite humorous effect. Indeed, anyone who has played the first Secret of Monkey Island game will remember the sword-fighting scenes, where the player must exchange insults with his opponents in order to progress. However, the player can only succeed if Guybrush Threepwood (the avatar) offers the response that correctly corresponds to the insult hurled by the opponent. Unfortunately, Guybrush can only learn what the right responses are by a combination of trial and error and experience in many such battles, so that at last he has the whole repertoire at his disposal. To my mind, this is the most clever use ever made of a menu-based dialog system. At the very least, it was the most fun. Revolution attempted, well, a revolution, by creating a very complex dialog system in its game Lure of the Temptress, where players could string together very long commands and responses to characters. Unfortunately, this resulted more in confusion and frustration than enjoyment.
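The insult duel boils down to a menu that grows as the avatar learns. Here is a minimal sketch under my own assumptions; the insult and retort strings are placeholders rather than lines from the game, and the class and method names are hypothetical.

```python
# Placeholder insult/retort pairs; these are stand-ins, not lines from the game.
CORRECT_RETORT = {
    "insult A": "retort A",
    "insult B": "retort B",
    "insult C": "retort C",
}


class Duelist:
    def __init__(self) -> None:
        self.known_retorts: set[str] = set()

    def overhear(self, retort: str) -> None:
        """Learn a retort after hearing an opponent use it."""
        self.known_retorts.add(retort)

    def menu(self) -> list[str]:
        """Only retorts learned so far appear as dialog options."""
        return sorted(self.known_retorts)

    def answer(self, insult: str, retort: str) -> bool:
        """Win the exchange only with a known retort that matches the insult."""
        return retort in self.known_retorts and CORRECT_RETORT.get(insult) == retort
```

The menu is still a fixed list under the hood, but because its contents depend on what the player has already been through, choosing from it feels like skill rather than guesswork.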

Of course, one of the obvious problems with a menu-based system is that a player might very well desire to order a dish that's not on the menu. This is often the case in rigidly linear games that try to force a player along a narrow route. There is also a problem in that the resulting conversations (particularly when they're spoken out by actors) feel tremendously artificial when things are said out of order. For instance, a player might hear a character say, "Ah, good point" or "Well, now that you mention it" at a moment that makes no sense. Furthermore, we don't use the same tone of voice and inflections at the beginning as we do the end of a conversation. Game developers have tried to get around these limitations by having the characters break off with an automated "goodbye" of some sort, but these canned responses often seem repetitive and unconvincing. The game that springs to my mind here is Sierra's Gabriel Knight 2.

[Image: Sam & Max: A Dialog System Based on Icons]

Some games try to shore up some of these weaknesses by offering icon-based menu options rather than strings of dialogue. This is true of many later LucasArts games, such as Sam & Max Hit the Road, but also many modern games such as Revolution's Broken Sword 3: The Sleeping Dragon or Unknown Identity's The Black Mirror. The idea here seems to be that presenting the player with icons allows for a bit more room for abstraction and generalization--and also a bit of suspense, since the player won't know exactly what the avatar will say. For instance, in The Black Mirror, players are frequently presented with two choices for responding to dialog: a smiley mask ("positive") and a scowling mask ("negative"). This technique offers the player significant control without totally relinquishing the element of surprise. I might add that it also greatly simplifies translation, since there will obviously be much less text involved at the interface level.

One of the most interesting systems of abstract dialogue can be found in SSI's Gold Box Games. There, players could choose among "haughty," "meek," "sly," "nice," and "abusive" options in dialog. This five-tier system, while perhaps somewhat overly complex and a bit redundant, nevertheless makes for some very interesting possibilities when dealing with computer-controlled characters. However, again we have the problem of the branching narratives. The solution is either to link most of the choices to the same fixed result (i.e., four will cause the creature to attack, one will cause him to flee) or limit the dialog to one or two turns. Most games seem to opt for both, with the result that only a highly linear and narrow set of "correct" decisions will result in the desired outcome.

The norm now seems to be tending towards either a purely icon-based menu system or a hybrid of icons and text. Funcom took an interesting approach with Dreamfall, in which the avatars seem to be thinking aloud about the dialog options (e.g., "I should try to show some interest in his job."). Cinemaware attempted a similar approach to dialog in its 1987 game King of Chicago. In this game, the characters would talk using pop-up bubbles like those seen in comic books. The player made choices about the dialogue by selecting one of two "thought clouds" that would appear above the avatar's head, again in the fashion of comic books. More interesting still, the player's mouse arrow was replaced by a fly during these segments, and sometimes Pinky, the avatar, would make remarks indicating that he was being pestered by the fly. Unlike most other games, the player could elect to do nothing, and after brief pauses Pinky would make a random choice and the game would move forward, thus preserving the game's frantic pace. Both Cinemaware's and Funcom's approaches really seem to help the player "get into the head" of the avatar and seem to represent some interesting possibilities.
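The King of Chicago trick of letting the avatar decide when the player dawdles is easy to sketch. I am leaving the timer itself out and assuming the caller passes `None` when time runs out; the function and option names are hypothetical.

```python
import random


def resolve_choice(options, player_pick=None, rng=random):
    """Return the player's pick, or a random option if none was made in time."""
    if player_pick is not None:
        if player_pick not in options:
            raise ValueError(f"unknown option: {player_pick!r}")
        return player_pick
    # No input before the (hypothetical) timer ran out: the avatar decides.
    return rng.choice(options)
```

Calling `resolve_choice(["threaten", "flatter"], "flatter")` honors the player, while `resolve_choice(["threaten", "flatter"])` picks one at random. Either way the scene keeps moving, which is what keeps the pace up.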

Future Possibilities
Ostensibly, we might think that what's really called for is a more dynamic system in which the computer could generate believable responses to any combination--a sort of "on-the-fly" approach to dialog. However, even with a highly limited icon-based menu system, it quickly becomes apparent just how unfeasible this is--particularly if the game is to have any sort of narrative guiding the action. Should we allow a hopelessly abusive player to nevertheless win the game? Or do we want to punish that player by refusing to grant him victory? Several CRPGs have tried to allow as much flexibility as possible, even to the point of allowing a player to go on a "killing spree," killing enemies and friendly characters alike--and still "complete" the game. Increasingly, games are offering many variations on their end sequences, so that "good" players will see a different set of screens and narrated resolutions than "bad" ones. However, such a degree of flexibility seems to rob the game of any authorial intent, which, at least in most adventure games, is one of the reasons why the game is enjoyable in the first place.

[Image: The Dig: Moments like this would be impossible in a truly open-ended game.]

Games like LucasArts' 1995 classic The Dig, for instance, rely on a fairly linear series of events. If the player could just kill off all the non-player characters at the beginning and still manage to win the game, the integrity of the game would fall apart. All of the wonderful interactions and character development that make that game special would be irrelevant.

What I suppose this all boils down to is that for dialog to be any fun, it has to be interesting, non-trivial (i.e., it has to serve some real purpose in the game), and dramatic (here, I'm treading on Brenda Laurel's work). There's hardly anything dramatic about a game that lets you do whatever you want, with no thought of anything being "inevitable" or "determined by fate," i.e., those things we associate with good drama. Of course, the problem is that for a game to be any fun, the player has to make choices. My contention is that those choices should be purposeful, but constrained by the author's overall intention. To use a somewhat bad example, consider Unknown Identity's game The Black Mirror, which ends with the player's avatar committing suicide. Undoubtedly, many players would like to see a different ending, and some may even resent the fact that they are forced to watch their avatar leap off the edge of a tower. However, allowing some other choice would ruin the effect the developers intended. I might liken this to the death of Floyd in Planetfall. Surely, most players would have opted to save Floyd had they been given the chance. However, the reason the moment is remembered so poignantly is that it was fated to be; i.e., it was not a choice to be made but a situation to be felt.

To put it simply, players had better be careful what they wish for when they wish for AI. A truly dynamic dialog system would be about as much fun as reading a "novel" that was just a stack of blank papers upon which the reader was supposed to "choose his own adventure." Can you see the Emperor's new clothes?

In reality, a good dialog system would have to allow for a certain leeway while maintaining the author's control over the narrative as a whole (i.e., the meta-narrative). Certainly, there ought to be nearly unlimited ways for players to complete the games; hundreds and hundreds of small choices would nevertheless lead up to a satisfying conclusion (punctuated by set developments).

A precedent for what I'm talking about can be seen in Greek theater. While all the Greeks knew perfectly well what would happen to Oedipus in Sophocles' famous play, they had no idea how the playwright would arrange events--or the exact dialogue he would have coming out of the characters' mouths. To my mind, a great dialog system would work the same way. While players could experience the game in greatly different ways depending on the way they wanted to play the characters, the overall outcome would always be the same. The fun would be in discovering all the varied possibilities of getting there.
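One way to picture the structure I'm describing is as a scene graph in which routes fan out and then converge on a single fated ending. The scenes below are made up for illustration, and a real game would hang dialog choices on each edge.

```python
# Hypothetical scene graph: many routes, one fated ending.
SCENES = {
    "start": ["bribe_guard", "sneak_past", "fight_guard"],
    "bribe_guard": ["throne_room"],
    "sneak_past": ["dungeon", "throne_room"],
    "fight_guard": ["dungeon"],
    "dungeon": ["throne_room"],
    "throne_room": ["finale"],
    "finale": [],
}


def all_paths(scene="start", path=()):
    """Yield every route through the graph as a tuple of scene names."""
    path = path + (scene,)
    if not SCENES[scene]:
        # A scene with no exits is an ending.
        yield path
        return
    for next_scene in SCENES[scene]:
        yield from all_paths(next_scene, path)


routes = list(all_paths())
endings = {route[-1] for route in routes}
```

This toy graph has four distinct routes but only one ending. Because shared scenes are written once and reused, adding another route costs a few scenes rather than a whole new subtree, so the author keeps control of the outcome without facing the exponential blowup of a pure branching tree.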

Comments

Mark Vergeer
Faking comprehension by approximating verbal behaviour

If you were to create a program that really understands human natural language, it would still take a huge amount of processing power, the likes of Data's positronic matrix (Star Trek pun). But there might be small shortcuts when you want to achieve something similar. One can fake comprehension by approximating appropriate behaviour with a little syntax and subject checking, very much like ELIZA does. That might take a little less computing power. Human behaviour and human conversation can follow pretty predictable paths that can be taken advantage of in terms of pattern recognition.

Look how far those computer dictation programs have come that actually type in the stuff you speak into the computer's microphone. Those have come a long way, with the help of clever pattern recognition and predicting the ways people speak. Of course true understanding is a long way off, but add 'a little ELIZA' to such a program and the sky is the limit ;)
A very interesting subject.

-= Mark Vergeer - Armchair Arcade editor =-
