MS TextWorld: open-source engine that both generates and simulates text games

Maimedpuppet · on March 19, 2019

They won a text adventure AI competition as well [1][2] and are now running their own with a $2000 reward at stake [3]. Getting an agent to solve unseen RPG games like Zork or modern games like Zelda and Witcher would be awesome and seriously impressive, but I don't think anybody is even close to that right now.

[1] https://arxiv.org/abs/1902.04259

[2] https://github.com/Microsoft/nail_agent

[3] http://aka.ms/textworld-challenge

chongli · on March 19, 2019

I would be genuinely surprised if anyone can build an agent that can solve Zelda for the NES without any foreknowledge of the game. So much of that game is bound up in the human cultural tropes associated with the monomyth. To finish the game you need to have at least a basic working knowledge of what it means to be a hero. There is no obvious, short-term indicator of progress the way there is in Mario (increase the x position to win).

The process of obtaining just a single piece of the Triforce (the primary indicator of progress) is so complicated that you'll essentially never see it happen with random button-pressing. So any agent attempting to play and win Zelda is going to need to develop its own model of the game world and try to understand things that way. I don't see how that is supposed to happen though, without any context.

wholemoley · on March 19, 2019

As I mentioned in the other post, the curiosity bots can help with this.

They're rewarded by exploring, not by values in game.

Maybe alone they're not enough, but in conjunction with other things, I bet we could beat Zelda. I had the bot exploring enough to find the first dungeon of LoZ.

chongli · on March 20, 2019

I just took a look at the curiosity video. It's funny, I think with enough refinement something like this could beat Zelda. Except it wouldn't actually know it beat the game! I feel like that is cheating, somehow.

Like maybe you could get American Fuzzy Lop [1] to beat Zelda. Isn't that the same thing, in principle?

[1] http://lcamtuf.coredump.cx/afl/

lzybkr · on March 20, 2019

AFL might eventually get lucky, but I'm guessing you'd want a hybrid approach that combines symbolic execution with a fuzzer like AFL, e.g. QSYM [1].

[1] https://www.usenix.org/system/files/conference/usenixsecurit...

taftster · on March 20, 2019

Zelda was simply a means for Nintendo to sell Nintendo Magazine subscriptions. They had to have made a killing on subscriptions, and they were absolutely required for solving games like Zelda. For example, you have to place a bomb in this one obscure location without any in-game clues as to said location. Or burn a tree.

Here's the connection to your post. I think a human and a computer have about the same chance of solving Zelda. At least a computer could be programmed to drop bombs in every rock wall in the game. For a human, that would be too brain-numbing hard.

chongli · on March 20, 2019

> At least a computer could be programmed to drop bombs in every rock wall in the game. For a human, that would be too brain-numbing hard.

Programming the computer to drop bombs everywhere is giving it way too much foreknowledge. I'm envisioning a computer agent not even knowing what a bomb is. Having to figure out how to beat Zelda without any context for the shapes and colours on the screen, let alone what bush to burn to find level 8 or where to play the recorder to open level 7.

This would be a test of general intelligence so I have no problem giving the agent access to the Internet. If it can figure out how to find an FAQ for Zelda and make sense of that information well enough to win the game then more power to it! That's much harder to achieve than building a bot with all of the rules of Zelda pre-programmed in.

feanaro · on March 20, 2019

You definitely do not need external hints to solve Zelda games. I've finished many Zelda titles without ever getting any hints. I've even finished A Link to the Past in German by using a dictionary to translate dialogue. Before that, I had not known a word of German.

korla · on March 20, 2019

He is talking specifically about Legend of Zelda, or Zelda for short, the first game in the series. https://en.wikipedia.org/wiki/The_Legend_of_Zelda_(video_gam...

feanaro · on March 20, 2019

Fair enough. The tree burning should have given me the hint.

That said, I did finish the original LOZ without hints too.

jtolmar · on March 20, 2019

Riffing on this, what about Zelda-like games /with/ foreknowledge? Let's say the game has items that serve as unique keys (like boss keys or the hammer), items that serve as interchangeable keys (like regular keys or triforce pieces), and maybe specific items that modify movement directly (ladder, hookshot, pegasus boots). Additionally it has switches that toggle certain walls.

This is a stateful pathfinding problem, you can solve it with Dijkstra's. But how fast can you find the shortest route through 100 Zelda-likes?

Or take a page from maze-running bots and not provide the map up-front, but require an agent that actually explores to get information, and judge results on the sum of three times through the same game.
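To make the "stateful pathfinding" idea above concrete, here is a minimal sketch, assuming a tile map where lowercase letters are keys and the matching uppercase letters are locked doors. The tile legend and the `shortest_route` name are made up for illustration. The point is that Dijkstra's runs over (position, keys held) pairs rather than over positions alone:

```python
import heapq
from itertools import count


def shortest_route(grid, start, goal):
    """Dijkstra over (position, keys held) states on a small tile map.

    Hypothetical tile legend: '#' wall, '.' floor, lowercase letter = key,
    matching uppercase letter = door that needs that key.
    Returns the step count of a shortest route, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    start_state = (start, frozenset())
    best = {start_state: 0}
    tie = count()  # tie-breaker so the heap never has to compare states
    heap = [(0, next(tie), start_state)]
    while heap:
        dist, _, (pos, keys) = heapq.heappop(heap)
        if pos == goal:
            return dist
        if dist > best.get((pos, keys), float("inf")):
            continue  # stale heap entry, a shorter route was already found
        r, c = pos
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            tile = grid[nr][nc]
            if tile == "#":
                continue
            if tile.isupper() and tile.lower() not in keys:
                continue  # locked door and we do not hold its key
            new_keys = keys | {tile} if tile.islower() else keys
            state = ((nr, nc), new_keys)
            if dist + 1 < best.get(state, float("inf")):
                best[state] = dist + 1
                heapq.heappush(heap, (dist + 1, next(tie), state))
    return None


grid = [
    "..A.",
    ".###",
    "a...",
]
print(shortest_route(grid, (0, 0), (0, 3)))  # fetch key 'a', then pass door 'A'
```

Every step costs 1 here, so this behaves like a breadth-first search; the priority queue only starts to matter once moves have different costs. Switches that toggle walls could be folded into the state the same way the key set is.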

chongli · on March 20, 2019

I don't think the with-foreknowledge case is all that interesting. We already have an expert system agent [1] that can win at the game NetHack [2]. This is a far more complicated task than any of the Zeldas, given foreknowledge.

Without foreknowledge, on the other hand, Zelda becomes a whole mess of meaningless pixels and sound waves whereas NetHack retains a level of intelligibility due to its ASCII interface.

[1] https://www.reddit.com/r/nethack/comments/2tluxv/yaap_fullau...

[2] http://www.nethack.org/

wholemoley · on March 19, 2019

This person https://github.com/pathak22/noreward-rl and his curiosity AI https://github.com/openai/large-scale-curiosity can do some cool stuff.

I used it to train an AI to play Mario Kart ( https://www.youtube.com/watch?v=A8oSnh0M864 )

I also had it playing The Legend of Zelda (got as far as finding the first dungeon, but with more power, I'm certain it could explore the whole map.).

PaulHoule · on March 19, 2019

I dunno, I was thinking about making a bot that could beat Hyperdimension Neptunia...

coldacid · on March 19, 2019

Do it. Please.

cwyers · on March 19, 2019

> TextWorld requires Python 3 and only supports Linux and macOS systems at the moment.

Truly, this is a new Microsoft.

verst · on March 19, 2019

Probably works with Python3 in WSL (let's say Ubuntu). That's good enough for Windows support right? ;)

Disclaimer: I work at Microsoft on Python, linux and container things. This however is just my personal opinion and I'm not affiliated with this project.

pault · on March 19, 2019

I have to use Windows at work after ten years of development exclusively on Mac and Linux. Once I set up wsl at work I found that almost all of my gripes about developing on Windows were gone and a month ago I switched my home office to Windows. If you had told me two years ago that I'd be using Windows full time as a Dev environment I would have asked you what were you smoking and can I have some. I have to say though, the first time I started up my "pro" operating system and saw an ad for Candy crush in my start menu I almost bailed.

escapecharacter · on March 19, 2019

What program do you use for command line work? I’ve found Powershell and Cmder clunky compared to macOS terminal and any Ubuntu terminal.

daeken · on March 19, 2019

I run an OpenSSH server inside WSL and then connect with Putty. It works perfectly, it's fast as hell, and it's super simple; it's just like connecting to any other box, really.

falcor84 · on March 19, 2019

I personally am very satisfied with ConEmu, launching Ubuntu 18.04's bash under WSL. My only gripe is with how messy it is to set up arrow keys to work in vim, and the solution I found doesn't work over ssh.

EDIT: just to preempt suggestions that I should just use the home-row, I am, but it's really annoying for those of us not using qwerty (Colemak in my case).

bashinator · on March 20, 2019

I’m using gnome-terminal running in VcXsrv. The scrolling and font rendering is far better than any of the Windows terminal apps I’ve tried.

karate-fu · on March 25, 2019

WSLtty is the only viable one if you ask me with Vim running a tad slow, but acceptable. Alacritty might soon overtake it.

coldacid · on March 19, 2019

PSCore isn't bad so long as you don't load it down with Powershell modules. It only takes two or three beyond the built-ins to make it start crawling.

gcb0 · on March 26, 2019

lately the only way I can touch windows is with the shell from git-for-windows. it is surprisingly excellent.

tracker1 · on March 20, 2019

Hyper for bash, powershell ise for ps (really limited)

freeone3000 · on March 20, 2019

I work on this project.

We needed to pull in a few ways of running the backend game and communicating with it that weren't quite convenient on Windows. A number of integrated third-party components related to generating and running IF games are either linux-only or behave differently in windows and linux. In addition, the target audience is reinforcement learning researchers, who overwhelmingly use Python on Linux, so that focus was prioritized.

Ubuntu on WSL actually is a test platform, and guided a few technical decisions (in part, WSL issues #902 and #162 narrowed what we could do). Honestly, that comment is not too far from the mark. :)

verst · on March 19, 2019

For anyone using Docker I created an image for convenience. https://hub.docker.com/r/berndverst/mstextworld

Source at: https://github.com/berndverst/mstextworld/

4thaccount · on March 20, 2019

Yes and No. I think they understand the very real limitations of Windows and how developers and power users hate it.

Let's compare and contrast the two. Linux has a usable shell that has GCC, Bash, Awk, dozens of specific commands that can be piped together, Perl, Python, TCL...etc all built-in and easily composable. Windows has well...CMD which is horrible, Batch files, VBScript (deprecated), VBA, and Powershell (a mess). If you install Visual Studio Tools you can get a C++ compiler and .NET languages like C# and VB.NET. Linux might have its faults, but the design is coherent and you can get stuff done. With Windows your only options are to use a GUI or write at least a 1/2 page of code for something that is a oneliner in Linux. Powershell is a sometimes great (mostly meh) attempt to right this, but the everything-is-an-object design is just simply more complex than the Unix philosophy of strings, pipes, and interoperable pieces. It seems like the Windows approach is to recognize the failing, while keeping in mind that a lot of desktop software uses Windows. So give developers Linux via WSL and people can still run their other Windows-only software without Wine.

debaserab2 · on March 20, 2019

Or maybe this came out of an MS R&D arm that had free rein on design and it’s not some overall signal that MS has admitted defeat to Linux on the shell.

I have a hard time believing this couldn’t have just as easily been windows based with the exact same amount of effort. Windows has fantastic developer tooling that I think you’re downplaying quite a bit.

> Powershell is a sometimes great (mostly meh)

Powershell has a different philosophy for sure - but I have seen sys admins do some pretty powerful things with it, and quickly. What can a Linux shell do that power shell can’t (that isn’t specific to Linux platform needs)?

4thaccount · on March 20, 2019

Quite a bit in my opinion. Of course if both are Turing complete the answer is nothing, but efficiency in development time and speed should be a factor.

To give an example, suppose you have a directory with 20 .CSV files that are all 200 MB. This is extremely common for me and the files are much larger for many developers. Say you want to extract certain data matching a string and then sort the whole thing into a new file. In Linux this is really simple with a oneliner and it also runs really fast. Powershell has a oneliner capability, but it is quite frankly abysmally slow. The Linux solution takes about 5 seconds and the Powershell one takes 15 minutes. Now there is a way to speed up the Powershell code. Running it in parallel helps, but the true performance aid comes in calling some C# within the script and now it is definitely not a oneliner. That is the overall issue. The Unix approach is simple to execute and understand, and fast to run. The Windows approach is complex to understand and often slow. Powershell is great in some instances. The language is certainly better than Bash, but if I have to constantly use C# with it...you might as well just use C#.
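For readers who want to see the task itself spelled out, here is a minimal sketch in Python of "keep the lines matching a string from every CSV in a directory, then sort them into a new file". The function name and file layout are assumptions for illustration. It is not the commenter's script, and it says nothing about the timings quoted above. On Linux the same job is plausibly something like `grep -h PATTERN *.csv | sort > out.txt`.

```python
import glob
import os


def filter_and_sort(directory, needle, out_path):
    """Write every line containing `needle` from directory/*.csv to out_path, sorted.

    Returns the number of lines written.
    """
    matches = []
    for path in sorted(glob.glob(os.path.join(directory, "*.csv"))):
        if os.path.abspath(path) == os.path.abspath(out_path):
            continue  # never re-read our own output file
        with open(path, encoding="utf-8") as handle:
            for line in handle:  # streams the file one line at a time
                if needle in line:
                    # Normalise a missing final newline so sorted output stays one row per line.
                    matches.append(line if line.endswith("\n") else line + "\n")
    matches.sort()
    with open(out_path, "w", encoding="utf-8") as out:
        out.writelines(matches)
    return len(matches)
```

Only the matching lines are held in memory, so this copes with large inputs as long as the filtered result fits in RAM; a sort that spills to disk, as GNU `sort` can, would be needed beyond that.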

tracker1 · on March 20, 2019

for anything more complex, I often just defer to a node/js script... mostly out of more comfort, but also because it runs everywhere I want it to (linux, windows, mac), usually with minimal effort.

4thaccount · on March 20, 2019

There's obviously nothing inherently wrong with that and it's great that you can run it anywhere. However, isn't it a shame you have to resort to that? Shouldn't there be a way to easily specify to the computer what you want without having to go so far outside the OS? Obviously this isn't Star Trek where you can ask the computer to do something and it knows exactly what you want. At some level of complexity you will have to start writing real code and at that point I agree it doesn't matter a whole lot whether you're using Python, Bash, C#, NodeJS, or Haskell. However, if you're doing something really simple like loop over some files that match a string in a directory, parse the files, and make a report, that should be easy. If your first thought is "let me go open Visual Studio"...then your OS has failed in my opinion.

tracker1 · on March 20, 2019

Fair enough... tbh, if Git for Windows registered `.sh` and `.bash` file extensions to execute via that version of Bash, I'd be content with that... as it is, I have re/used a little bash even in windows at work (everyone has git for windows) and have certain things installed via script+chocolatey. Second pass/configuration, etc, calls a bash script in windows... The first part I also have a bash script for mac, and the second script is shared.

That said, JS/Node is often my goto... I also do a fair amount of `npm i -g UTIL` mainly out of convenience between platforms. There isn't a really universal scripting system I can completely rely on where at least one of the three aren't oddities.

I do use bash pretty much everywhere, and that's what I use ... it's just when I'm writing a script for reuse, I'm more inclined to use NodeJS.

4thaccount · on March 21, 2019

Gotcha. Thanks for the additional thoughts!

tavianator · on March 20, 2019

> Or maybe this came out of an MS R&D arm that had free rein on design and it’s not some overall signal that MS has admitted defeat to Linux on the shell.

Bingo (I'm on the TextWorld team).

userbinator · on March 20, 2019

I am a developer (Win32; actually started with Win16 and DOS before that) and consider myself a "power user" of Windows, yet never had the feeling you describe. True, the -nix world has a great CLI shell, and some things are much easier there, but on the other hand, the GUI experience is (or at least used to be...) incomparable. AFAIK tiny single-exe-file no-install portable GUI applications just don't exist in the -nix world. I agree that PowersHell is a mess --- I tend to stay away from it and use CMD instead.

In this particular case, it is very strange that they claim something which appears like it shouldn't be OS-dependent at all (Python is supposed to be portable, and there's no GUI stuff) would require a specific OS. A new Microsoft indeed, but unfortunately not one I'm a fan of...

4thaccount · on March 20, 2019

You can certainly create GUI apps on Linux, but I've never found the need or desire as the terminal works so well. Even with a GUI designer, it adds a bunch of code you wouldn't likely need if you had a decent environment. Of course if your app is for non technical folks, I sympathize and understand.

freeone3000 · on March 20, 2019

This is a python framework for reinforcement learning, and has very little to do with the shell or development tooling on linux. Linux was the development focus because the vast majority of the reinforcement learning community use python3 on linux.

tracker1 · on March 20, 2019

I mostly use the msys bash that comes with git for windows in Hyper (a couple minor configs, but generally works well enough) or the VS Code terminal (same base). I do have most of what you mention setup beyond the git tools, a lot of it is transpiled for windows, though sometimes the path escaping gets annoying and there's a couple other windows isms.

I tend to use mac at home, windows at work, and linux via docker, ssh and/or VM for pretty much all targeted/test work.

Not a big fan of PS, other than some of the commands available there that aren't generally outside PS.

hanniabu · on March 20, 2019

Just goes to show how shitty Windows is. Impossible to program anything on Windows without it taking 5x as long.

benj111 · on March 19, 2019

Indeed, I thought the MS was referring to something else, just based on the title.

Of course they may want to be targeting Linux and MacOS for nefarious purposes, or maybe it's a peace offering to make up for Skype?

wuschel · on March 20, 2019

What is the difference between TextWorld and a design system for interactive fiction games like Inform [1]?

It seems to be that the former has natural language processing/semantics integrated, while the latter relies on a rather old school approach to text-based games. Or am I mistaken?

[1] http://inform7.com/

marccote19 · on March 20, 2019

Hi, I'm with the TextWorld team.

TextWorld actually uses Inform7. The framework first builds a game object (e.g. objects, locations, quests) according to some settings. Then, it converts it into Inform7 code before being compiled into a playable game. See https://imgur.com/oOcy5kk
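As a rough illustration of the "game object, then Inform7 code" step described above, here is a toy sketch. It is not TextWorld's actual code or API: `ToyGame` and `to_inform7` are hypothetical names, and the real framework also handles quests and then hands the generated source to the Inform 7 compiler, which is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGame:
    """Hypothetical stand-in for a generated game description."""
    rooms: list = field(default_factory=list)
    exits: list = field(default_factory=list)    # (from_room, direction, to_room)
    objects: list = field(default_factory=list)  # (object_name, room)


def to_inform7(game):
    """Render the toy game as Inform 7-style declarative sentences."""
    lines = [f"The {room} is a room." for room in game.rooms]
    # "to_room is <direction> of from_room" places to_room relative to from_room.
    lines += [f"The {dst} is {direction} of the {src}."
              for src, direction, dst in game.exits]
    lines += [f"The {obj} is in the {room}." for obj, room in game.objects]
    return "\n".join(lines)


game = ToyGame(
    rooms=["Kitchen", "Pantry"],
    exits=[("Kitchen", "north", "Pantry")],
    objects=[("apple", "Kitchen")],
)
print(to_inform7(game))
```

Keeping the game as plain data until the last step is what makes random generation convenient: the generator only has to emit rooms, exits and objects, and a single renderer turns any of them into source text.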

malloryerik · on March 20, 2019

How could you run a TextWorld game in a web browser? Use a Glulx file interpreter like Quixe?

https://eblong.com/zarf/glulx/quixe/

yorwba · on March 20, 2019

The "try it" example on the TextWorld page uses Parchment: https://github.com/curiousdannii/parchment

taodav · on March 20, 2019

The main addition is the ability to randomly generate text-based games and a gym-like interface that the RL community is used to. TextWorld was originally (not sure if it is anymore) based on inform7!
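For anyone unfamiliar with what a "gym-like interface" means in practice, here is a minimal sketch of the reset/step pattern applied to a text game. `ToyTextEnv` and `run_episode` are hypothetical and only mimic the shape of such an interface. They are not TextWorld's classes, and the real return values may differ.

```python
class ToyTextEnv:
    """Hypothetical two-room text game with a gym-style reset/step interface."""

    def reset(self):
        """Start a new episode and return the first text observation."""
        self.room = "kitchen"
        self.done = False
        return "You are in the kitchen. A door leads north."

    def step(self, command):
        """Apply a text command; return (observation, reward, done, info)."""
        if self.done:
            raise RuntimeError("episode finished; call reset()")
        command = command.strip().lower()
        if self.room == "kitchen" and command == "go north":
            self.room = "pantry"
            self.done = True
            return "You are in the pantry. You found the snack!", 1.0, True, {}
        return "Nothing happens.", 0.0, False, {}


def run_episode(env, policy, max_steps=10):
    """Agent loop: observe, choose a command, collect reward until done."""
    observation = env.reset()
    total = 0.0
    for _ in range(max_steps):
        observation, reward, done, _info = env.step(policy(observation))
        total += reward
        if done:
            break
    return total


print(run_episode(ToyTextEnv(), lambda observation: "go north"))
```

The difference from a typical gym environment is that both the observation and the action are free-form strings, so the agent has to deal with language rather than a fixed action set.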

4thaccount · on March 20, 2019

I was wondering the same thing

macca321 · on March 20, 2019

There are interesting parallels between problem solving textworld agents and smart hypermedia client agents.

rainydaybook · on March 20, 2019

Why not show a simple map of the room you're in, at least? It's a fairly simple change, but it would make orienting yourself and remembering your current and last few locations much more intuitive.

taodav · on March 20, 2019

TextWorld environments have a render function that renders an SVG of the room and your inventory (https://github.com/Microsoft/TextWorld#extras)

marccote19 · on March 20, 2019

Also, those games are intended for Reinforcement Learning agents, not Humans.

zyngaro · on March 19, 2019

Amazing! Yesterday while taking a walk I thought exactly about that ! A text based game and a text game engine.

_jgvg · on March 19, 2019

Is this Microsoft's response to Google's Stadia?

tellarin · on March 20, 2019

Microsoft's approach would likely be xCloud. https://www.theverge.com/2019/3/13/18263405/microsoft-xcloud...

ninju · on March 20, 2019

Nope...this has to do with quickly creating "worlds" that can be used for generating training sets on AI systems that process text input