FAQ
Many of these issues are dealt with in the tutorial - this is a good place to start, as it introduces the main Malmo concepts and includes many hints and tips. For those too impatient to follow the tutorial, or who would like an explanation with a little more depth, read on.
- What is a useful first step when using Malmo?
- Why do I get "Failed to find an available client for mission" when I try to run a tutorial?
- Why can't I control Minecraft? It's broken!
- Why do stale mission elements persist into later missions?
What is a useful first step when using Malmo?
Before getting to grips with Malmo, and certainly before reporting any issues, it is worth switching on the Malmo mod diagnostic information. It is highly recommended that you do this - in fact, I'm not sure why it's not enabled by default. To switch it on, do the following:
- In the Minecraft GUI, go to the Mod options and find Malmo in the list:
- Press config, and cycle through the options for debug diagnostic level until "Show all diagnostics" is displayed:
- You should now be able to see helpful diagnostic information in the Minecraft window, including:
  - the Mission Control Port (10000 by default)
  - whether Minecraft mouse control is set to "AI" or "human"
  - the current state of the client and server state machines
  - certain warning/error messages, which appear when relevant
This information will almost certainly prove useful at some point.
Why do I get "Failed to find an available client for mission" when I try to run a tutorial?
There are two parts to Malmo - the mod, which runs inside Minecraft, and the platform, which runs with your agent code. Both of these need to be running. The platform code runs when you launch your agent, e.g. by typing python tutorial_1.py. When this runs, it will try to establish a connection with the Minecraft mod. If that connection can't be made, for whatever reason, you will see something like this message:
Failed to find an available client for this mission - tried all the clients in the supplied client pool
The first thing to check is that the mod is running. Inside the Malmo Minecraft folder is a script called launchClient - run this (either .sh or .bat, depending on your system).
If you have switched on the diagnostics as described at the start of this FAQ, you should see a Minecraft instance running, with the words "CLIENT: DORMANT" at the top-left of the screen.
At this point, you should be able to run your agent code.
If this still fails, the next thing to check is the Mission Control Port. At the bottom right of the Minecraft screen, you should see the words "Mouse:AI" and "MCP: 10000". MCP is the Mission Control Port - the port which the Mod is listening on for all instructions about running missions. By default, Malmo uses 10000. If 10000 is already in use, the Mod will check 10001, then 10002... and so on, until it finds a free port. (This behaviour allows you to launch, say, four Minecraft clients, and communicate easily with each one, on the port range 10000-10003.)
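As a rough sketch of the multi-client case, assuming four clients were launched on one machine and each one found the next port free, the endpoints can be generated rather than typed out. client_endpoints is a hypothetical helper, not part of Malmo; the ClientPool and ClientInfo calls in the comments are the ones used elsewhere in this FAQ, and are left commented out so the snippet runs without Malmo installed.

```python
DEFAULT_MCP = 10000  # Malmo's default Mission Control Port


def client_endpoints(num_clients, host="127.0.0.1", first_port=DEFAULT_MCP):
    """Return (host, port) pairs for clients on consecutive ports.

    Hypothetical helper: it assumes every port in the range was free when
    the clients started, so that each client took the next port in turn.
    """
    return [(host, first_port + i) for i in range(num_clients)]


endpoints = client_endpoints(4)

# With Malmo installed, the pairs could be fed into a client pool:
#   pool = MalmoPython.ClientPool()
#   for ip, port in endpoints:
#       pool.add(MalmoPython.ClientInfo(ip, port))
```

If another program had already taken one of these ports, the real ports will differ, so check the "MCP" value shown in each Minecraft window before relying on the generated list.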
On the platform side (in your agent code) it is possible to specify the MCP, but if no port is specified, the default of 10000 will be used. Many of the Python samples - including the tutorial files - use this default. All of this means that if some other program on your machine is sitting on port 10000, the communication between the platform and the mod will be broken.
To fix this, either free up port 10000 (you may need to rerun Minecraft, or reset the MCP in the Mod configs GUI before this takes effect), or add something like the following to your agent code:
import MalmoPython  # normally already imported at the top of your agent code
my_client_pool = MalmoPython.ClientPool()
my_client_pool.add(MalmoPython.ClientInfo("127.0.0.1", 10001))  # 10000 in use - try 10001
...
...
agent_host.startMission( my_mission, my_client_pool, my_mission_record, 0, experimentID )
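Before changing any agent code, it can help to check which of the candidate ports actually has something listening. The following is a minimal sketch using only the Python standard library; is_port_listening is a hypothetical helper, and a successful connection only shows that some program accepts TCP connections on that port, not that the program is the Malmo mod.

```python
import socket


def is_port_listening(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Probe the range mentioned above (10000-10003) on the local machine.
    for port in range(10000, 10004):
        state = "listening" if is_port_listening("127.0.0.1", port) else "nothing listening"
        print(port, state)
```

Compare the output with the "MCP" value displayed in the Minecraft window: if 10000 is listening but Minecraft reports a different MCP, another program is probably holding 10000.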
Why can't I control Minecraft? It's broken!
Because Malmo exists to allow AIs to control Minecraft, not humans. To this end, it deliberately disables human mouse control, to make sure there is no accidental interference. Vanilla Minecraft captures the mouse, and any mouse movements change the yaw/pitch of the player. If this were left enabled in Malmo, it would be almost impossible to avoid disrupting experiments (unless the Minecraft clients were run on dedicated machines that no one had direct access to). So, by default, mouse control is disabled.
However, it can be extremely useful to take control of Minecraft temporarily, during debugging, testing, mission design etc, so we introduced a toggle to switch between AI and human control of the mouse. Simply click on the Minecraft window during gameplay and press the Enter key. (If you have diagnostic output switched on you will be able to see who is currently in control in the bottom-right corner of the screen.)
Why do stale mission elements persist into later missions?
Traditional reinforcement learning techniques work by running the same small task repeatedly, possibly thousands or millions of times. Ideally, the turnaround time between these "episodes" should be as small as possible, and the initial state of each episode should be the same. In Minecraft, the cost of creating a fresh new world - a clean initial state - is very high, taking many seconds even on a fast machine. To make reinforcement learning practical, Malmo needs to compromise between cleanliness and turnaround time.
By default, Malmo aims for fast turnaround by reusing the current world (provided the current world matches the world requirements). This makes turnaround significantly quicker, but means that any changes made in the course of the episode (or "mission", in Malmo parlance) will persist into subsequent missions.
This is most commonly seen in the items present in the mission. For example, if a diamond is drawn at a specific location, and the agent fails to collect it during the course of the mission, that diamond will persist into the next mission, where it will be joined by a new diamond, and so on. After 100 missions, there might be 100 diamonds all occupying the same space. (Note that Minecraft is very inefficient at drawing free-floating items, so filling your environment with multiple diamonds will eventually cripple performance.) Aside from the items, any changes the agent makes to the environment will persist too - if they dig a hole in one mission, they might fall into it in the next...
Assuming this behaviour is undesirable (there may be cases where it's not!), there are two main ways of dealing with it. The first is to force Malmo to provide a clean initial state for each mission. This is done via the forceReset flag in the world generator XML, e.g.:
<ServerSection>
<ServerHandlers>
<FlatWorldGenerator generatorString="3;168:1;8;" forceReset="true"/>
...
...
As remarked above, this will create an enormous amount of work for Minecraft, and will make restarting the mission very slow indeed.
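If the mission XML is built in Python, one way to soften this cost is to request a clean world only occasionally. The sketch below is an illustration, not something the Malmo samples are known to do: flat_world_generator and should_force_reset are hypothetical helpers, and the reset interval of 50 is an arbitrary assumption. When no reset is wanted, the attribute is simply omitted so that Malmo's default behaviour applies.

```python
from xml.sax.saxutils import quoteattr


def flat_world_generator(generator_string, force_reset):
    """Build a FlatWorldGenerator element, adding forceReset only when requested."""
    attrs = "generatorString={}".format(quoteattr(generator_string))
    if force_reset:
        attrs += ' forceReset="true"'
    return "<FlatWorldGenerator {}/>".format(attrs)


def should_force_reset(mission_index, reset_every=50):
    """Hypothetical policy: pay for a clean world once every `reset_every` missions."""
    return mission_index % reset_every == 0


# Example: element for the first mission (index 0), which gets a clean world.
element = flat_world_generator("3;168:1;8;", should_force_reset(0))
```

Whether an occasional reset is clean enough depends on the experiment; changes made by the agent will still accumulate between resets.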
The second method is to do your own cleaning - this is the approach that most of the Python samples take. Make sure your drawing code creates its own blank state before drawing any extra features required by the experiment. For instance, before drawing any items for the mission, draw a block of air large enough to "blank out" any stale items from previous missions:
<DrawingDecorator>
  <DrawCuboid x1="0" y1="46" z1="0" x2="7" y2="52" z2="7" type="quartz_block" /> <!-- limits of our arena -->
  <DrawCuboid x1="1" y1="47" z1="1" x2="6" y2="51" z2="6" type="air" /> <!-- hollow it out with air -->
  <DrawCuboid x1="1" y1="50" z1="1" x2="6" y2="49" z2="6" type="glowstone" /> <!-- glowstone ceiling for light -->
  <DrawItem x="4" y="47" z="2" type="diamond" />
</DrawingDecorator>
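When the arena size changes between experiments, the same "shell, then air, then items" pattern can be generated from Python instead of edited by hand. This is a minimal sketch: arena_decorator is a hypothetical helper, it only emits the DrawCuboid and DrawItem elements shown above, and it leaves out the glowstone ceiling for brevity.

```python
import xml.etree.ElementTree as ET


def arena_decorator(x1, y1, z1, x2, y2, z2, items=()):
    """Return DrawingDecorator XML that rebuilds a hollow arena, then adds items.

    `items` is a sequence of (x, y, z, item_type) tuples.
    """
    cuboid = '  <DrawCuboid x1="{}" y1="{}" z1="{}" x2="{}" y2="{}" z2="{}" type="{}"/>'
    lines = ["<DrawingDecorator>"]
    # Solid shell first: this overwrites any blocks changed in earlier missions.
    lines.append(cuboid.format(x1, y1, z1, x2, y2, z2, "quartz_block"))
    # Then hollow out the interior with air, one block in from every face.
    lines.append(cuboid.format(x1 + 1, y1 + 1, z1 + 1, x2 - 1, y2 - 1, z2 - 1, "air"))
    # Only now draw the items this mission needs.
    for x, y, z, item_type in items:
        lines.append('  <DrawItem x="{}" y="{}" z="{}" type="{}"/>'.format(x, y, z, item_type))
    lines.append("</DrawingDecorator>")
    return "\n".join(lines)


arena_xml = arena_decorator(0, 46, 0, 7, 52, 7, items=[(4, 47, 2, "diamond")])
root = ET.fromstring(arena_xml)  # raises ParseError if the XML is malformed
```

The order matters: the cleaning cuboids must come before the DrawItem elements, otherwise the freshly drawn items would sit inside a region that is about to be overwritten.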
A third method is to "shift" your environment to a fresh location at the start of each episode, e.g. create your XML code dynamically, moving the origin to a new spot each time. patchwork_quilt.py uses this approach to build up a vast landscape from small mazes:
for iRepeat in range(num_reps):
    # Find the point at which to create the maze:
    xorg = (iRepeat % 64) * 16
    zorg = ((iRepeat // 64) % 64) * 16  # // keeps the result an integer under Python 3
    yorg = 200 + ((iRepeat // (64*64)) % 64) * 8
...
...
<MazeDecorator>
<SizeAndPosition length="16" width="16" xOrigin="''' + str(xorg) + '''" yOrigin="''' + str(yorg) + '''" zOrigin="''' + str(zorg) + '''" height="8"/>
...
...
The main advantage to this approach is that it provides some sort of history - the end states of previous missions are left in the world. In terms of performance there is little to recommend it - it will force Minecraft to load more chunks, and will increase file and memory consumption. patchwork_quilt.py only does it because it's fun (and because it's a stress test).
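To see how the origin arithmetic from the patchwork_quilt.py excerpt lays mazes out, it can be pulled into a standalone function. This is a sketch for illustration: maze_origin and its parameter names are hypothetical, while the constants (16-block cells, 64 cells per axis, a base height of 200 and 8-block layers) are taken from the excerpt.

```python
def maze_origin(i_repeat, cell=16, per_axis=64, base_y=200, layer_height=8):
    """Return the (x, y, z) origin for the maze built on repeat number i_repeat.

    Mazes fill a row along x, then step along z, then move up a layer in y.
    Integer division (//) keeps every coordinate an int on Python 2 and 3.
    """
    xorg = (i_repeat % per_axis) * cell
    zorg = ((i_repeat // per_axis) % per_axis) * cell
    yorg = base_y + ((i_repeat // (per_axis * per_axis)) % per_axis) * layer_height
    return xorg, yorg, zorg


# The first few origins, plus the points where z and y first advance.
for i in (0, 1, 63, 64, 4095, 4096):
    print(i, maze_origin(i))
```

Each new maze lands 16 blocks along from the last, so consecutive missions never overlap until the whole grid has been used.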