Rendering OpenAI Gym Envs on Binder and Google Colab
Notes on solving a mildly tedious (but important) problem
Getting OpenAI Gym environments to render properly in remote environments such as Google Colab and Binder turned out to be more challenging than I expected. In this post I lay out my solution in the hope of saving others the time and effort of working it out independently.
Install X11 system dependencies
First you need to install the required X11 system dependencies: Xvfb, an X server that renders to a virtual framebuffer instead of a physical display, along with the x11-utils display utilities.
!apt-get install -y xvfb x11-utils
Install additional Python dependencies
Now that you have installed Xvfb, you need to install a Python wrapper, pyvirtualdisplay, in order to interact with Xvfb virtual displays from within Python. Next you need to install the Python bindings for OpenGL: PyOpenGL and PyOpenGL-accelerate. The former is the actual set of Python bindings; the latter is an optional set of C (Cython) extensions that accelerate common operations in slow parts of PyOpenGL 3.x.
!pip install pyvirtualdisplay==0.2.* PyOpenGL==3.1.* PyOpenGL-accelerate==3.1.*
Install OpenAI Gym
Next you need to install the OpenAI Gym package. Note that depending on which Gym environment you are interested in working with you may need to add additional dependencies. Since I am going to simulate the LunarLander-v2 environment in my demo below I need to install the box2d
extra which enables Gym environments that depend on the Box2D physics simulator.
!pip install gym[box2d]==0.17.*
!echo $DISPLAY
The code in the cell below creates a virtual display in the background that your Gym Envs can connect to for rendering. You can adjust the size
of the virtual buffer as you like but you must set visible=False
when working with Xvfb.
This code only needs to be run once per session to start the display.
import pyvirtualdisplay
_display = pyvirtualdisplay.Display(visible=False,  # use False with Xvfb
                                    size=(1400, 900))
_ = _display.start()
After running the cell above you can echo out the value of the DISPLAY
environment variable again to confirm that you now have a display running.
!echo $DISPLAY
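You can also confirm the display from within Python itself, since pyvirtualdisplay exports the new display to the current process via the DISPLAY environment variable (a small convenience sketch):

```python
import os

# pyvirtualdisplay sets DISPLAY for the current process when the virtual
# display starts, so the shell and Python views should agree
display_id = os.environ.get("DISPLAY")
print(display_id)
```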
For convenience I have gathered the above steps into two cells that you can copy and paste into the top of your Google Colab notebooks.
%%bash
# install required system dependencies
apt-get install -y xvfb x11-utils
# install required python dependencies (might need to install additional gym extras depending)
pip install gym[box2d]==0.17.* pyvirtualdisplay==0.2.* PyOpenGL==3.1.* PyOpenGL-accelerate==3.1.*
import pyvirtualdisplay
_display = pyvirtualdisplay.Display(visible=False,  # use False with Xvfb
                                    size=(1400, 900))
_ = _display.start()
No additional installation required on Binder!
Unlike Google Colab, with Binder you can bake all the required dependencies (including the X11 system dependencies!) into the Docker image on which the Binder instance is based using Binder config files. These config files can live either in the root directory of your Git repo or in a binder sub-directory, as is the case here. If you are interested in learning more about Binder, check out the documentation for BinderHub, the underlying technology behind the Binder project.
# config file for system dependencies
!cat ../binder/apt.txt
# config file describing the conda environment
!cat ../binder/environment.yml
# config file containing python deps not available via conda channels
!cat ../binder/requirements.txt
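The key file for the rendering setup is apt.txt, which lists one system package per line. For this setup it would contain the same packages installed via apt-get in the Colab section above (a sketch; the actual files in the repo are authoritative):

```text
xvfb
x11-utils
```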
!echo $DISPLAY
The code in the cell below creates a virtual display in the background that your Gym Envs can connect to for rendering. You can adjust the size
of the virtual buffer as you like but you must set visible=False
when working with Xvfb.
This code only needs to be run once per session to start the display.
import pyvirtualdisplay
_display = pyvirtualdisplay.Display(visible=False,  # use False with Xvfb
                                    size=(1400, 900))
_ = _display.start()
After running the cell above you can echo out the value of the DISPLAY
environment variable again to confirm that you now have a display running.
!echo $DISPLAY
import typing

import numpy as np

# represent states as arrays and actions as ints
State = np.ndarray
Action = int

# an agent is just a function!
Agent = typing.Callable[[State], Action]
def uniform_random_policy(state: State,
                          number_actions: int,
                          random_state: np.random.RandomState) -> Action:
    """Select an action at random from the set of feasible actions."""
    feasible_actions = np.arange(number_actions)
    probs = np.ones(number_actions) / number_actions
    action = random_state.choice(feasible_actions, p=probs)
    return action

def make_random_agent(number_actions: int,
                      random_state: np.random.RandomState = None) -> Agent:
    """Factory for creating an Agent."""
    _random_state = np.random.RandomState() if random_state is None else random_state
    return lambda state: uniform_random_policy(state, number_actions, _random_state)
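As a quick sanity check, the factory can be exercised without any environment at all. This self-contained sketch re-states the definitions above in condensed form and verifies that a seeded agent only ever picks valid actions:

```python
import typing

import numpy as np

State = np.ndarray
Action = int
Agent = typing.Callable[[State], Action]

def make_random_agent(number_actions: int,
                      random_state: np.random.RandomState = None) -> Agent:
    """Factory for creating an Agent (condensed version of the code above)."""
    _rs = np.random.RandomState() if random_state is None else random_state
    return lambda state: _rs.choice(np.arange(number_actions))

# a seeded agent is reproducible and always stays within the action space
agent = make_random_agent(4, np.random.RandomState(42))
actions = [int(agent(np.zeros(8))) for _ in range(100)]
assert all(0 <= a < 4 for a in actions)
```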
In the cell below I wrap up the code to simulate a single episode of an OpenAI Gym environment. Note that the implementation assumes that the provided environment supports rgb_array rendering (which not all Gym environments do!).
import gym
import matplotlib.pyplot as plt
from IPython import display
def simulate(agent: Agent, env: gym.Env) -> None:
    """Simulate a single episode, rendering each frame inline."""
    state = env.reset()
    img = plt.imshow(env.render(mode='rgb_array'))
    done = False
    while not done:
        action = agent(state)
        img.set_data(env.render(mode='rgb_array'))
        plt.axis('off')
        display.display(plt.gcf())
        display.clear_output(wait=True)
        state, reward, done, _ = env.step(action)
    env.close()
Finally you can set up your desired environment...
lunar_lander_v2 = gym.make('LunarLander-v2')
_ = lunar_lander_v2.seed(42)
...and run a simulation!
random_agent = make_random_agent(lunar_lander_v2.action_space.n, random_state=None)
simulate(random_agent, lunar_lander_v2)
Currently there appears to be a non-trivial amount of flickering during the simulation. I am not entirely sure what is causing this undesirable behavior. If you have any idea how to improve this, please leave a comment below. I will be sure to update this post accordingly if I find a good fix.
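One workaround worth trying (a sketch I have not benchmarked; simulate_frames is a hypothetical helper, not part of Gym) is to collect the rgb_array frames during the episode and animate them afterwards, so the matplotlib figure is never redrawn inside the simulation loop:

```python
import typing

def simulate_frames(agent, env) -> typing.List:
    """Run one episode, collecting rendered frames instead of drawing live."""
    frames = []
    state = env.reset()
    done = False
    while not done:
        frames.append(env.render(mode='rgb_array'))
        action = agent(state)
        state, reward, done, _ = env.step(action)
    env.close()
    return frames
```

The resulting list of frames can then be turned into a smooth inline animation with, for example, matplotlib.animation, or written out as a video file.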