Why is the episode done after 200 time steps (Gym environment MountainCar)?

needRhelp · Mar 14, 2017

When using the MountainCar-v0 environment from OpenAI Gym in Python, the value done becomes True after 200 time steps. Why is that? Since the goal state hasn't been reached, the episode shouldn't be done.

import gym

env = gym.make('MountainCar-v0')
env.reset()
for t in range(300):
    env.render()
    # step() returns (observation, reward, done, info)
    observation, reward, done, info = env.step(env.action_space.sample())
    print(t)
    print(done)

I want to run the step method until the car reaches the flag and then break out of the loop. Is this possible? Something similar to this:

n_episodes = 10
for i in range(n_episodes):
    env.reset()
    done = False  # reset the flag at the start of every episode
    while not done:
        env.render()
        state, reward, done, _ = env.step(env.action_space.sample())

Answer

Scitator · Mar 15, 2017

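The current version of gym force-stops the environment after 200 steps, even if you don't use env.monitor. The limit comes from the time-limit wrapper that gym.make puts around the environment, and .env returns the inner environment without it. To avoid the forced stop, use env = gym.make("MountainCar-v0").env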
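
To see why going through .env removes the cutoff, here is a minimal pure-Python sketch of the idea. It does not use gym at all: ToyEnv, StepLimit, and steps_until_done are hypothetical names made up for illustration, and StepLimit only mimics the behaviour described above rather than reproducing gym's actual wrapper code.

```python
class ToyEnv:
    """Hypothetical stand-in for an environment whose goal is never reached."""

    def reset(self):
        return 0

    def step(self, action):
        # (observation, reward, done, info): done is never True here
        return 0, -1.0, False, {}


class StepLimit:
    """Hypothetical wrapper that mimics a step cap; not gym's actual code."""

    def __init__(self, env, max_episode_steps=200):
        self.env = env  # the wrapped environment stays reachable as .env
        self.max_episode_steps = max_episode_steps
        self._elapsed = 0

    def reset(self):
        self._elapsed = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._elapsed += 1
        if self._elapsed >= self.max_episode_steps:
            done = True  # forced stop, whether or not the goal was reached
        return obs, reward, done, info


def steps_until_done(env, cap=300):
    """Step env until it reports done, or until cap steps have been taken."""
    env.reset()
    for t in range(1, cap + 1):
        _, _, done, _ = env.step(0)
        if done:
            return t
    return cap


wrapped = StepLimit(ToyEnv())
print(steps_until_done(wrapped))      # 200: the wrapper ends the episode
print(steps_until_done(wrapped.env))  # 300: only the loop's own cap applies
```

Note that once the cap is gone, a random policy may take a very long time to reach the flag, so it is probably wise to keep some upper bound of your own on the loop, as steps_until_done does with cap.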