When using the MountainCar-v0 environment from OpenAI Gym in Python, the value done becomes true after 200 time steps. Why is that? Since the goal state hasn't been reached, the episode shouldn't be done.
import gym
env = gym.make('MountainCar-v0')
env.reset()
for _ in range(300):
    env.render()
    res = env.step(env.action_space.sample())
    print(_)
    print(res[2])
I want to run the step method until the car reaches the flag and then break out of the loop. Is this possible? Something similar to this:
n_episodes = 10
for i in range(n_episodes):
    env.reset()
    done = False  # reset the flag at the start of every episode
    while not done:
        env.render()
        state, reward, done, _ = env.step(env.action_space.sample())
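Recent versions of gym force-stop the environment after 200 steps, even if you don't use env.monitor. This happens because gym.make wraps the environment in a TimeLimit wrapper, and MountainCar-v0 is registered with max_episode_steps=200.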
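To avoid this, use the wrapped environment directly:
env = gym.make("MountainCar-v0").env
Without the time limit, an episode ends only when the car reaches the flag, so a random policy may run for a very long time.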
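To make the mechanism concrete, here is a minimal sketch of how a time-limit wrapper forces done to become true, and why going through .env bypasses it. ToyEnv, ToyTimeLimit and steps_until_done are hypothetical stand-ins written for this illustration; they are not gym's actual classes, and the 300-step cap is only there so the example terminates.

```python
# Simplified stand-ins for illustration; this is not gym's source code.
class ToyEnv:
    """Hypothetical environment whose goal is never reached."""

    def reset(self):
        return 0

    def step(self, action):
        # Returns (state, reward, done, info); done is always False here.
        return 0, -1.0, False, {}


class ToyTimeLimit:
    """Hypothetical wrapper that ends an episode after max_episode_steps."""

    def __init__(self, env, max_episode_steps):
        self.env = env  # the wrapped environment, reachable as .env
        self.max_episode_steps = max_episode_steps
        self.elapsed_steps = 0

    def reset(self):
        self.elapsed_steps = 0
        return self.env.reset()

    def step(self, action):
        state, reward, done, info = self.env.step(action)
        self.elapsed_steps += 1
        if self.elapsed_steps >= self.max_episode_steps:
            done = True  # forced stop, even though the goal was not reached
        return state, reward, done, info


def steps_until_done(env, cap=300):
    """Step env until done is True or cap steps have passed; return the count."""
    env.reset()
    for t in range(1, cap + 1):
        _, _, done, _ = env.step(0)
        if done:
            return t
    return cap


wrapped = ToyTimeLimit(ToyEnv(), max_episode_steps=200)
print(steps_until_done(wrapped))      # 200: the wrapper forces done
print(steps_until_done(wrapped.env))  # 300: the inner env never reports done
```

The same idea applies to the real environment: the 200-step cutoff lives in the wrapper, not in MountainCar itself, so stepping the inner environment leaves termination entirely up to reaching the flag.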