LYH

Reputation: 88

How to undo an action in OpenAI Gym?

In OpenAI Gym, I would like to know the next states for different actions on the same state. For example, I want to get s_1, s_2 where the dynamics of my environment are:

(s, a_1) -> s_1, (s, a_2) -> s_2

I cannot find a method that undoes an action, or one that shows me the next state without changing the environment. Is there something obvious that I'm missing?

If it helps, I am doing this to differentiate the dynamics and reward for LQR, and using the InvertedPendulum environment.

Upvotes: 3

Views: 1216

Answers (3)

wer

Reputation: 1

For Atari environments, you can use clone_full_state and restore_full_state:

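import gym

def true_predict(env, a):
    # Snapshot the emulator state, take the step, then roll back.
    old_state = env.unwrapped.clone_full_state()
    result = env.step(a)
    env.unwrapped.restore_full_state(old_state)
    # Alternative pair: clone_state() / restore_state()
    return result

env1 = gym.make("Pong-v4")
s = env1.reset()

for i in range(10):
    env1.step(0)

pred = true_predict(env1, 0)  # peek at the result of action 0
new = env1.step(0)            # actually take action 0

# Compare the predicted frame with the frame from really taking the action.
# Cast to float first: the frames are uint8 and the subtraction would wrap around.
(pred[0].astype(float) - new[0].astype(float)).mean()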

Upvotes: 0

Alex Van de Kleut

Reputation: 150

Try cloning the env.

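from copy import deepcopy
import gym

env1 = gym.make("InvertedPendulum-v1")
s = env1.reset()

# Copy the environment so that both actions are applied to the same state s.
env2 = deepcopy(env1)

# a_1 and a_2 are the two actions to compare.
# step() returns a tuple whose first element is the next observation.
s_1 = env1.step(a_1)[0]
s_2 = env2.step(a_2)[0]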

Upvotes: 0

LYH

Reputation: 88

I found a method named set_state that does exactly this. It can be found at: https://github.com/openai/gym/blob/12e8b763d5dcda4962cbd17887d545f0eec6808a/gym/envs/mujoco/mujoco_env.py#L86-L92
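
A rough sketch of how this can be used to get s_1 and s_2 from the same state: reset the simulator to the saved (qpos, qvel) before each action. ToyPendulum and next_states below are hypothetical names, not part of Gym; the toy class only imitates the set_state(qpos, qvel) signature so the snippet runs without MuJoCo. With a real MuJoCo environment you would pass env.unwrapped instead and read the starting qpos and qvel from the simulator (in the linked version I believe they live under sim.data, but treat that as an assumption).

```python
import numpy as np


class ToyPendulum:
    """Hypothetical stand-in for a MuJoCo env; only mimics set_state(qpos, qvel)."""

    def __init__(self):
        self.qpos = np.zeros(1)
        self.qvel = np.zeros(1)
        self.dt = 0.1

    def set_state(self, qpos, qvel):
        # Overwrite the simulator state, as MujocoEnv.set_state does.
        self.qpos = np.array(qpos, dtype=float)
        self.qvel = np.array(qvel, dtype=float)

    def step(self, action):
        # Made-up dynamics: the action is an acceleration.
        self.qvel = self.qvel + self.dt * float(action)
        self.qpos = self.qpos + self.dt * self.qvel
        obs = np.concatenate([self.qpos, self.qvel])
        return obs, 0.0, False, {}


def next_states(env, qpos, qvel, actions):
    """Return the next observation for each action, all starting from (qpos, qvel)."""
    results = []
    for a in actions:
        env.set_state(qpos.copy(), qvel.copy())  # rewind to the same state s
        results.append(env.step(a)[0])
    env.set_state(qpos.copy(), qvel.copy())      # leave the env where it started
    return results


env = ToyPendulum()
qpos0, qvel0 = np.array([0.5]), np.array([0.0])
s_1, s_2 = next_states(env, qpos0, qvel0, actions=[1.0, -1.0])
```

Because the state is restored after the loop, the environment ends up unchanged, which also makes this convenient for finite-difference estimates of the dynamics.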

Upvotes: 2
