Will

Reputation: 942

openai gym observation space representation

I have a question about the representation of an observation in a gym environment. I actually have several observation sources with different dimensions. Let's say, for example, I have one camera with 24x24 pixels, then an X-ray machine with 1x25 values, then 10 temperature sensors, so 1x1 ten times. Currently I have represented that with a spaces.Dict encapsulating the continuous values in spaces.Box instances:

import gym
import numpy as np

class MyEnv(gym.Env):
    def __init__(self, ...):
        # One Box subspace per sensor, keyed by name.
        spaces = {
            'xray': gym.spaces.Box(low=-np.inf, high=np.inf, shape=(nbcaptors,)),
            'cam1': gym.spaces.Box(low=-np.inf, high=np.inf, shape=(cam1width, cam1height)),
            'cam2': gym.spaces.Box(low=-np.inf, high=np.inf, shape=(cam2width, cam2height)),
            'thermal': gym.spaces.Box(low=-np.inf, high=np.inf, shape=(thermalwidth, thermalheight)),
        }
        self.observation_space = gym.spaces.Dict(spaces)

A custom agent can then process the data with observation['cam1'], observation['xray'], etc.
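For reference, here is a sketch of a matching reset, where the read_* helpers are hypothetical stand-ins for the real sensor reads:

def reset(self):
    # Keys must match the spaces.Dict defined in __init__; the read_*
    # helpers are placeholders for the actual sensor I/O.
    observation = {
        'xray': read_xray(),        # shape (nbcaptors,)
        'cam1': read_camera1(),     # shape (cam1width, cam1height)
        'cam2': read_camera2(),     # shape (cam2width, cam2height)
        'thermal': read_thermal(),  # shape (thermalwidth, thermalheight)
    }
    return observation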

The problem is when I want to use a third-party algorithm, for example from stable-baselines3: they don't support spaces.Dict. So my question is: how do I solve that? Should I just represent my observation_space with a 1xn Box, such as:

self.observation_space = gym.spaces.Box(
    low=-np.inf, high=np.inf,
    shape=(nbcaptors + cam1width*cam1height
           + cam2width*cam2height + thermalwidth*thermalheight,))
Does that make sense? Even if it does, I see 3 problems with this approach:

  1. the low and high of my 1d space might not be tight enough, since some of the original subspaces could have their own, finite bounds that get lost in a single Box.
  2. it will be easier to make a mistake in the implementation.
  3. those are really 2d matrices, so I would have to map 4 matrices to locations in the 1d observation_space, and a custom agent would then have to rebuild the 4 matrices from the 1d observation (see the sketch after this list). The original fast non-RL implementation already takes forever to run, so I'm afraid this overhead is going to slow things down.
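To illustrate problem 3, here is a rough sketch of the manual round trip I would have to implement with NumPy (names are illustrative):

import numpy as np

def flatten_obs(xray, cam1, cam2, thermal):
    # Pack the four sensor arrays into one 1-D vector, in a fixed order.
    return np.concatenate([xray.ravel(), cam1.ravel(),
                           cam2.ravel(), thermal.ravel()])

def unflatten_obs(flat, shapes):
    # shapes: the original array shapes, in the same fixed order.
    parts, start = [], 0
    for shape in shapes:
        n = int(np.prod(shape))
        parts.append(flat[start:start + n].reshape(shape))
        start += n
    return parts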

At this point I see only 2 ways to go:

  1. map all my 4 matrices to a 1d array
  2. encapsulate my spaces.Dict gym.Env in another gym.Env that handles the conversion from spaces.Dict to spaces.Box, and use one agent or the other depending on whether I want a custom agent or a third-party one (a sketch of this wrapper follows the list).
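To illustrate option 2, here is a rough sketch of what I have in mind as a gym.ObservationWrapper, using gym's own flatdim/flatten utilities (the class name is illustrative, and bounds are left at infinity for brevity):

import gym
import numpy as np
from gym.spaces.utils import flatdim, flatten

class DictToBoxWrapper(gym.ObservationWrapper):
    def __init__(self, env):
        super().__init__(env)
        # Total element count of all subspaces once flattened.
        size = flatdim(env.observation_space)
        self.observation_space = gym.spaces.Box(
            low=-np.inf, high=np.inf, shape=(size,), dtype=np.float32)

    def observation(self, observation):
        # Convert the dict observation into the flat Box representation.
        return flatten(self.env.observation_space, observation)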

I would be grateful for some input on how best to tackle this problem, in terms of performance and simplicity.

Thanks!

Upvotes: 2

Views: 6219

Answers (2)

skyleo

Reputation: 1

You can try this method to define the observation space:

low = np.array([nbcaptors_low, cam1width_low, cam1height_low, cam2width_low, cam2height_low, thermalwidth_low, thermalheight_low])
high = np.array([nbcaptors_high, cam1width_high, cam1height_high, cam2width_high, cam2height_high, thermalwidth_high, thermalheight_high])

# Pass the per-element bounds instead of infinities.
self.observation_space = gym.spaces.Box(low=low, high=high)
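If your subspaces have finite bounds, you can also build these arrays from the spaces themselves, for example (a sketch, assuming dict_space is the gym.spaces.Dict from the question):

import gym
import numpy as np

# dict_space is assumed to be the gym.spaces.Dict from the question.
low = np.concatenate([s.low.ravel() for s in dict_space.spaces.values()])
high = np.concatenate([s.high.ravel() for s in dict_space.spaces.values()])
flat_space = gym.spaces.Box(low=low, high=high, dtype=np.float32)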

Upvotes: 0

Will

Reputation: 942

Actually, it seems the encapsulation approach is exactly what the good folks at OpenAI already implemented:

from gym.wrappers import FlattenObservation
from gym.spaces.utils import unflatten

# Wrap the Dict-based env so it exposes a flat Box observation space.
wrapped_env = FlattenObservation(env)
obs1 = wrapped_env.reset()

# Rebuild the original dict observation from the flat vector.
unflattened_obs = unflatten(wrapped_env.unwrapped.observation_space, obs1)
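The wrapped env then exposes a flat Box, so it can be fed straight to a third-party algorithm, e.g. (assuming stable-baselines3's PPO):

from stable_baselines3 import PPO

# wrapped_env now has a flat Box observation space,
# so a standard MLP policy works directly.
model = PPO('MlpPolicy', wrapped_env, verbose=1)
model.learn(total_timesteps=10_000)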

Upvotes: 3
