Reputation: 51
I am working on a reinforcement learning algorithm; I am very new to this and trying to get the hang of things.
Player1Env represents a 7x6 Connect 4 playing grid. I am initializing the class as follows:
def __init__(self):
    super(Player1Env, self).__init__()
    self.action_space = spaces.Discrete(7)
    self.observation_space = spaces.Box(low=-1, high=1, shape=(7, 6), dtype=np.float32)
checking if the class is instantiated correctly with
env = Player1Env()
check_env(env)
returns the error
AssertionError: The observation returned by the `reset()` method does not match the given observation space
printing the observation returned by the reset function and its shape:
[[0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0.]]
(7, 6)
low and high are defined as -1 and 1 respectively, since the grid represents the current board state, with 1 for the stones dropped in by player 1 and -1 for the stones dropped in by player 2. This part of the code has been tested extensively, but even changing the boundaries to -np.inf and np.inf does not change the error message.
The reset function itself:
def reset(self):
    self.board = np.zeros((7, 6))
    self.player = 1
    self.reward = 0
    self.done = False
    observation = self.board
    return observation
The step function pits the RL algorithm against a preprogrammed agent, but the error should be coming from the reset function anyway.
Could you help me out with where the error is coming from?
Edit: There was a warning about the numpy API being compiled against the wrong version, which didn't seem to impact usability (everything worked in the premade gym environments). I managed to fix that, but the observation space definition problem still persists.
Upvotes: 5
Views: 6804
Reputation: 31
If you define self.board in reset() as below, your problem is solved:
self.board = np.zeros((7, 6), dtype=np.float32)
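Applied to the Player1Env from the question, the whole reset() then becomes (a sketch; only the dtype argument is new, everything else is the question's own code):
def reset(self):
    self.board = np.zeros((7, 6), dtype=np.float32)  # match the Box's float32 dtype
    self.player = 1
    self.reward = 0
    self.done = False
    observation = self.board
    return observation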
More details and examples are presented at the end of this answer.
The dtype of the Box and of the observation should be the same. Here both are float32:
import numpy as np

from gym import Env
from gym.spaces import Box
from gym.utils.env_checker import check_env

class CustomEnv(Env):
    def __init__(self):
        self.action_space = Box(low=np.array([0.0]), high=np.array([1.0]))
        self.observation_space = Box(low=np.array([0.0, 0.0]), high=np.array([1.0, 1.0]))
        self.state = np.array([0.5, 0.5], dtype=np.float32)

    def step(self, action):
        state = self.state
        # the variables below should be defined in order to prevent errors in check_env
        reward = 1
        done = False
        info = {}
        return self.state, reward, done, info

    def reset(self):
        self.state = np.array([0.5, 0.5], np.float32)  # np.float32 is essential
        return self.state

    def render(self):
        pass

env = CustomEnv()
check_env(env, warn=True)
When you define a custom env in gym, check_env checks several things. In this case, the check observation_space.contains(observation) does not pass: self.board (the variable named observation returned by reset()) is not contained in the observation_space, because observation.dtype is float64 while observation_space.dtype is float32.
The default dtype of a numpy array is float64, and the default dtype of a Box is float32.
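You can check these defaults directly (a quick sketch, assuming only numpy and gym.spaces.Box):
import numpy as np
from gym.spaces import Box

print(np.zeros((7, 6)).dtype)                   # float64, numpy's default
print(Box(low=-1, high=1, shape=(7, 6)).dtype)  # float32, Box's default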
versions: numpy 1.21.5, gym 0.21.0
import numpy as np
import gym
from gym.spaces import Box
# example 1; with this definition you get the error
In [1]: observation_space = Box(low=np.array([0.0, 0.0]), high=np.array([1.0, 1.0]))
In [2]: observation = np.array([0.5, 0.5])
In [3]: print(observation.dtype)
out[3]: float64
In [4]: observation_space.contains(observation)  # does observation_space contain the observation?
out[4]: False
# example 2; this definition works fine; no error
In [10]: observation_space_2 = Box(low=np.array([0.0, 0.0]), high=np.array([1.0, 1.0]))
In [11]: observation_2 = np.array([0.5, 0.5], dtype=np.float32)
In [12]: print(observation_2.dtype)
out[12]: float32
In [13]: observation_space_2.contains(observation_2)  # does observation_space contain the observation?
out[13]: True
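Alternatively (my addition, not part of the examples above), you can make the dtypes match the other way around by declaring the Box as float64 and keeping default-dtype numpy arrays:
observation_space_3 = Box(low=np.array([0.0, 0.0]), high=np.array([1.0, 1.0]), dtype=np.float64)
observation_3 = np.array([0.5, 0.5])         # float64 by default
observation_space_3.contains(observation_3)  # True, because the dtypes now match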
Upvotes: 3