Bitron

Reputation: 1

Python Gymnasium Render being forced

I'm new to gym and I tried to write a simple Q-learning program, but for some (weird) reason it won't let me get rid of the rendering part (which is taking forever)...

Here is my program:

import gymnasium as gym
import numpy as np

env = gym.make("MountainCar-v0", render_mode="human")

LEARNING_RATE = 0.1
DISCOUNT = 0.95
EPISODES = 25000

SHOW_EVERY = 500

DISCRETE_OS_SIZE = [20] * len(env.observation_space.low)
discrete_os_win_size = (env.observation_space.high - env.observation_space.low) / DISCRETE_OS_SIZE

q_table = np.random.uniform(low=-2, high=0, size=(DISCRETE_OS_SIZE + [env.action_space.n]))


def get_discrete_state(state):
    discrete_state = (state - env.observation_space.low) / discrete_os_win_size
    return tuple(discrete_state.astype(int))


for episode in range(EPISODES):
    if episode % SHOW_EVERY == 0:
        render = True
    else:
        render = False
    
    print("Episode:", episode)
    
    discrete_state = get_discrete_state(env.reset()[0])
    
    done = False
    while not done:
        action = np.argmax(q_table[discrete_state])
        new_state, reward, terminated, truncated, _ = env.step(action)
        done = truncated or terminated
        
        new_discrete_state = get_discrete_state(new_state)
        
        # Rendering the episode
        # (Even removing this part does not help)
        if render:
            env.render()

        if not done:
            # Updating the Q-table
            max_future_q = np.max(q_table[new_discrete_state])
            current_q = q_table[discrete_state + (action, )]
            new_q = (1 - LEARNING_RATE) * current_q + LEARNING_RATE * (reward + DISCOUNT * max_future_q)
            
            q_table[discrete_state + (action, )] = new_q
       
        # If the car made it to the goal
        elif new_state[0] >= env.unwrapped.goal_position:
            q_table[discrete_state + (action, )] = 0
            print("MADE IT ON EPISODE:", episode)
        discrete_state = new_discrete_state
    
env.close()


Upvotes: 0

Views: 146

Answers (1)

HyeAnn

Reputation: 98

In Gymnasium Documentation, it says:

By convention, if the render_mode is:

  • “human”: The environment is continuously rendered in the current display or terminal, usually for human consumption. This rendering should occur during step() and render() doesn’t need to be called. Returns None.

As long as you set render_mode to "human", the environment is rendered on every step() call, whether or not you ever call render(). To skip rendering during training, don't pass render_mode at all (it defaults to None).

Upvotes: 0
