achandra03

Reputation: 121

How do I get Pygame rendering to work for my neural network?

I'm trying to build a neural network to play snake. Here's the training code:

def train(self):
    self.build_model()
    for episode in range(self.max_episodes):
        self.current_episode = episode
        env = SnakeEnv(self.screen)
        episode_reward = 0
        for timestep in range(self.max_steps):
            env.render(self.screen)
            state = env.get_state()
            action = None
            epsilon = self.current_eps
            if epsilon > random.random():
                action = np.random.choice(env.action_space) #explore
            else:
                values = self.policy_model.predict(env.get_state()) #exploit
                action = np.argmax(values)
            #print(action)
            experience = env.step(action)
            episode_reward += experience['reward']
            if experience['done']:
                break
            if(len(self.memory) < self.memory_size):
                self.memory.append(Experience(experience['state'], experience['action'], experience['reward'], experience['next_state']))
            else:
                self.memory[self.push_count % self.memory_size] = Experience(experience['state'], experience['action'], experience['reward'], experience['next_state'])
            self.push_count += 1
            self.decay_epsilon(episode)
            if self.can_sample_memory():
                memory_sample = self.sample_memory()
                #q_pred = np.zeros((self.batch_size, 1))
                #q_target = np.zeros((self.batch_size, 1))
                #i = 0
                for memory in memory_sample:
                    memstate = memory.state
                    action = memory.action
                    next_state = memory.next_state
                    reward = memory.reward
                    max_q = reward + self.discount_rate * self.replay_model.predict(next_state)
                    #q_pred[i] = q_value
                    #q_target[i] = max_q
                    #i += 1
                    self.policy_model.fit(memstate, max_q, epochs=1, verbose=0)
            env.render(self.screen)
        print("Episode: ", episode, " Total Reward: ", episode_reward)
        if episode % self.target_update == 0:
            self.replay_model.set_weights(self.policy_model.get_weights())
    pygame.quit()

The screen initialization code looks like this:

pygame.init()
self.screen = pygame.display.set_mode((600, 600))
pygame.display.set_caption("Snake") 

The environment rendering code looks like this:

def render(self, screen):
    screen.fill((0, 0, 0))
    for i in range(20):
        pygame.draw.line(screen, (255, 255, 255), (0, 30*i), (600, 30*i))
        pygame.draw.line(screen, (255, 255, 255), (30*i, 0), (30*i, 600))
    self.food.render()
    self.snake.render()
    pygame.display.flip()

The food and snake render methods just draw simple squares at the appropriate coordinates. When I run the training code, I just get a white screen. When I end the program by hitting Ctrl+C, I see the screen rendered properly for a brief moment, and then it abruptly closes. How do I get it to render properly?

Upvotes: 1

Views: 176

Answers (1)

sloth

Reputation: 101072

Your code may work on another OS, but generally you have to let pygame process your window manager's events by calling pygame.event.get() (or pygame.event.pump()) regularly. Otherwise, the operating system considers the window unresponsive and nothing will be drawn on the screen.

So, in your loop, you should process the events in the event queue, and at least handle the QUIT event, e.g.:

import sys

def render(self, screen):
    ...
    # or create a new function, it's up to you; just do this once per frame
    events = pygame.event.get()
    for e in events:
        if e.type == pygame.QUIT:
            sys.exit()  # or whatever you use to quit the program
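
If you don't need to handle any events during training, calling pygame.event.pump() once per frame is also enough to keep the window responsive, though the window's close button won't do anything then:

def render(self, screen):
    ...
    pygame.event.pump()  # let pygame exchange messages with the OS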

You could also do fancier things to separate your training code from the drawing code, like using callbacks or coroutines, but that's another topic.
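
Just to give a flavor of the coroutine idea, here's a minimal sketch: the training loop is written as a generator that yields once per step, and a small driver loop owns the pygame side, pumping events and drawing between steps. The names train_steps and step_once are hypothetical stand-ins for your own training code; only the pygame calls are real API.

def train_steps(env):
    # hypothetical generator version of the training loop:
    # do one unit of training work per iteration, then yield control
    for episode in range(10):
        env.reset()
        done = False
        while not done:
            done = env.step_once()  # hypothetical: act, observe, learn once
            yield  # hand control back to the driver after each step

def run(env, screen):
    clock = pygame.time.Clock()
    for _ in train_steps(env):
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                pygame.quit()
                return
        env.render(screen)
        clock.tick(30)  # cap the frame rate so the rendering stays watchable

This way the training code never blocks the event queue for longer than a single step, and the drawing code lives entirely in the driver.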

Upvotes: 1
