Reputation: 1
I'm encountering a training slowdown issue in my Q-learning model implemented using TensorFlow. I've simplified my code to focus on the training loop and saving the model after each episode. The problem is that training speed decreases significantly with each subsequent episode.
I'm using a Q-learning agent with a convolutional neural network (CNN) architecture. The model is saved at the end of each episode, and training continues in the next episode with the same in-memory model, without reloading the saved file.
Here's a condensed version of the relevant code:
import numpy as np
import tensorflow as tf

MAX_EPISODES = 50
CONTINUE = True

class QLearningAgent:
    def __init__(self, state_size, action_size):
        self.state_size = state_size
        self.action_size = action_size
        self.epsilon = 0.9
        self.epsilon_decay = 0.995
        self.epsilon_min = 0.1
        self.learning_rate = 0.01
        self.gamma = 0.95
        self.model = self.build_model()

    def build_model(self):
        # Your model architecture here
        ...

    def act(self, state):
        # Epsilon-greedy action selection
        ...

    def train(self, state, action, reward, next_state, done):
        # Q-learning training logic
        ...

    def save_model(self, filename):
        self.model.save(filename)

    def update_epsilon(self):
        self.epsilon = max(self.epsilon * self.epsilon_decay, self.epsilon_min)

env = BallEnvironment(max_steps=1000)

for episode in range(MAX_EPISODES):
    obs = env.reset()
    while True:
        env.render()
        left_action = env.left_ball.q_agent.act(np.reshape(obs, [1, *env.state_size]))
        # right_action is obtained analogously from the right ball's agent (omitted in this condensed code)
        next_obs, rewards, done, _ = env.step(left_action, right_action)
        left_state = np.reshape(obs, [1, *env.state_size])
        left_next_state = np.reshape(next_obs, [1, *env.state_size])
        env.left_ball.q_agent.train(left_state, left_action, rewards[0], left_next_state, done)
        obs = next_obs
        if done:
            env.left_ball.q_agent.save_model("left_trained_agent.h5")
            break

env.close()
Upvotes: 0
Views: 36
Reputation: 1
The answer is quite simple. All you need to do after saving the model is add
tf.keras.backend.clear_session()
"If you are creating many models in a loop, this global state will consume an increasing amount of memory over time, and you may want to clear it. Calling clear_session() releases the global state: this helps avoid clutter from old models and layers, especially when memory is limited."
Upvotes: 0