Alex Robert Petrovič

Reputation: 19

Stable Baselines3 not generating TensorBoard files for PPO, SAC and TD3

I am comparing A2C, DQN and PPO models, and I need TensorBoard graphs to show my teacher. TensorBoard only collects data for the A2C model; when using it for PPO, SAC or TD3 it creates the event file but doesn't write any data into it.

The code for PPO and A2C is the same:

PPO:

import os
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.logger import configure
from stable_baselines3.common.vec_env import SubprocVecEnv

log_path_ppo = "log/ppo_cartpole_tensorboard/"
# log_path_ppo = "log/ppo_lunar_tensorboard/"

# model_ppo and env are created in another snippet
model_ppo.set_logger(configure(log_path_ppo, ["tensorboard"]))
model_ppo.learn(total_timesteps=5000, log_interval=1000, progress_bar=True)

model_ppo.save("ppo_cartpole_model")
# model_ppo.save("ppo_lunar_model")

del model_ppo
del env

A2C:

import os
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.logger import configure
from stable_baselines3.common.vec_env import SubprocVecEnv

# log_path_a2c = "log/a2c_cartpole_tensorboard/lunar-env/"
log_path_a2c = "log/a2c_cartpole_tensorboard/"

# model_a2c and env are created in another snippet
model_a2c.set_logger(configure(log_path_a2c, ["tensorboard"]))
model_a2c.learn(total_timesteps=5000, log_interval=1000, progress_bar=True)

model_a2c.save("a2c_cartpole_model")
# model_a2c.save("a2c_lunar_model")

del model_a2c
del env

Both the models and the environments are created in other snippets.
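For completeness, a minimal sketch of what that setup might look like (CartPole matches the log paths above, but the number of environments and the hyperparameters are assumptions, since the actual creation code isn't shown):

from stable_baselines3 import A2C, PPO
from stable_baselines3.common.env_util import make_vec_env

# Assumed setup; the real snippets may differ
env = make_vec_env("CartPole-v1", n_envs=4)
model_ppo = PPO("MlpPolicy", env)
model_a2c = A2C("MlpPolicy", env)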

For A2C this works and creates all the graphs I need in TensorBoard, but for PPO it doesn't.

I've tried using the alternative syntax from the documentation, which again worked only for A2C (https://stable-baselines3.readthedocs.io/en/master/guide/tensorboard.html#basic-usage).

I also tried adding a writer to the PPO code instead of the logger. That did actually write data into the event file, but TensorBoard couldn't load the graphs, so the logic in that code was most likely wrong.

I also tried installing stable-baselines3 with the [extra] tag (pip install stable-baselines3[extra]), but that didn't work either.

Upvotes: 1

Views: 44

Answers (1)

Alex Robert Petrovič

Reputation: 19

After a class with my teacher, he told us where the problem is: it is the log_interval. For the DQN, PPO, SAC and TD3 models a value of 1000 is too large. The default values can be found in the library documentation; I just set it to 1, as that was fine for my goal in the assignment.
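For context, a rough calculation shows why the PPO event file stayed empty, assuming SB3's default hyperparameters and a single environment (an assumption, since the model construction isn't shown in the question):

# For on-policy algorithms (A2C, PPO), log_interval counts rollout
# iterations, where one iteration = n_steps * n_envs timesteps.
# With the default n_steps (A2C: 5, PPO: 2048) and one environment:
#   A2C: 5000 timesteps / 5    = 1000 iterations -> log_interval=1000 fires exactly once
#   PPO: 5000 timesteps / 2048 = only a few iterations -> log_interval=1000 never fires,
#        so the event file stays empty
# For off-policy algorithms (DQN, SAC, TD3), log_interval counts episodes instead.
model_ppo.learn(total_timesteps=5000, log_interval=1, progress_bar=True)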

Keep in mind that if you want to compare the models at the same points in time, you need another fix: for some models log_interval counts iterations or episodes rather than timesteps, so they don't write TensorBoard points at the same timesteps (see the sketch below).
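A minimal alignment sketch under the same default-hyperparameter assumption (the exact log_interval values here are illustrative, not from the original code):

# With default n_steps (A2C: 5, PPO: 2048), pick log_interval per algorithm
# so both dump to TensorBoard roughly every ~2050 timesteps:
model_a2c.learn(total_timesteps=350000, log_interval=410)  # 410 iterations * 5 steps ~= 2050
model_ppo.learn(total_timesteps=350000, log_interval=1)    # 1 iteration * 2048 steps = 2048
# DQN, SAC and TD3 count episodes, whose length varies, so exact alignment
# there would need a custom callback rather than log_interval.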

Correct code:

model_ppo.learn(total_timesteps=350000, log_interval=1, progress_bar=True)

Upvotes: 0
