Anton Anton

Reputation: 1

mlagents-learn giving errors

version information: ml-agents: 0.29.0, ml-agents-envs: 0.29.0, Communicator API: 1.5.0, PyTorch: 1.7.1+cpu

When I run mlagents-learn, it gives me this huge error. A similar error also appears if I use --force, just without the last part.

Traceback (most recent call last):
  File "C:\Users\Anton\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\Anton\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Anton\Desktop\Unity\Drone_Ai_v2\venv\Scripts\mlagents-learn.exe\__main__.py", line 7, in <module>
  File "c:\users\anton\desktop\unity\drone_ai_v2\venv\lib\site-packages\mlagents\trainers\learn.py", line 260, in main
    run_cli(parse_command_line())
  File "c:\users\anton\desktop\unity\drone_ai_v2\venv\lib\site-packages\mlagents\trainers\learn.py", line 256, in run_cli
    run_training(run_seed, options, num_areas)
  File "c:\users\anton\desktop\unity\drone_ai_v2\venv\lib\site-packages\mlagents\trainers\learn.py", line 75, in run_training
    checkpoint_settings.maybe_init_path,
  File "c:\users\anton\desktop\unity\drone_ai_v2\venv\lib\site-packages\mlagents\trainers\directory_utils.py", line 26, in validate_existing_directories
    "Previous data from this run ID was found. "
mlagents.trainers.exception.UnityTrainerException: Previous data from this run ID was found. Either specify a new run ID, use --resume to resume this run, or use the --force parameter to overwrite existing data.

(venv) C:\Users\Anton\Desktop\Unity\Drone_Ai_v2>

I tried using different versions of mlagents and pytorch, but I still get this every time. Could it maybe be my Unity version? I'm on 2022.2.0b16.

Upvotes: 0

Views: 1565

Answers (2)

VectorX

Reputation: 667

This error: Previous data from this run ID was found. Either specify a new run ID, use --resume to resume this run, or use the --force parameter to overwrite existing data.

It means that you used --run-id=<id> with an id that was already used before and still has old data. You need to either pick a new run ID, use the --resume flag to continue training, or use --force to delete all the old data and restart the training.
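To make the three options concrete, here is a minimal Python sketch of the kind of check the trainer appears to perform before starting a run. It is not the actual ML-Agents source: the names `check_run_dir` and `RunIdError` are hypothetical, and the behaviour when both flags are set is an assumption.

```python
import os
import tempfile


class RunIdError(Exception):
    """Hypothetical stand-in for the trainer's UnityTrainerException."""


def check_run_dir(output_path: str, resume: bool = False, force: bool = False) -> str:
    """Return 'new', 'resume', or 'overwrite' for the given run directory.

    Raises RunIdError when old data exists and neither flag was given,
    which mirrors the message in the traceback above.
    """
    if not os.path.isdir(output_path):
        if resume:
            # Nothing on disk, so there is no run to continue.
            raise RunIdError("No previous data for this run ID; drop --resume.")
        return "new"
    # Assumption: --resume is checked before --force if both are passed.
    if resume:
        return "resume"
    if force:
        return "overwrite"
    raise RunIdError(
        "Previous data from this run ID was found. Either specify a new "
        "run ID, use --resume, or use --force."
    )


# Small demonstration with a temporary directory standing in for the run folder.
with tempfile.TemporaryDirectory() as run_dir:
    print(check_run_dir(run_dir, resume=True))
    print(check_run_dir(run_dir, force=True))
    print(check_run_dir(os.path.join(run_dir, "fresh_id")))
```

In other words, the exception in the question is the fall-through case: the run folder already exists and neither flag was supplied.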

Upvotes: 0

Kenny

Reputation: 1

I am also a beginner, and I ran into the same problem earlier. The cause of the problem is that the tutorial we followed is outdated. You can try watching this tutorial instead: https://www.youtube.com/watch?v=Yix4iV_io6o&t=310s With some luck, you should be able to run this command successfully:

mlagents-learn config/ppo/PushBlock.yaml --run-id=push_block_test_01

If you get a huge error message whose last lines say

TypeError: Descriptors cannot not be created directly. If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0. 
If you cannot immediately regenerate your protos, some other possible workarounds are: 
1. Downgrade the protobuf package to 3.20.x or lower. 
2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

you can use this command

pip install protobuf==3.20.3

to downgrade protobuf, and the problem will be solved. At least that's how my problem was solved.
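If you are not sure whether your installed protobuf is affected, a small check like the following can help. This is only a sketch based on the wording of the error message ("3.20.x or lower"); the function name `protobuf_needs_downgrade` is made up for illustration.

```python
def protobuf_needs_downgrade(version: str) -> bool:
    """Return True if the version string is newer than the 3.20.x series."""
    parts = version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    # Tuple comparison: (4, 21) > (3, 20), while (3, 20) and (3, 19) are not.
    return (major, minor) > (3, 20)


# Example with hard-coded version strings.
# To check your own install, pass in google.protobuf.__version__ instead.
for candidate in ("3.19.6", "3.20.3", "4.21.0"):
    print(candidate, protobuf_needs_downgrade(candidate))
```

If it reports True for your version, the pip command above should bring it back into the range the error message asks for.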

This solution comes from this page: https://forum.unity.com/threads/typeerror-descriptors-cannot-be-created-directly-error-when-running-mlagents-learn.1399114/ Thank you, hughperkins!

Upvotes: 0
