Reputation: 87
sorry if my question is too basic, but cannot solve it. I am experimenting with mlflow currently and facing the following issue:
Even if I have set the tracking_uri, the mlflow artifacts are saved to the ./mlruns/... folder relative to the path from where I run mlfow run path/to/train.py
(in command line). The mlflow server searches for the artifacts following the tracking_uri (mlflow server --default-artifact-root here/comes/the/same/tracking_uri
).
Through the following example it will be clear what I mean:
I set the following in the training script before the with mlflow.start_run() as run:
mlflow.set_tracking_uri("file:///home/@myUser/@SomeFolders/mlflow_artifact_store/mlruns/")
My expectation would be that mlflow saves all the artifacts to the place I gave in the registry uri. Instead, it saves the artifacts relative to place from where I run mlflow run path/to/train.py
, i.e. running the following
/home/@myUser/ mlflow run path/to/train.py
creates the structure:
/home/@myUser/mlruns/@experimentID/@runID/artifacts
/home/@myUser/mlruns/@experimentID/@runID/metrics
/home/@myUser/mlruns/@experimentID/@runID/params
/home/@myUser/mlruns/@experimentID/@runID/tags
and therefore it doesn't find the run artifacts in the tracking_uri, giving the error message:
Traceback (most recent call last):
File "train.py", line 59, in <module>
with mlflow.start_run() as run:
File "/home/@myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/tracking/fluent.py", line 204, in start_run
active_run_obj = client.get_run(existing_run_id)
File "/home/@myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/tracking/client.py", line 151, in get_run
return self._tracking_client.get_run(run_id)
File "/home/@myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/tracking/_tracking_service/client.py", line 57, in get_run
return self.store.get_run(run_id)
File "/home/@myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/store/tracking/file_store.py", line 524, in get_run
run_info = self._get_run_info(run_id)
File "/home/@myUser/miniconda3/envs/mlflow-ff56d6062d031d43990effc19450800e72b9830b/lib/python3.6/site-packages/mlflow/store/tracking/file_store.py", line 544, in _get_run_info
"Run '%s' not found" % run_uuid, databricks_pb2.RESOURCE_DOES_NOT_EXIST
mlflow.exceptions.MlflowException: Run '788563758ece40f283bfbf8ba80ceca8' not found
2021/07/23 16:54:16 ERROR mlflow.cli: === Run (ID '788563758ece40f283bfbf8ba80ceca8') failed ===
Why is that so? How can I change the place where the artifacts are stored, this directory structure is created? I have tried mlflow run --storage-dir here/comes/the/path
, setting the tracking_uri, registry_uri. If I run the /home/path/to/tracking/uri mlflow run path/to/train.py
it works, but I need to run the scripts remotely.
My endgoal would be to change the artifact uri to an NFS drive, but even in my local computer I cannot do the trick.
Thanks for reading it, even more thanks if you suggest a solution! :) Have a great day!
Upvotes: 4
Views: 4053
Reputation: 484
Here's my solution
mlflow server -h 0.0.0.0 -p 5000 --backend-store-uri postgresql://DB_USER:DB_PASSWORD@DB_ENDPOINT:5432/DB_NAME --default-artifact-root s3://S3_BUCKET_NAME
mlflow.set_tracking_uri("http://my-tracking-server:5000/")
Upvotes: 1
Reputation: 87
This issue was solved by the following:
I have mixed the tracking_uri with the backend_store_uri.
The tracking_uri is where the MLflow related data (e.g. tags, parameters, metrics, etc.) are saved, which can be a database. On the other hand, the artifact_location is where the artifacts (other, not MLflow related data belonging to the preprocessing/training/evaluation/etc. scripts). What led me to mistakes is that by running mlflow server from command line one should set up for the --backend-store-uri the tracking_uri (also in the script by setting the mlflow.set_tracking_uri()) and for --default-artifact-location the location of the artifacts. Somehow I didn't get that the tracking_uri = backend_store_uri.
Upvotes: 2