Reputation: 6781
Once an MLflow run is finished, external scripts can access its parameters and metrics using the Python mlflow client and the mlflow.get_run(run_id) method, but the Run object returned by get_run seems to be read-only.
Specifically, .log_param, .log_metric, or .log_artifact cannot be used on the object returned by get_run, raising errors like this:
AttributeError: 'Run' object has no attribute 'log_param'
If we attempt to call any of the .log_* methods on mlflow itself, they get logged to a new run with an auto-generated run ID in the Default experiment.
Example:
final_model_mlflow_run = mlflow.get_run(final_model_mlflow_run_id)
with mlflow.ActiveRun(run=final_model_mlflow_run) as myrun:
    # this read operation uses correct run
    run_id = myrun.info.run_id
    print(run_id)
    # this write operation writes to a new run
    # (with auto-generated random run ID)
    # in the "Default" experiment (with exp. ID of 0)
    mlflow.log_param("test3", "This is a test")
Note that the above problem exists regardless of the Run status (.info.status can be either "FINISHED" or "RUNNING"; it makes no difference).
I wonder if this read-only behavior is by design (given that immutable modeling runs improve experiment reproducibility)? I can appreciate that, but it also goes against code modularity if everything has to be done within a single monolith like the with mlflow.start_run() context...
Upvotes: 7
Views: 6244
Reputation: 21
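This question can be divided into two parts:
1. Can you modify a value that was already logged in a previous run?
For instance:
old: mlflow.log_metric("accuracy", 0.85)
new: mlflow.log_metric("accuracy", 0.90)
2. Can you add a new value to a previous run?
For instance:
old: mlflow.log_metric("accuracy", 0.85)
new: mlflow.log_metric("dropout", 0.05)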
In earlier versions of mlflow (v1.13 - v1.19) you can do both, that is, modify params in a previous run and add new params to a previous run. You need the run_id of that previous run to do so.
import mlflow

# point mlflow at the tracking server and experiment
mlflow.set_tracking_uri(track_uri)
mlflow.set_experiment(experiment_name)
# get run_id with run name
run_object = mlflow.search_runs(filter_string=f"attributes.run_name = '{run_name}'")
# search_runs returns a DataFrame by default, so take the first match as a plain string
run_id = run_object["run_id"].iloc[0]
# start previous run with run_id
with mlflow.start_run(run_id=run_id):
    # print(mlflow.active_run().info)
    # change an existing param (v1.13 - v1.19 only) and add a new one
    mlflow.log_param("accuracy", 0.90)
    mlflow.log_param("dropout", 0.05)
This code works for both cases if your mlflow version is between v1.13 and v1.19. mlflow.log_param("accuracy", 0.90) will fail if your version is more recent than v1.19. You will get the error "mlflow.exceptions.RestException: INVALID_PARAMETER_VALUE: Changing param values is not allowed." This GitHub issue explains why the feature was deprecated.
Adding a new param, e.g. mlflow.log_param("dropout", 0.05), to a previous run works in all versions.
Upvotes: 0
Reputation: 6781
As it was pointed out to me by Hans Bambel, and as it is documented here, mlflow.start_run (in contrast to mlflow.ActiveRun) accepts the run_id parameter of an existing run.
Here's an example tested to work in v1.13 through v1.19. As you can see, one can even overwrite an existing metric to correct a mistake:
with mlflow.start_run(run_id=final_model_mlflow_run_id):
    # print(mlflow.active_run().info)
    mlflow.log_param("start_run_test", "This is a test")
    mlflow.log_metric("start_run_test", 1.23)
    mlflow.log_metric("start_run_test", 1.33)
    mlflow.log_artifact("/home/jovyan/_tmp/formula-features-20201103.json", "start_run_test")
Upvotes: 7