Alex P

Reputation: 12487

How to configure YOLOv8 yaml file to access blob storage dataset on Azure?

Context

I want to train a custom YOLOv8 model. I've got it working on my local machine, but it is very slow, so I want to run the job on Azure Machine Learning Studio for efficiency. I am using the Azure ML SDK v2.

Issue

When I run on Azure ML, I get an error saying that YOLO cannot locate my training images.

Traceback (most recent call last):
  File "/opt/conda/envs/ptca/lib/python3.8/site-packages/ultralytics/yolo/engine/trainer.py", line 125, in __init__
    self.data = check_det_dataset(self.args.data)
  File "/opt/conda/envs/ptca/lib/python3.8/site-packages/ultralytics/yolo/data/utils.py", line 243, in check_det_dataset
    raise FileNotFoundError(msg)
FileNotFoundError:
Dataset 'custom.yaml' not found ⚠️, missing paths ['/mnt/azureml/cr/j/18bdc3371eca4975a0c4a7123f9adaec/exe/wd/valid/images']

Code / analysis

Here is the code I use to run the job:

command_job = command(
    display_name='Test Run 1',
    code="./src/",
    command="yolo detect train data=custom.yaml model=yolov8n.pt epochs=1 imgsz=1280 seed=42",
    environment="my-custom-env:3",
    compute=compute_target
)

On my local machine (using Visual Studio Code), the custom.yaml file is in the ./src/ directory. When I run the job above, custom.yaml is uploaded and appears in the Code section of the job (viewed in Azure ML Studio). From investigating, I believe this is the compute working directory, which has the path:

'/mnt/azureml/cr/j/18bdc3371eca4975a0c4a7123f9adaec/exe/wd/'

My custom.yaml looks like this:

path: ../
train: train/images
val: valid/images

nc: 1
names: ["bike"]

So what is happening is that YOLO reads my custom.yaml, resolves the path entry against the working directory, and then tries to find valid/images within that directory:

'/mnt/azureml/cr/j/18bdc3371eca4975a0c4a7123f9adaec/exe/wd/valid/images'

My images are in my Datastore, not in that directory, hence the error.
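The join behind that missing path can be sketched in a few lines. The real Ultralytics resolution logic is more involved, but in this situation it reduces to gluing the relative yaml entries onto the job's working directory:

```python
from pathlib import PurePosixPath

# Working directory of the Azure ML job (taken from the traceback above)
wd = PurePosixPath("/mnt/azureml/cr/j/18bdc3371eca4975a0c4a7123f9adaec/exe/wd")

# Relative entry from custom.yaml
val = "valid/images"

# The dataset check joins the entry onto the working directory and
# then requires the result to exist on the compute node's disk
candidate = wd / val
print(candidate)
# -> /mnt/azureml/cr/j/18bdc3371eca4975a0c4a7123f9adaec/exe/wd/valid/images
```

Since nothing was mounted or downloaded to that location, the existence check fails.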

What I have tried - updating custom.yaml path

All my data (train and valid) is stored in Azure Blob Storage. In Azure ML Studio I have created a Datastore (which references my Azure Blob Storage account) and added my data as a Dataset. My file structure is:

Dataset/
   - Train/
        - Images
        - Labels
   - Valid/
        - Images
        - Labels

Within my custom.yaml file I have tried replacing path with the following:

Storage URI: https://mystorageaccount.blob.core.windows.net/my-datasets
Datastore URI: azureml://subscriptions/XXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/resourcegroups/my-rg/workspaces/my_workspage/datastores/my_datastore/paths/Dataset/

With either value I get the same error; this time the value is appended to the end of the working directory. Example:

    '/mnt/azureml/cr/j/18bdc3371eca4975a0c4a7123f9adaec/exe/wd/https://mystorageaccount.blob.core.windows.net/my-datasets/valid/images'

What I have tried - mounting / downloading the dataset

I've read the Microsoft docs (e.g. here and here), and they say things like:

For most scenarios, you'll use URIs (uri_folder and uri_file) - a location in storage that can be easily mapped to the filesystem of a compute node in a job by either mounting or downloading the storage to the node.

It feels like I should be mapping my data (in my Datastore) to the compute filesystem. Then I could use that path in my custom.yaml. The documents are not clear on how to do that.

In brief: how do I set up my data on Azure ML so that the path in my custom.yaml points to the data?

Upvotes: 5

Views: 2849

Answers (1)

ouphi

Reputation: 272

A solution is to create a folder data asset with a path of the form azureml://datastores/<data_store_name>/paths/<dataset-path> and pass it as an input to your AzureML job. AzureML resolves the path of uri_folder inputs at runtime, so custom.yaml can be updated programmatically to contain the resolved path.

Here is an example of an AzureML job implementing this solution:

from azure.ai.ml import command
from azure.ai.ml import Input

command_job = command(
    inputs=dict(
        data=Input(
            type="uri_folder",
            path="azureml:your-data-asset:version-number",
        )
    ),
    command="""
    echo "The data asset path is ${{ inputs.data }}" &&
    # Update custom.yaml to contain the correct path
    sed -i "s|path:.*$|path: ${{ inputs.data }}|" custom.yaml &&
    # Now custom.yaml contains the correct path so we can run the training
    yolo detect train data=custom.yaml model=yolov8n.pt epochs=1 imgsz=1280 seed=42 project=your-experiment name=experiment
    """,
    code="./src/",
    environment="your-environment",
    compute="your-compute-target",
    experiment_name="your-experiment",
    display_name="your-display-name",
)
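If you prefer to do the yaml rewrite in Python rather than sed (for example in a small entry script that the job command calls), the substitution is the same idea. This is a sketch; set_dataset_path is a hypothetical helper, not part of any SDK:

```python
import re

def set_dataset_path(yaml_path: str, data_root: str) -> None:
    """Rewrite the `path:` entry of a YOLO dataset yaml in place
    (a Python equivalent of the sed one-liner above)."""
    with open(yaml_path) as f:
        text = f.read()
    # Replace only the line that starts with `path:`; the train/val
    # entries stay relative to it. A lambda avoids backslashes in
    # data_root being treated as regex escapes.
    text = re.sub(r"^path:.*$", lambda m: f"path: {data_root}",
                  text, flags=re.MULTILINE)
    with open(yaml_path, "w") as f:
        f.write(text)
```

The job command would then pass the resolved input in, e.g. `python update_yaml.py ${{ inputs.data }}` with `set_dataset_path("custom.yaml", sys.argv[1])` inside the script.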

Note that you need recent mlflow and azureml-mlflow libraries installed to make sure your model, parameters and metrics are logged with mlflow:

ultralytics==8.0.133
azureml-mlflow==1.52.0
mlflow==2.4.2

Edit: Note that I published tutorials explaining all the steps to run a YOLOv8 training with AzureML.

In the blog post I create the AzureML data asset from a local folder. In your case the dataset is already stored in a datastore, so when you create the data asset you need to specify a path of the form azureml://datastores/<data_store_name>/paths/<dataset-path> instead of a local path.
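For reference, this is the shape of the datastore path such a data asset needs; datastore_name and folder_in_datastore are placeholders matching the question's setup:

```python
# Placeholders -- substitute your own datastore and folder names
datastore_name = "my_datastore"
folder_in_datastore = "Dataset"

# Path format for a data asset that points at existing datastore contents
# rather than uploading a local folder
asset_path = f"azureml://datastores/{datastore_name}/paths/{folder_in_datastore}/"
print(asset_path)  # -> azureml://datastores/my_datastore/paths/Dataset/

# In SDK v2 this string is then used to register the asset, roughly:
#   from azure.ai.ml.entities import Data
#   ml_client.data.create_or_update(
#       Data(name="bike-dataset", type="uri_folder", path=asset_path))
```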

Upvotes: 2
