Reputation: 10383
I created a file dataset from a data lake folder in Azure ML Studio. At the moment I'm able to download the data from the dataset to the compute instance with this code:
from azureml.core import Workspace, Dataset

subscription_id = 'xxx'
resource_group = 'luisdatapipelinetest'
workspace_name = 'ml-pipelines'

workspace = Workspace(subscription_id, resource_group, workspace_name)
dataset = Dataset.get_by_name(workspace, name='files_test')

path = "/mnt/batch/tasks/shared/LS_root/mounts/clusters/demo1231/code/Users/luis.rramirez/test/"
dataset.download(target_path=path, overwrite=True)
With that I'm able to access the files from the notebook. But copying the data from the data lake to the compute instance every time is inefficient. How can I mount the data lake directory on the VM instead of downloading a copy of the data each time?
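For reference, this is the direction I'm exploring: azureml-core's FileDataset has a mount() method that returns a MountContext, which should expose the files without copying them to local disk. A minimal sketch of what I think it would look like (untested on my side; assumes the registered dataset is a FileDataset and the compute instance supports FUSE mounts):

```python
from azureml.core import Workspace, Dataset

# Same workspace/dataset lookup as in the download version above
workspace = Workspace(subscription_id='xxx',
                      resource_group='luisdatapipelinetest',
                      workspace_name='ml-pipelines')
dataset = Dataset.get_by_name(workspace, name='files_test')

# mount() returns a MountContext; after start(), the files appear
# under mount_context.mount_point instead of being copied locally
mount_context = dataset.mount()
mount_context.start()

# ... read files under mount_context.mount_point from the notebook ...

mount_context.stop()  # unmount when done
```

Is this the right approach, or is there a recommended way to mount the underlying data lake folder directly on the compute instance?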
Upvotes: 2
Views: 1165