nrcjea001

Reputation: 1057

How to unzip and open pickle file in Azure Jupyter Notebook

I have a dataset saved as a zip file in Azure ML Studio. Inside the zip file is a .pickle file. I am using Jupyter (Python 3.5) in Azure's notebook service.

I would like to open the saved zip file from Azure ML Studio in my Jupyter notebook and load the .pickle file inside it. Any ideas on how to do that? My code is as follows, with the error it raises:

from azureml import Workspace
from six.moves import cPickle as pick
from six.moves import range

ws = Workspace(workspace_id = '...', authorization_token='...')

with ws.datasets['xxx.zip'].open() as zf:
    with open(zf, 'rb') as p:
        pload = pick.load(p)
        train_dataset = pload['train_dataset']
        del pload
print(train_dataset.shape)

---> 14 with open(zf, 'rb') as p:

TypeError: invalid file: requests.packages.urllib3.response.HTTPResponse object at 0x7fe739589ef0

Upvotes: 0

Views: 2969

Answers (2)

rudolph daniel

Reputation: 21

The standard shell unzip command can extract a zip file stored in DBFS, but the extracted files end up in the driver's default working directory, file:/databricks/driver/, rather than in DBFS.

Create a folder in the DBFS root:

dbutils.fs.mkdirs("/test_unzipping/")
dbutils.fs.ls("/test_unzipping/")

Copy the file from ADLS to DBFS:

dbutils.fs.cp("/mnt/file_uploads/site_metric_test.zip", "/test_unzipping/")
//dbutils.fs.ls("/test_unzipping/")

Locations to check for the file after unzipping (optional listings, commented out):

//dbutils.fs.ls("file:/databricks/driver")
//dbutils.fs.ls("file:/test_unzipping/")

Unzip the file using the %sh magic command in the notebook:

%sh
ls /dbfs/test_unzipping/site_metric_test.zip
unzip /dbfs/test_unzipping/site_metric_test.zip

Once the above is done, the extracted file is available in the default folder file:/databricks/driver/:

dbutils.fs.ls("file:/databricks/driver/site_metric_test.csv")
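
If you would rather choose where the extracted files go instead of picking them up from the driver directory, Python's standard zipfile module may be an alternative to the shell command. This is only a minimal sketch: unzip_to is a hypothetical helper name, and the commented usage assumes DBFS is visible on the local filesystem under /dbfs/ (as the ls command above suggests), which I have not verified for every cluster setup.

```python
import os
import zipfile


def unzip_to(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return the member names."""
    # Create the destination folder if it does not exist yet.
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest_dir)
        return zf.namelist()


# Hypothetical usage, assuming DBFS is mounted at /dbfs on the driver:
# unzip_to("/dbfs/test_unzipping/site_metric_test.zip", "/dbfs/test_unzipping/")
```

The function returns the names of the extracted members, so you can confirm that the expected file is there without a separate listing step.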

Upvotes: 1

nextofsearch

Reputation: 406

I don't use Azure ML Studio, so this might not be what you want, but it can serve as a workaround. You can upload a data file to the Azure notebook using the 'Data >> Upload' menu. Note that the file is stored in the parent directory (one level up), not in the current working directory. So you can unzip it with the following command:

!unzip -o ../data.zip

The file will be unzipped into your working directory; you can check it using the file browser. Hope it helps.
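
If you want to skip the unzip step, the pickle can also be read straight out of the zip archive in Python. The TypeError in the question comes from passing a response object to open(), which expects a file path. The sketch below takes the raw bytes of the zip instead; load_pickle_from_zip is a hypothetical helper name, and the commented usage assumes the object returned by ws.datasets[...].open() has a read() method returning the whole file, which I have not tested against Azure ML Studio. Only unpickle data you trust.

```python
import io
import pickle
import zipfile


def load_pickle_from_zip(zip_bytes, member=None):
    """Unpickle one member of a zip archive held in memory as bytes.

    If member is None, the first entry whose name ends in .pickle is used.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        if member is None:
            member = next(n for n in zf.namelist() if n.endswith(".pickle"))
        # Read the member's bytes and unpickle them in one step.
        return pickle.loads(zf.read(member))


# Hypothetical usage with the workspace object from the question (untested):
# with ws.datasets['xxx.zip'].open() as resp:
#     pload = load_pickle_from_zip(resp.read())
# train_dataset = pload['train_dataset']
```

Wrapping the bytes in io.BytesIO gives zipfile the seekable file-like object it needs, so nothing has to be written to disk first.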

Upvotes: 0
