Reputation: 30045
I am working on .tif files stored in Azure Data Lake Gen2 and want to open these files using rasterio from Azure Databricks.
Example:
Reading the image file from Data Lake with spark.read.format("image").load(filepath) works fine.
But trying to open the same file with
with rasterio.open(filepath) as src:
    print(src.profile)
gives this error:
RasterioIOError: wasbs://xxxxx.blob.core.windows.net/xxxx_2016/xxxx_2016.tif: No such file or directory
Any clues as to what I am doing wrong?
Update:
As suggested by Axel R, I mounted the files on the Databricks file system, but I still get the same error: rasterio cannot open the file, although I can read it as a DataFrame.
I also tried creating a shared access signature for the file in Data Lake and accessing the file through its URI. Now I get the error below:
CURL error: error setting certificate verify locations: CAfile: /etc/pki/tls/certs/ca-bundle.crt CApath: none
To test further, I tried opening a sample file from the web, located at
filepath = 'http://landsat-pds.s3.amazonaws.com/c1/L8/042/034/LC08_L1TP_042034_20170616_20170629_01_T1/LC08_L1TP_042034_20170616_20170629_01_T1_B4.TIF'
This works fine.
Upvotes: 2
Views: 1172
Reputation: 1300
I believe this is because rasterio uses the local file APIs and can only read from a path that starts with /dbfs/.
Is it possible for you to mount the blob storage? That would allow you to access it with rasterio using a path starting with /dbfs/mnt/
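A minimal sketch of that idea, assuming the container is already mounted (for example under a hypothetical mount point /mnt/rasters) and that rasterio is installed on the cluster. The helper names to_local_path and print_profile are made up for illustration. The point is that Spark accepts dbfs:/mnt/... or /mnt/..., while a library using local file APIs needs the same file spelled as /dbfs/mnt/...:

```python
import os


def to_local_path(path: str) -> str:
    """Map a DBFS-style path to the /dbfs path that local file APIs can see.

    Hypothetical helper: it only rewrites the prefix and does not mount anything.
    """
    if path.startswith("/dbfs/"):
        return path  # already a local-style path
    if path.startswith("dbfs:/"):
        path = path[len("dbfs:"):]  # "dbfs:/mnt/x" -> "/mnt/x"
    if not path.startswith("/"):
        # e.g. wasbs:// or abfss:// URIs, which local file APIs cannot open
        raise ValueError(f"not a DBFS path: {path!r}")
    return "/dbfs" + path


def print_profile(path: str) -> None:
    """Open a mounted raster with rasterio and print its profile."""
    import rasterio  # imported here so the path helper works without rasterio

    local_path = to_local_path(path)
    if not os.path.exists(local_path):
        raise FileNotFoundError(local_path)
    with rasterio.open(local_path) as src:
        print(src.profile)
```

On a cluster this would be called as print_profile("dbfs:/mnt/rasters/xxxx_2016.tif"), where the mount point and file name are placeholders. If os.path.exists returns False for the /dbfs/... path, the mount itself is the likely problem rather than rasterio.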
Upvotes: 1