Reputation: 107
How would I go about creating and returning a file new_zarr.zarr from an xarray Dataset?
I know xarray.Dataset.to_zarr() exists, but it returns a ZarrStore and I must return a bytes-like object.
I have tried using the tempfile module but am unsure how to proceed. How would I write an xarray.Dataset to a bytes-like object that represents a .zarr file that can be downloaded?
Upvotes: 0
Views: 1174
Reputation: 6444
Zarr supports multiple storage backends (DirectoryStore, ZipStore, etc.). If you are looking for a single file object, it sounds like the ZipStore is what you want.
import xarray as xr
import zarr
ds = xr.tutorial.open_dataset('air_temperature')
store = zarr.storage.ZipStore('./new_zarr.zip')
ds.to_zarr(store)
store.close()  # finalizes the archive; a ZipStore must be closed after writing
The zip file can be thought of as a single-file Zarr store, so it can be downloaded (or moved around) as one unit.
If you want to do this all in memory, you could extend zarr.ZipStore to allow passing in a BytesIO object:
import io
import os
import zipfile
from threading import RLock

import zarr

class MyZipStore(zarr.ZipStore):
    def __init__(self, path, compression=zipfile.ZIP_STORED, allowZip64=True, mode='a',
                 dimension_separator=None):
        # store properties
        if isinstance(path, str):  # this is the only change needed to make this work
            path = os.path.abspath(path)
        self.path = path
        self.compression = compression
        self.allowZip64 = allowZip64
        self.mode = mode
        self._dimension_separator = dimension_separator
        # Current understanding is that zipfile module in stdlib is not thread-safe,
        # and so locking is required for both read and write. However, this has not
        # been investigated in detail, perhaps no lock is needed if mode='r'.
        self.mutex = RLock()
        # open zip file (path may be a filesystem path or a file-like object)
        self.zf = zipfile.ZipFile(path, mode=mode, compression=compression,
                                  allowZip64=allowZip64)
Then you can create the zip file in memory:
zip_buffer = io.BytesIO()
store = MyZipStore(zip_buffer)
ds.to_zarr(store)
store.close()  # writes the zip central directory into the buffer
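You'll notice that the zip_buffer contains a valid zip file:
zip_buffer.seek(0)  # rewind first; the buffer position may not be at the start
zip_buffer.read(10)
b'PK\x03\x04\x14\x00\x00\x00\x00\x00'
(PK\x03\x04 is the Zip file magic number.) To get the whole archive as a bytes object, use zip_buffer.getvalue().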
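The in-memory approach relies on the fact that the standard library's zipfile.ZipFile accepts a file-like object as well as a path. Here is a minimal stdlib-only sketch of that mechanism, with no zarr or xarray involved. The member name and its contents are placeholders standing in for whatever a real store would write; the point is only that the archive is complete after close() and that getvalue() hands back plain bytes.

```python
import io
import zipfile

# Stand-in for what a zip-backed store does internally: open a ZipFile on a
# file-like object instead of a filesystem path.
zip_buffer = io.BytesIO()
zf = zipfile.ZipFile(zip_buffer, mode="w", compression=zipfile.ZIP_STORED)

# Placeholder member; a real store would write its own metadata and chunks.
zf.writestr(".zgroup", '{"zarr_format": 2}')

# The central directory is only written on close, so the archive is
# incomplete until this call.
zf.close()

# getvalue() returns the entire buffer regardless of the current position,
# so no seek(0) is needed here.
payload = zip_buffer.getvalue()

# Re-open the bytes to confirm they form a readable archive.
with zipfile.ZipFile(io.BytesIO(payload)) as check:
    names = check.namelist()
    content = check.read(".zgroup")
```

In a download handler, payload is the bytes-like object you would return, typically served with a .zip filename.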
Upvotes: 3