sobek

Reputation: 1426

How can one write-lock a zarr store during append?

Is there some way to lock a zarr store when using append?

I have already found out the hard way that appending from multiple processes is a bad idea (the batches being appended aren't aligned with the store's chunk size). The reason I'd like to use multiple processes is that I need to transform the original arrays before appending them to the zarr store. It would be nice to block other processes from writing concurrently while still performing the transformations in parallel, then append their data in series, roughly as in the sketch below.
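To make the intent concrete, here is a rough sketch of the pattern I'm after (untested; transform_batch, the store path, shapes and dtype are placeholders for my real code, and it assumes zarr-python 2.x):

import concurrent.futures
import numpy as np
import zarr

def transform_batch(batch):
    # stand-in for the expensive per-batch transformation
    return batch * 2

def main():
    # appendable array, opened/created in the main process only
    arr = zarr.open('data.zarr', mode='a', shape=(0, 16),
                    chunks=(1024, 16), dtype='f8')
    batches = [np.random.rand(100, 16) for _ in range(8)]
    with concurrent.futures.ProcessPoolExecutor() as pool:
        # workers only transform; the main process is the sole writer
        for result in pool.map(transform_batch, batches):
            arr.append(result, axis=0)

if __name__ == '__main__':
    main()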

Edit:

Thanks to jdehesa's suggestion, I became aware of the synchronization section of the documentation. I passed a ProcessSynchronizer pointing to a folder on disk to my array at creation time in the main process, then spawned a bunch of worker processes with concurrent.futures and passed the array to all of them so they could append their results (roughly as sketched below). I could see that the ProcessSynchronizer did something, because the folder I pointed it to filled with files, but the array my workers wrote to still ended up missing rows (compared to when it is written from a single process).
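For reference, the synchronizer was wired up roughly like this (a sketch; paths, shapes and dtype are placeholders, and it assumes zarr-python 2.x):

import zarr

# the ProcessSynchronizer keeps its file locks in the given folder
synchronizer = zarr.ProcessSynchronizer('data.sync')

# array created once in the main process, then handed to the workers
arr = zarr.open('data.zarr', mode='a', shape=(0, 16), chunks=(1024, 16),
                dtype='f8', synchronizer=synchronizer)

# inside each worker process:
#     arr.append(transformed_batch, axis=0)
# the data.sync folder fills with files, yet rows still go missing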

Upvotes: 1

Views: 509

Answers (1)

Jason Adhinarta

Reputation: 108

I ran into the same problem.

Copying my response from https://github.com/zarr-developers/zarr-python/issues/2077

The problem seems to be that the cached metadata is not updated after the shape is resized in another thread/process, leading to dropped rows.

I found two workarounds:

  • defining a custom append function that forces the metadata to be reloaded (usage sketched after this list)

def fixed_append(arr, data, axis=0):
    # reload the array's metadata before appending so the current shape
    # (possibly resized by another process) is seen
    def fixed_append_nosync(data, axis=0):
        arr._load_metadata_nosync()
        return arr._append_nosync(data, axis=axis)
    return arr._write_op(fixed_append_nosync, data, axis=axis)
  • specifying cache_metadata=False when opening the array, to force a metadata reload on every data access
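A rough usage sketch of both workarounds (untested; the store path and data are placeholders, and fixed_append relies on private zarr v2 internals that may change between versions):

import numpy as np
import zarr

new_rows = np.random.rand(100, 16)  # stand-in for a worker's transformed batch

# Workaround 1: reload metadata explicitly before appending
arr = zarr.open('data.zarr', mode='a')
fixed_append(arr, new_rows, axis=0)

# Workaround 2: disable metadata caching so the shape is re-read on every access
arr = zarr.open('data.zarr', mode='a', cache_metadata=False)
arr.append(new_rows, axis=0)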

Upvotes: 1
