Reputation: 21
I have a large parquet file that I currently scatter to my workers. The file rarely changes. Can I instead copy it to all my workers once and reference it there? Ideally I would copy the parquet file to every worker and then somehow get a future for it.
Upvotes: 1
Views: 107
Reputation: 28684
Certainly, you could copy your data file to every worker, or otherwise put it in a location that all workers can see (NFS, or cloud storage like S3). If you choose to copy it to every worker's local storage, then all you need to do is make sure it has the same path everywhere (including your client machine), and then you can use the standard dd.read_parquet. If it is in different locations on different machines, you'll have to write a custom function to read it.
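For the different-locations case, a custom reader could look roughly like the sketch below. It is only an illustration: the paths in CANDIDATE_PATHS and the helper names find_local_copy, load_local_parquet and submit_load are made up for this example, and it assumes the file fits in one worker's memory as a pandas DataFrame. The function runs on a worker, picks whichever copy exists there, and client.submit gives you a future for the result.

```python
import os

# Hypothetical locations: each worker may hold the file at a different one.
CANDIDATE_PATHS = ["/data/big.parquet", "/mnt/local/big.parquet"]


def find_local_copy(candidates):
    """Return the first candidate path that exists on this machine."""
    for path in candidates:
        if os.path.exists(path):
            return path
    raise FileNotFoundError(f"no local copy found among {candidates!r}")


def load_local_parquet(candidates=CANDIDATE_PATHS):
    """Runs on a worker: read whichever local copy that worker has."""
    import pandas as pd  # imported here so it is resolved on the worker

    return pd.read_parquet(find_local_copy(candidates))


def submit_load(client, candidates=CANDIDATE_PATHS):
    """Return a future for the DataFrame, loaded on one worker.

    `client` is assumed to be a dask.distributed.Client.
    """
    return client.submit(load_local_parquet, candidates)
```

With a connected client, future = submit_load(client) returns immediately and future.result() would bring the DataFrame back to the client. If the path really is identical everywhere, none of this is needed and dd.read_parquet with that path is the simpler route.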
Upvotes: 2