newtothis

Reputation: 525

How to work with large .jld data files in Julia

I have some files in the .jld Julia file format containing multidimensional arrays. Together the files take up around 60 GB on my drive. I want to concatenate some of them with hcat() and then do further calculations and plots from the combined data.

However, just reading these files either takes very long or fails with an "Out of Memory" error, so I'm not sure how to work with them. My machine has 8 GB of RAM, and I am loading the data from an external HDD (I also generated the data from simulations and wrote it directly to this HDD, and there were no errors at that point).

How do I deal with files this large?

Upvotes: 4

Views: 432

Answers (1)

Przemyslaw Szufel

Reputation: 42234

In order to work with data larger than your memory, you need disk-based data structures. Julia supports this via memory mapping with mmap (see https://docs.julialang.org/en/v1/stdlib/Mmap/).
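For example, a minimal mmap sketch (the file name "c:\\temp\\file.dat" and the 100×100×100 dimensions are just placeholders):

using Mmap

io = open("c:\\temp\\file.dat", "w+")
A = Mmap.mmap(io, Array{Float64,3}, (100, 100, 100))  # array backed by the file, not RAM
A[1, 1, 1] = 1.0       # writes go to the memory-mapped file
Mmap.sync!(A)          # flush changes to disk
close(io)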

Fortunately, a higher-level interface is also available via SharedArrays (see https://docs.julialang.org/en/v1/stdlib/SharedArrays/).

Hence, you can do:

using SharedArrays

# Creates (or maps) a 100×100×100 Float64 array backed by the file on disk
a = SharedArray{Float64}("c:\\temp\\file.dat", (100, 100, 100))

Now you have a disk-backed array. You can copy your JLD data into it and perform the aggregation.
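A minimal sketch of that copy step, assuming each .jld file stores one array under the name "data" and that the pieces are concatenated along the second dimension, as hcat() would do (the file names, variable name, and sizes below are placeholders):

using JLD, SharedArrays

files = ["run1.jld", "run2.jld"]   # placeholder input files
n = 100                            # placeholder width of one piece along dim 2

# disk-backed target large enough to hold all pieces side by side
out = SharedArray{Float64}("c:\\temp\\combined.dat", (100, n * length(files), 100))

for (i, f) in enumerate(files)
    piece = load(f, "data")                       # read one array at a time, not all 60 GB at once
    out[:, (i - 1) * n + 1 : i * n, :] = piece    # same layout hcat() would produce
end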

Upvotes: 3
