Mehrab Shahriar

Reputation: 33

How to avoid reading a very large Array in Matlab multiple times?

I have a large array/matrix with 5899091 rows and 11 columns, stored in a text file. Using MATLAB's dlmread() I read it every time I need it, but that takes a long time (more than a minute), and I need to read the file again and again. I am stuck in this situation. My questions are:

1) Is there any way to read the file just once and keep it in some kind of global/persistent matrix?

2) Is there a more efficient way to read a text file and convert it into a matrix?

Thanks in advance.

Upvotes: 1

Views: 877

Answers (4)

yuk

Reputation: 19870

You can read the file once and save it to a MATLAB MAT-file. Then you can access the saved variables, fully or partially (essentially like any variable in the MATLAB workspace), directly from the file using matfile. I have answered a similar question about it here. Please have a look.
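A minimal sketch of that idea (the file name bigdata.txt and the variable name A are hypothetical):

A = dlmread('bigdata.txt');          % slow text read, done only once
save('bigdata.mat', 'A', '-v7.3');   % -v7.3 enables partial access later

m = matfile('bigdata.mat');          % no data is loaded yet
chunk = m.A(1:1000, :);              % reads only these rows from disk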

Upvotes: 0

reve_etrange

Reputation: 2571

The best option is almost certainly to simply read the file once in a script or control function and then pass it as a variable to any subsequent functions which require that data. This is just as much work as adding the global declarations and is cleaner, more maintainable and more flexible.
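A minimal sketch of that pattern (the file name and the two analysis functions are hypothetical):

% Read the file once at the top level...
data = dlmread('bigdata.txt');

% ...then hand the matrix to whatever needs it.
resultA = analyzeColumns(data);
resultB = summarizeRows(data);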

You can also save the variable to a MAT-file. With 5899091 x 11 elements stored as doubles, that is about 0.5 GB (roughly 64.9 million values at 8 bytes each). The MAT format is efficient, but the major benefit comes from storing your numbers as binary instead of text: at 5 or 8 significant digits, the same values in ASCII take roughly 0.62 or 0.93 GB respectively.

If for some reason you really don't want to pass the data as a variable, I would recommend nested functions over global variables:

function aResult = aFunction(var)
    % The expensive read happens once, in the enclosing function.
    data = dlmread(...);
    aResult = bFunction(var);

    function bResult = bFunction(bVar)
        % Nested functions share the enclosing workspace,
        % so 'data' is visible here without being passed in.
        bResult = cFunction(data, bVar);
    end
end

Of course, at this point you are still wrapping the business functions in something, but the scoping rules help: the nested function sees variables defined in the enclosing function's workspace without them being passed explicitly.

Now, if the real problem is just the size of this file - that is, it's too big for memory and you are using range arguments to dlmread to access the file in chunks - then you should probably take the time to design a format for use with memmapfile. This Wikipedia page explains the potential benefits.
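As a rough sketch of what that could look like (the matrix size is taken from the question; the file names are assumptions): memmapfile works on binary files, so the text data has to be converted once first.

% One-time conversion: text -> flat binary file of doubles
A = dlmread('bigdata.txt');
fid = fopen('bigdata.bin', 'w');
fwrite(fid, A, 'double');        % written column-major, as MATLAB stores it
fclose(fid);

% Map the file; nothing is read until the data is indexed
m = memmapfile('bigdata.bin', ...
               'Format', {'double', [5899091 11], 'A'});
chunk = m.Data.A(1:1000, :);     % touches only the pages behind these rows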

Then there is the brute force solution.

Upvotes: 1

High Performance Mark

Reputation: 78316

You might get the performance you want from a memory-mapped file. Investigate the MATLAB function memmapfile. It's not something I use much, so I won't offer any further advice, which would likely be wrong.

Upvotes: 1

Mikhail

Reputation: 8028

  1. You want to use global variables. Declare the global at the top of each function that needs it, and the variable will be shared by all functions in which it is declared (see the sketch after this list): http://www.mit.edu/people/abbe/matlab/globals.html
  2. Use a .mat file. It will be faster to load. Also, if the matrix is easy to create (e.g. a large identity matrix from eye), it may be quicker to generate it on the fly. Lastly, if your matrix is sparse, use the sparse matrix operations.
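A minimal sketch of the global-variable approach from point 1 (the file and function names are hypothetical):

function loadData()
    global BIGDATA                     % declared here and wherever it is used
    BIGDATA = dlmread('bigdata.txt');  % expensive read happens only once
end

function avg = columnMeans()
    global BIGDATA                     % same declaration shares the variable
    avg = mean(BIGDATA, 1);
end

Call loadData once; any later function that declares global BIGDATA then sees the already-loaded matrix.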

Upvotes: 0
