littleO

Reputation: 1002

Keeping huge matrix in memory across multiple runs of a C++ program

I'm writing some C++ code (using the Eigen3 matrix library) to solve a convex optimization problem involving a huge sparse matrix. It takes a minute or so to read in the matrix from a file, and I don't want to have to read in the matrix from a file every single time I run my program. (I'm going to be tuning the parameters in my optimization algorithm, which involves running my code many times in a row, and I don't want to have to wait one minute to read in the big matrix each time.)

Is there a way that I can keep this big matrix in memory while I change some parameters in my code then recompile my code and run it again?

This kind of thing is easy in Matlab, but I don't know how it's handled in C++ (although this is a common situation so there must be a standard approach that people take).

Upvotes: 0

Views: 286

Answers (3)

YePhIcK

Reputation: 5866

Your case is a perfect example of why mmap() exists :)

mmap() (a POSIX call; Windows offers the equivalent through CreateFileMapping/MapViewOfFile) lets you treat a file on disk as regular RAM, with "direct" random read/write access and OS-backed paging (much like what happens to your memory when the OS's memory manager swaps it out).
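To make that concrete, here is a minimal sketch of reading a binary file of doubles through a read-only mapping. It assumes a POSIX system; write_doubles and sum_mapped_doubles are names made up for this illustration, not part of any library.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: write `values` to `path` as raw doubles.
bool write_doubles(const char* path, const std::vector<double>& values) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    const std::size_t n =
        std::fwrite(values.data(), sizeof(double), values.size(), f);
    return std::fclose(f) == 0 && n == values.size();
}

// Hypothetical helper: map the file read-only and sum its contents directly
// from the mapping, without copying it into a buffer first.
bool sum_mapped_doubles(const char* path, double& sum) {
    const int fd = open(path, O_RDONLY);
    if (fd < 0) return false;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return false;
    }
    const std::size_t bytes = static_cast<std::size_t>(st.st_size);

    void* addr = mmap(nullptr, bytes, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (addr == MAP_FAILED) return false;

    const double* data = static_cast<const double*>(addr);
    sum = 0.0;
    for (std::size_t i = 0; i < bytes / sizeof(double); ++i) sum += data[i];

    munmap(addr, bytes);
    return true;
}
```

Pages are only read from disk when they are first touched, and on a second run they will usually still be in the OS page cache, which is where the speed-up over re-parsing a text file comes from.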

Is there a way that I can keep this big matrix in memory while I change some parameters in my code then recompile my code and run it again?

Well, yes... But I have a feeling its implementation would be way outside the scope of your project. In essence this is what you'd do:

  1. Create a "loader" that would load the data into memory and make that memory "shared" (available to other processes)
  2. Launch your code, providing it with that memory's handle (or address, depending on your platform) so it can request access to it
  3. When done, your code will quit, detaching from that shared memory, which is still going to be held by the loader process for the next launch of your code
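The three steps above can be sketched with POSIX shared memory. This is only one way to do it and it assumes a POSIX system; publish_shared and attach_shared are hypothetical names, and the "handle" here is simply the object's name (for example "/my_matrix"). One difference from the description above: a POSIX shared-memory object persists until shm_unlink() is called or the machine reboots, so in this variant the loader does not have to stay running.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstddef>
#include <cstring>
#include <vector>

// Loader side (hypothetical helper): create a named shared-memory object
// and copy the already-parsed data into it.
bool publish_shared(const char* name, const std::vector<double>& data) {
    const std::size_t bytes = data.size() * sizeof(double);
    const int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd < 0) return false;
    if (ftruncate(fd, static_cast<off_t>(bytes)) != 0) {
        close(fd);
        return false;
    }
    void* addr = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (addr == MAP_FAILED) return false;
    std::memcpy(addr, data.data(), bytes);
    munmap(addr, bytes);  // the object itself outlives this mapping
    return true;
}

// Worker side (hypothetical helper): attach read-only to the named object.
// Returns nullptr on failure; on success `count` is the number of doubles
// that fit in the object (some systems round the size up to a page).
const double* attach_shared(const char* name, std::size_t& count) {
    const int fd = shm_open(name, O_RDONLY, 0);
    if (fd < 0) return nullptr;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return nullptr;
    }
    const std::size_t bytes = static_cast<std::size_t>(st.st_size);
    void* addr = mmap(nullptr, bytes, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (addr == MAP_FAILED) return nullptr;
    count = bytes / sizeof(double);
    return static_cast<const double*>(addr);
}
```

The loader program would call publish_shared once; each run of the recompiled optimizer calls attach_shared with the same name. On older glibc versions this needs linking with -lrt.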

Upvotes: 3

Daniel Jour

Reputation: 16156

Is there a way that I can keep this big matrix in memory while I change some parameters in my code then recompile my code and run it again?

AFAIK, no operating system supports keeping a process's memory around while it is not running and then "rerunning" the process against it.

You could try to:

  • improve the reading code for the matrix (or the representation it is stored in, as suggested by chtz).
  • keep the matrix loaded by a helper process, and use inter-process communication to work with it from the process containing your "main code" (which can then be (re)started and stopped at will).
  • try to implement some sort of "hot-swappable module" / hot code reloading.

But most of these, though fun, will be extremely complex to implement.

I'm going to be tuning the parameters in my optimization algorithm, which involves running my code many times in a row, and I don't want to have to wait one minute to read in the big matrix each time.

How about getting those parameters from user input instead of hard coding them? That would allow you to specify the parameters, run your code, read in another set of parameters, do another run, ... without having to recompile your program or stop and restart the process.
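A minimal sketch of that idea, assuming the parameters are a step size and an iteration count given one pair per line. load_matrix_once and run_solver are hypothetical stand-ins for your real loading code and optimizer.

```cpp
#include <iostream>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the expensive one-minute load.
std::vector<double> load_matrix_once() { return {1.0, 2.0, 3.0}; }

// Hypothetical stand-in for one optimization run.
double run_solver(const std::vector<double>& data, double step, int iters) {
    double x = 0.0;
    for (int k = 0; k < iters; ++k)
        for (double v : data) x += step * v;
    return x;
}

// Load once, then read "<step> <iters>" lines until end of input and run
// the solver for each well-formed line. Returns the number of runs done.
int parameter_loop(std::istream& in, std::ostream& out) {
    const std::vector<double> data = load_matrix_once();  // paid only once
    int runs = 0;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        double step = 0.0;
        int iters = 0;
        if (!(fields >> step >> iters)) {
            out << "expected: <step> <iters>\n";
            continue;  // skip malformed input, keep the matrix loaded
        }
        out << "result = " << run_solver(data, step, iters) << '\n';
        ++runs;
    }
    return runs;
}
```

From main you would call parameter_loop(std::cin, std::cout). This only helps while you are tuning parameters; a change to the algorithm itself still needs a recompile and a reload.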

Upvotes: 3

chtz

Reputation: 18827

You can dump the data of your matrix in binary form -- just dump everything pointed to from S.outerIndexPtr(), S.innerIndexPtr(), S.valuePtr() (perhaps write all sizes at the start, if they are not always the same).

To read it again, just mmap your file and create an Eigen::Map of your sparse matrix type (e.g. Map<SparseMatrix<double>>) from the correct start addresses.
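Here is a rough sketch of one possible file layout and how to recover the three start addresses from the mapping. It deliberately uses plain arrays instead of Eigen so that it stands alone. Header, SparseView, dump_sparse and map_sparse are names invented for this example, and it assumes a compressed, column-major matrix with int indices and double values (Eigen's defaults after makeCompressed()), on a POSIX system.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstddef>
#include <cstdint>
#include <cstdio>

// Assumed layout: Header, outer[cols+1], inner[nnz], padding, values[nnz].
struct Header { std::int64_t rows, cols, nnz; };

std::size_t index_end(const Header& h) {
    return sizeof(Header) + sizeof(int) * static_cast<std::size_t>(h.cols + 1 + h.nnz);
}
// Values start at the next multiple of 8 so the doubles are aligned.
std::size_t values_offset(const Header& h) { return (index_end(h) + 7) / 8 * 8; }

bool dump_sparse(const char* path, const Header& h, const int* outer,
                 const int* inner, const double* values) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    const std::size_t n_outer = static_cast<std::size_t>(h.cols + 1);
    const std::size_t nnz = static_cast<std::size_t>(h.nnz);
    const std::size_t n_pad = values_offset(h) - index_end(h);
    const char pad[8] = {};
    const bool ok = std::fwrite(&h, sizeof h, 1, f) == 1 &&
                    std::fwrite(outer, sizeof(int), n_outer, f) == n_outer &&
                    std::fwrite(inner, sizeof(int), nnz, f) == nnz &&
                    std::fwrite(pad, 1, n_pad, f) == n_pad &&
                    std::fwrite(values, sizeof(double), nnz, f) == nnz;
    return std::fclose(f) == 0 && ok;
}

struct SparseView {
    void* base = nullptr;   // start of the mapping, needed for munmap
    std::size_t bytes = 0;  // length of the mapping
    Header h{};
    const int* outer = nullptr;
    const int* inner = nullptr;
    const double* values = nullptr;
};

bool map_sparse(const char* path, SparseView& v) {
    const int fd = open(path, O_RDONLY);
    if (fd < 0) return false;
    struct stat st;
    if (fstat(fd, &st) != 0 || static_cast<std::size_t>(st.st_size) < sizeof(Header)) {
        close(fd);
        return false;
    }
    v.bytes = static_cast<std::size_t>(st.st_size);
    v.base = mmap(nullptr, v.bytes, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (v.base == MAP_FAILED) return false;

    const char* p = static_cast<const char*>(v.base);
    v.h = *reinterpret_cast<const Header*>(p);
    // Reject files whose size does not match what the header promises.
    if (v.h.cols < 0 || v.h.nnz < 0 ||
        v.bytes < values_offset(v.h) + sizeof(double) * static_cast<std::size_t>(v.h.nnz)) {
        munmap(v.base, v.bytes);
        return false;
    }
    v.outer = reinterpret_cast<const int*>(p + sizeof(Header));
    v.inner = v.outer + v.h.cols + 1;
    v.values = reinterpret_cast<const double*>(p + values_offset(v.h));
    // With Eigen 3.3 or later these should be usable as, to my knowledge:
    //   Eigen::Map<const Eigen::SparseMatrix<double>> S(
    //       v.h.rows, v.h.cols, v.h.nnz, v.outer, v.inner, v.values);
    return true;
}
```

When dumping from Eigen, the three pointers would come from S.outerIndexPtr(), S.innerIndexPtr() and S.valuePtr(), with nnz taken from S.nonZeros(). The mapping has to stay alive for as long as the Map is used, so call munmap(v.base, v.bytes) only at the end.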

Upvotes: 1
