gioaudino

Reputation: 575

C++ creating huge vector

For a process I'm trying to run I need a std::vector of std::tuple<long unsigned int, long unsigned int>. The test I'm doing right now should create a vector of 47,614,527,250 (around 47 billion) tuples, but it crashes right at creation with the error: terminate called after throwing an instance of 'std::bad_alloc'. My goal is to use this program with a vector roughly twice that size. The code is this:

arc_vector = std::vector<std::tuple<long unsigned int, long unsigned int>>(arcs);

where arcs is a long unsigned int with the cited value.

Can I increase the available memory size, and if so, how? This program is running on a 40-core machine with something like 200GB of memory, so I assumed memory itself was not an issue.

Upvotes: 4

Views: 3290

Answers (2)

pqnet

Reputation: 6588

47 billion tuples times 16 bytes per tuple is about 762 billion bytes, which is roughly 762 GB (about 710 GiB). Your machine has less than 1/3 of the memory required for that, so you really need another approach, regardless of the reason your program crashes.
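As a quick check of that estimate, here is a minimal sketch that computes the byte count at compile time. The `Arc` struct and the `bytes_needed` helper are names made up for this illustration; `Arc` stands in for the tuple and assumes two 64-bit fields, which matches `long unsigned int` on typical 64-bit Linux but not on every platform.

```cpp
#include <cstdint>

// Hypothetical stand-in for std::tuple<long unsigned int, long unsigned int>,
// assuming both fields are 64 bits wide.
struct Arc {
    std::uint64_t from;
    std::uint64_t to;
};
static_assert(sizeof(Arc) == 16, "expected two packed 64-bit fields");

// Element count taken from the question.
constexpr std::uint64_t kArcs = 47614527250ULL;

// Bytes needed to store n elements of type T contiguously.
template <typename T>
constexpr std::uint64_t bytes_needed(std::uint64_t n) {
    return n * sizeof(T);
}

// 47,614,527,250 * 16 = 761,832,436,000 bytes, checked by the compiler.
static_assert(bytes_needed<Arc>(kArcs) == 761832436000ULL,
              "size estimate does not match");
```

Doubling the element count, as the question plans to, doubles this figure.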

One proposal is to use a memory-mapped file of 1 TB to store that array. If you really need a vector as the interface, you might write a custom allocator for it that uses the mapped memory. That should sort out your lack of main memory in a quasi-transparent way. If your interface requires a standard vector with the standard allocator, you are better off redesigning it.
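A rough sketch of the memory-mapped idea, assuming a POSIX system (`open`, `ftruncate`, `mmap`, `munmap`). The `MappedArcs` class, the `Arc` struct and the file path are hypothetical names for this illustration, and this is a plain array wrapper rather than the custom allocator mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

#include <fcntl.h>     // open
#include <sys/mman.h>  // mmap, munmap
#include <unistd.h>    // ftruncate, close

// Hypothetical element type: two 64-bit values, like the tuple in the question.
struct Arc {
    std::uint64_t from;
    std::uint64_t to;
};

// Fixed-size array of Arc backed by a file instead of anonymous memory.
class MappedArcs {
public:
    MappedArcs(const std::string& path, std::size_t count)
        : count_(count), bytes_(count * sizeof(Arc)) {
        if (count == 0) throw std::invalid_argument("count must be > 0");
        int fd = ::open(path.c_str(), O_RDWR | O_CREAT, 0600);
        if (fd < 0) throw std::runtime_error("open failed");
        // Grow (or shrink) the file to the requested size.
        if (::ftruncate(fd, static_cast<off_t>(bytes_)) != 0) {
            ::close(fd);
            throw std::runtime_error("ftruncate failed");
        }
        void* p = ::mmap(nullptr, bytes_, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);
        ::close(fd);  // the mapping remains valid after the descriptor is closed
        if (p == MAP_FAILED) throw std::runtime_error("mmap failed");
        data_ = static_cast<Arc*>(p);
    }
    ~MappedArcs() { ::munmap(data_, bytes_); }

    MappedArcs(const MappedArcs&) = delete;
    MappedArcs& operator=(const MappedArcs&) = delete;

    Arc& operator[](std::size_t i) {
        assert(i < count_);
        return data_[i];
    }
    std::size_t size() const { return count_; }

private:
    std::size_t count_;
    std::size_t bytes_;
    Arc* data_ = nullptr;
};
```

With something like `MappedArcs arcs("/some/big/disk/arcs.bin", n);` the kernel pages data in and out on demand, so access patterns with poor locality will be limited by disk speed.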

One more point: check the ulimit values for the user running the process, because the virtual memory limit might be stricter than 760 GB.
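The same limit can also be inspected from inside the program. A minimal sketch, assuming Linux or another POSIX system where `RLIMIT_AS` is the address-space limit that `ulimit -v` reports; the helper name `address_space_allows` is made up for this example.

```cpp
#include <cstdint>

#include <sys/resource.h>  // getrlimit, RLIMIT_AS, RLIM_INFINITY

// Returns true if the soft address-space limit is unlimited or at least
// `bytes`. Returns false if the limit is lower or cannot be queried.
bool address_space_allows(std::uint64_t bytes) {
    struct rlimit lim{};
    if (getrlimit(RLIMIT_AS, &lim) != 0) {
        return false;
    }
    return lim.rlim_cur == RLIM_INFINITY || lim.rlim_cur >= bytes;
}
```

Calling this with the intended allocation size before building the container gives a clearer failure message than an uncaught std::bad_alloc.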

Upvotes: 11

Bathsheba

Reputation: 234715

You may well have a machine with a lot of memory, but the problem is that std::vector requires that memory to be contiguous.

Even with memory virtualisation, that's unlikely.

For that amount of data, you'll need to use a different storage container. You could roll your own based on a linked list of vectors that subdivide the data, a vector of pointers to subdivided vectors of your tuples, or find a library that has such a construction already built.
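A minimal sketch of the "vector of subdivided vectors" variant, so that no single allocation has to cover the whole data set. `ChunkedVector` is a hypothetical name, the container is fixed-size for brevity, and it does not reduce the total memory needed; it only removes the contiguity requirement.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <tuple>
#include <vector>

// Fixed-size container that stores its elements in separately allocated chunks.
template <typename T>
class ChunkedVector {
public:
    ChunkedVector(std::size_t count, std::size_t chunk_size)
        : size_(count), chunk_size_(chunk_size) {
        if (chunk_size == 0) throw std::invalid_argument("chunk_size must be > 0");
        const std::size_t full = count / chunk_size;
        const std::size_t rest = count % chunk_size;
        chunks_.reserve(full + (rest ? 1 : 0));
        for (std::size_t i = 0; i < full; ++i) {
            chunks_.emplace_back(chunk_size);  // value-initialized elements
        }
        if (rest) {
            chunks_.emplace_back(rest);        // last, partially filled chunk
        }
    }

    T& operator[](std::size_t i) {
        assert(i < size_);
        return chunks_[i / chunk_size_][i % chunk_size_];
    }
    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return chunks_[i / chunk_size_][i % chunk_size_];
    }

    std::size_t size() const { return size_; }
    std::size_t chunk_count() const { return chunks_.size(); }

private:
    std::size_t size_;
    std::size_t chunk_size_;
    std::vector<std::vector<T>> chunks_;
};

// Element type from the question.
using ArcTuple = std::tuple<long unsigned int, long unsigned int>;
```

For example, `ChunkedVector<ArcTuple> arcs(n, 1u << 20);` would allocate chunks of about a million tuples each. Indexing costs one extra division and one extra indirection per access.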

Upvotes: 4
