user1061293
user1061293

Reputation: 167

What is the maximum size of the map object in c++ and java?

What is the maximum size of hashmap/map object in c++ and java? I want to use hashmap, but i am working on huge data. I am worrying if i use this on large data, it may crash because of its capacity limit. Is that so? If so, what can be the alternative way?

Upvotes: 6

Views: 19457

Answers (9)

Adrian
Adrian

Reputation: 5681

Some information to keep in mind (the big picture):

If your data is huge you can't hold it in memory. You have to go to secondary storage: HDD. When you go to HDD you lose the speed optimizations of a hashmap. Every time you go to the HDD you incur a delay (seek time and such). Searching a hashmap stored on disk becomes linear time.

What I'm trying to say is that a map is useless if your data can't fit in memory.

A better solution is to index your data. Store the indices in memory, and have a pointer to where on disk that data you're looking for is. Retrieve the data from disk.

Improve this model further by using RAID for storage. Also going to DB results in the same delay as going to HDD.

I suggest you store all the values in a DB, and keep an in-memory dictionary with hashes as keys.

Upvotes: 2

Peter Lawrey
Peter Lawrey

Reputation: 533492

For Java:

HashMap has an underlying store is an array which is always a power of 2 in size. The largest it can be is 2^30. With a default load factor of 0.75 it will try to grow and fail at around 750 million entries.

TreeMap is not limited and can have more than 2^31 entries (however the size() will return MAX_VALUE) Similarly for ConcurrentSkipList and ConcurrentHashMap.

Upvotes: 2

Emilio Garavaglia
Emilio Garavaglia

Reputation: 20730

std::map and hashmap are dynamic structures. They grow as elements are added, until the system is able to provide memory for them.

The max_size() member function gives the upper limit the class implementation (in code) is able to sustain, but that limit is normally wider than the system capacity the code itself run onto.

The system available memory depends also on what else the system is doing other than running your application.

You can empirically come to a reasonable number by querying the OS about the amount of free memory it can give to your process and divide it for the size of an element as "key plus value plus some overhead (usually 20 / 24 bytes)".

Upvotes: 2

andrey
andrey

Reputation: 842

Java or C++ itself is not a limit. In practice you are limited only by resources.

Depending from your requirements approaches could be:

  • more compact structures like Patricia trie
  • database solutions or file based Maps
  • distributed DHT based solutions

Try to look here for some tips.

Upvotes: 0

Drew Dormann
Drew Dormann

Reputation: 63745

You are effectively going to be limited by the memory capacity of your system.

If you are working with huge data, consider where this huge data is coming from. And design your map in a way that leaves the huge data where it already is.

Upvotes: 0

Fred Foo
Fred Foo

Reputation: 363537

In Java, the size() of a HashMap is of type int, so there's an upper bound of 2^31-1 elements in the map.

In C++, map::max_size returns the max. number of elements. In a vanilla map, there's an upper bound of at most SIZE_T_MAX elements, which is 2^64-1 on modern hardware.

Upvotes: 6

John Humphreys
John Humphreys

Reputation: 39254

There isn't a maximum size explicitly - it depends on your platform and the implementation of your STL. For example, if you have highly fragmented memory and the implementation uses a contiguous buffer (which I doubt since usually only vectors do that) then you may run out of space long before your computer's memory is exhausted.

Alternatively, if small blocks are allocated as the container expands in the implementation, your memory limit is a combination of the memory your computer has, and the limits you've set up within your OS (if ulimit happens to be set in Linux or whatever the Windows variant of that is).

The class does have a max_size() member function, but if you haven't set that it shouldn't affect you. So, simple answer - there isn't a limit except those that are dependent on your own computer and OS.

Upvotes: 0

Sachin
Sachin

Reputation: 18747

In Java, the size of Hashmap is bounded by the JVM memory. It can grow in size. There is no hard limit, as far as I know.

Don't know about C++.

Upvotes: 0

Pubby
Pubby

Reputation: 53037

In C++, std::map has a max_size() member function (corresponding to the amount of data it can hold).

sizeof(std::map<...>) will give you the size of the actual object (corresponding to the size of the actual object, not the data it holds).

Upvotes: 2

Related Questions