Reputation: 167
What is the maximum size of a hashmap/map object in C++ and Java? I want to use a hashmap, but I am working with huge data. I am worried that if I use it on large data, it may crash because of a capacity limit. Is that so? If so, what would be an alternative?
Upvotes: 6
Views: 19457
Reputation: 5681
Some information to keep in mind (the big picture):
If your data is huge, you can't hold it all in memory; you have to go to secondary storage: HDD. Once you go to the HDD you lose the speed advantages of a hashmap, because every access incurs a delay (seek time and such), so searching a hashmap stored on disk is dominated by those delays rather than by the constant-time lookup you expect.
What I'm trying to say is that a map is useless if your data can't fit in memory.
A better solution is to index your data. Store the index in memory, with each entry pointing to where on disk the data you're looking for lives, and retrieve the data from disk as needed.
Improve this model further by using RAID for storage. Note also that going to a DB incurs the same kind of delay as going to the HDD.
I suggest you store all the values in a DB, and keep an in-memory dictionary with hashes as keys.
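A rough sketch of that idea in C++: only the index lives in memory, and each lookup costs one disk seek. The file name records.dat, the offsets, and the value encoding here are all hypothetical; a real index would be built while writing the data file.

    #include <fstream>
    #include <iostream>
    #include <string>
    #include <unordered_map>

    int main() {
        // In-memory index: key -> byte offset of its value in the data file.
        // In a real system this index is built while the file is written.
        std::unordered_map<std::string, std::streamoff> index = {
            {"alice", 0}, {"bob", 128}, {"carol", 256}  // hypothetical offsets
        };

        std::ifstream data("records.dat", std::ios::binary);  // hypothetical file
        if (!data) return 1;

        // Look up the key in memory, then fetch the value from disk.
        auto it = index.find("bob");
        if (it != index.end()) {
            data.seekg(it->second);           // one disk seek per lookup
            std::string value;
            std::getline(data, value, '\0');  // assumes NUL-terminated values
            std::cout << value << '\n';
        }
    }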
Upvotes: 2
Reputation: 533492
For Java:
HashMap is backed by an array whose length is always a power of 2, and the largest that array can be is 2^30. With the default load factor of 0.75 it will try to grow, and fail, at around 0.75 × 2^30 ≈ 800 million entries.
TreeMap is not limited in this way and can hold more than 2^31 entries (however, size() will then return Integer.MAX_VALUE). The same applies to ConcurrentSkipListMap and ConcurrentHashMap.
Upvotes: 2
Reputation: 20730
std::map and hash maps are dynamic structures: they grow as elements are added, for as long as the system can provide memory for them.
The max_size() member function gives the upper limit that the class implementation (in code) can sustain, but that limit is normally far larger than what the system the code actually runs on can provide.
The memory available from the system also depends on what else the system is doing besides running your application.
You can arrive at a reasonable figure empirically by querying the OS for the amount of free memory it can give to your process and dividing it by the size of an element: key plus value plus some per-node overhead (usually 20-24 bytes).
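A minimal sketch of that estimate in C++. The free-memory figure, the key/value types, and the per-node overhead are assumptions; substitute values queried from your own platform (e.g. /proc/meminfo on Linux or GlobalMemoryStatusEx on Windows).

    #include <cstddef>
    #include <iostream>

    int main() {
        // Assumed free memory -- replace with a figure queried from the OS.
        unsigned long long free_bytes = 4ull * 1024 * 1024 * 1024;  // ~4 GiB

        // Per-element cost: key + value + per-node bookkeeping overhead
        // (tree/bucket pointers, allocator padding -- roughly 20-24 bytes).
        std::size_t key_size   = sizeof(long);    // example key type
        std::size_t value_size = sizeof(double);  // example value type
        std::size_t overhead   = 24;              // assumed node overhead

        unsigned long long per_element = key_size + value_size + overhead;
        std::cout << "Rough capacity estimate: "
                  << free_bytes / per_element << " elements\n";
    }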
Upvotes: 2
Reputation: 842
Neither Java nor C++ itself is the limit; in practice you are limited only by resources.
Depending on your requirements, several approaches are possible.
Try to look here for some tips.
Upvotes: 0
Reputation: 63745
You are effectively going to be limited by the memory capacity of your system.
If you are working with huge data, consider where that data is coming from, and design your map in a way that leaves the huge data where it already is.
Upvotes: 0
Reputation: 363537
In Java, the size() of a HashMap is of type int, so there's an upper bound of 2^31 - 1 elements in the map.
In C++, map::max_size returns the maximum number of elements. In a vanilla map, there's an upper bound of at most SIZE_T_MAX (the largest value of size_t) elements, which is 2^64 - 1 on modern hardware.
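On the C++ side you can simply ask the container. A small sketch that prints these theoretical limits (the exact numbers depend on your implementation and the element type):

    #include <iostream>
    #include <map>
    #include <unordered_map>

    int main() {
        std::map<int, int> m;
        std::unordered_map<int, int> um;

        // Theoretical upper bounds reported by the implementation;
        // actual capacity is limited by available memory long before this.
        std::cout << "std::map<int,int>::max_size()           = " << m.max_size()  << '\n';
        std::cout << "std::unordered_map<int,int>::max_size() = " << um.max_size() << '\n';
    }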
Upvotes: 6
Reputation: 39254
There isn't an explicit maximum size - it depends on your platform and the implementation of your STL. For example, if you have highly fragmented memory and the implementation uses a contiguous buffer (which I doubt, since usually only vectors do that), then you may run out of space long before your computer's memory is exhausted.
Alternatively, if the implementation allocates small blocks as the container expands, your limit is a combination of the memory your machine has and the limits configured in your OS (ulimit on Linux, if it happens to be set, or whatever the Windows equivalent is).
The class does have a max_size() member function, but that reports a theoretical implementation limit rather than something you configure, so in practice it shouldn't affect you. So, simple answer: there isn't a limit except those imposed by your own machine and OS.
Upvotes: 0
Reputation: 18747
In Java, the size of a HashMap is bounded by the JVM heap memory. It can keep growing; there is no hard limit, as far as I know.
Don't know about C++.
Upvotes: 0
Reputation: 53037
In C++, std::map has a max_size() member function, which reports the maximum number of elements it can hold. sizeof(std::map<...>) gives you the size of the map object itself, not of the data it holds.
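A short sketch contrasting the two (the printed values are implementation-dependent):

    #include <iostream>
    #include <map>
    #include <string>

    int main() {
        std::map<int, std::string> m{{1, "one"}, {2, "two"}};

        // max_size(): theoretical maximum number of elements the map could hold.
        std::cout << "max_size(): " << m.max_size() << '\n';

        // sizeof: size of the map object itself (a handful of pointers and
        // bookkeeping), independent of how many elements it currently holds.
        std::cout << "sizeof(map object): " << sizeof(m) << " bytes\n";
        std::cout << "current element count: " << m.size() << '\n';
    }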
Upvotes: 2