Reputation: 1042
I have a file, and from it I am populating a HashMap<String, ArrayList<Objects>>. The map will always have exactly 25 keys, but the list for each key will be huge, say a million records.
What I do now is retrieve the list of records for each key and process the lists in parallel using threads. This worked well until I got a larger file, and now I am hitting "java.lang.OutOfMemoryError: Java heap space".
What is the best alternative to populating the HashMap with lists of objects? My idea is to find the 25 offsets in the file and, instead of putting the lines I read into the ArrayList, store the offsets and give each thread an iterator that runs from its start offset to its end offset. I have not tried this yet. Before I do, I would like to know whether there are better ways to reduce memory usage.
Upvotes: 0
Views: 1613
Reputation: 1058
Ditto what ares said: we need more information. What do you plan on doing with the map? Is it an operation that requires the whole file to be loaded into memory, or can it be done in parts?
Also, have you considered splitting the file into parts once its size surpasses a threshold?
See Pshemo's answer here: How to break a file into pieces using Java?
Also, if you want to process in parallel, you could build a map that covers only part of the file, process that map in parallel, and store the results in a queue of some sort, provided the queue holds only a subset of the data you are processing (to avoid OutOfMemoryError).
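One way to keep only a subset of the data in memory is a bounded queue between the reader and the worker threads. The following is a minimal sketch, not a definitive implementation: the class name BoundedPipeline, the POISON end marker, and the counting "work" are all hypothetical placeholders, and the input is passed as an Iterable<String> so that it could be backed by a file reader.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class BoundedPipeline {
    // Hypothetical end-of-input marker; compared by identity, never by content.
    private static final String POISON = new String("<end-of-input>");

    // Feeds lines to worker threads through a queue that never holds more
    // than `capacity` lines, and returns how many lines were processed.
    static long process(Iterable<String> lines, int workers, int capacity)
            throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        AtomicLong processed = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    // Each worker stops when it takes one end marker.
                    for (String line = queue.take(); line != POISON; line = queue.take()) {
                        processed.incrementAndGet(); // placeholder for real per-record work
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        for (String line : lines) {
            queue.put(line); // blocks while the queue is full, so the reader cannot run ahead
        }
        for (int i = 0; i < workers; i++) {
            queue.put(POISON); // one end marker per worker
        }

        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        return processed.get();
    }
}
```

Because put blocks when the queue is full, at most `capacity` unprocessed lines (plus the ones the workers are holding) are in memory at any time, regardless of the file size.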
Upvotes: 0
Reputation: 4413
"I will populate the HashMap<String, ArrayList<Objects>>"
After populating the HashMap, what do you need to do with it? I believe that just populating the map is not your actual task. Whatever the scenario, you don't need to read the whole file into memory.
Increasing the heap size may not be a good solution as someday you may get a file even bigger than your heap size.
Read the file in chunks using a BufferedReader or BufferedInputStream, depending on your needs, and do your task as you read. Both APIs hold only part of the file in memory at a time.
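As a rough illustration of doing the work while reading, here is a minimal sketch that keeps only a running count per key instead of the records themselves. The line format (key, then a comma, then the record) and the names StreamingCount and countByKey are assumptions made for the example; the question does not say what the file looks like or what the real task is.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

class StreamingCount {
    // Reads the file one line at a time and keeps only a running count per key,
    // so memory use grows with the number of keys, not with the file size.
    static Map<String, Long> countByKey(Path file) throws IOException {
        Map<String, Long> counts = new HashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int comma = line.indexOf(',');
                if (comma < 0) {
                    continue; // assumed format "key,record"; skip lines without a key
                }
                String key = line.substring(0, comma);
                counts.merge(key, 1L, Long::sum); // stand-in for the real per-record task
            }
        }
        return counts;
    }
}
```

Each line becomes eligible for garbage collection as soon as the loop moves on, so the heap holds the 25-entry map and the reader's buffer rather than millions of records.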
"...I read from file into the arrayList, put the offset of the file and give each thread an iterator to iterate from its start offset to end offset. I still have to try this thought."
Using multiple threads will not prevent java.lang.OutOfMemoryError, because all the threads run in the same JVM. Furthermore, whether you read the file into one list or several, all the data from the file ends up in the same heap.
If you mention what you actually want to do with the data from the file, this answer can be more specific.
Upvotes: 1