Somnath Musib

Reputation: 3714

Memory Issue: Storing high volume data in Map

I have the below scenario:

  1. I receive a huge list of messages from an external system (each message contains an ID and a payload).
  2. I filter those messages by ID, collect the payloads for each ID in a list, and store each ID with its list in a map.
  3. Later, I retrieve the list of payloads for each ID from the map and submit the entire list to an executor service for further processing.
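
The steps above can be sketched roughly as follows. This is only a minimal illustration of the scenario, not the actual code: the `Message` class, the method names, the pool size, and the placeholder "processing" (counting payloads) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class GroupAndSubmit {
    // Hypothetical message type: an ID plus a payload.
    static final class Message {
        final String id;
        final String payload;

        Message(String id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    // Step 2: collect the payloads per ID. The whole map stays on the heap.
    static Map<String, List<String>> groupById(List<Message> messages) {
        Map<String, List<String>> byId = new HashMap<>();
        for (Message m : messages) {
            byId.computeIfAbsent(m.id, k -> new ArrayList<>()).add(m.payload);
        }
        return byId;
    }

    // Step 3: submit each ID's payload list as one task and
    // return the total number of payloads processed.
    static int processAll(Map<String, List<String>> byId) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<String> payloads : byId.values()) {
                // Placeholder "processing": just count the payloads.
                Callable<Integer> task = payloads::size;
                results.add(executor.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Processing failed", e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Message> messages = new ArrayList<>();
        messages.add(new Message("a", "p1"));
        messages.add(new Message("b", "p2"));
        messages.add(new Message("a", "p3"));
        System.out.println(processAll(groupById(messages)));
    }
}
```

The memory concern is visible in `groupById`: every payload stays reachable from the map until its list has been handed off and processed.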

I do not like this approach because at run time the map holds all of the data (point 2), so I might run into memory problems. Is there a good alternative to this approach?

EDIT

I am using Java. I receive the messages from an external system (I have no idea how many messages could arrive) and process them by ID. After processing, the results are stored in the database. The problem arises while I am loading the messages into the Map by ID: I have to group the messages by ID before sending them for processing, so I have to keep the entire Map in memory for a certain period of time.

Thanks in advance.

Upvotes: 0

Views: 332

Answers (1)

Fermin Silva

Reputation: 3377

I remember using MapDB for this myself. Basically, it gives you a Map interface that is backed by off-heap memory (think memory-mapped files on Linux).

You can find an example here: https://github.com/jankotek/mapdb/blob/master/src/test/java/examples/CacheOffHeap.java

I'll copy the relevant parts here for easier reference:

        final double cacheSizeInGB = 1.0;

        // Create cache backed by off-heap store
        // In this case the store will use direct ByteBuffers allocated outside the Java heap.
        HTreeMap cache = DBMaker
                .memoryDirectDB()
                .transactionDisable()
                .make()
                .hashMapCreate("test")
                .expireStoreSize(cacheSizeInGB) //TODO not sure this actually works
                .make();

        //generates random key and values
        Random r = new Random();
        //used to print store statistics
        Store store = Store.forEngine(cache.getEngine());


        // insert some stuff in cycle
        for(long counter=1; counter<1e8; counter++){
            long key = r.nextLong();
            byte[] value = new byte[1000];
            r.nextBytes(value);

            cache.put(key,value);

            if(counter%1e5==0){
                System.out.printf("Map size: %,d, counter %,d, store size: %,d, store free size: %,d\n",
                        cache.sizeLong(), counter, store.getCurrSize(),  store.getFreeSize());
            }

        }

        // and release memory. Only necessary with `DBMaker.memoryDirectDB()`
        cache.close();
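
For readers unfamiliar with the term, the "off-heap" idea can be illustrated with plain `java.nio`, independent of MapDB. The sketch below is not how MapDB is implemented; it only shows a payload being copied into a direct buffer, so that the bytes live outside the garbage-collected heap. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class OffHeapSketch {
    // Copies a payload into a direct (off-heap) buffer. Only the small
    // ByteBuffer handle object lives on the Java heap.
    static ByteBuffer storeOffHeap(String payload) {
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocateDirect(bytes.length);
        buf.put(bytes);
        buf.flip(); // switch the buffer from writing to reading
        return buf;
    }

    // Reads the payload back without disturbing the buffer's position.
    static String load(ByteBuffer buf) {
        byte[] bytes = new byte[buf.remaining()];
        buf.duplicate().get(bytes); // duplicate() has its own position
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer buf = storeOffHeap("some payload");
        System.out.println(buf.isDirect() + " " + load(buf));
    }
}
```

A library such as MapDB adds the parts that matter in practice on top of this idea: serialization, key lookup, and reclaiming space.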

Upvotes: 2
