Memory usage in loading a 226MB text file

Question

I have to read a 226 MB text file formatted like this:

The first number is an index, the second a value. I want to load this file into an array of short (8349328 positions), so I wrote this code:

    Short[] docsofword = new Short[8349328];

    br2 = new BufferedReader(new FileReader("TermOccurrenceinCollection.txt"));
    ss = br2.readLine();
    while(ss!=null)
    {
        docsofword[Integer.valueOf(ss.split("\\s+")[0])] = Short.valueOf(ss.split("\\s+")[1]);  //[indexTerm] - numOccInCollection
        ss = br2.readLine();
    }
    br2.close();

It turns out that the whole load uses an incredible 4.2 GB of memory. I really don't understand why; I expected a vector of about 15 MB. Thanks for any answer.

Jitendra Kumar. Balla · Accepted Answer

If you generate the file yourself, write it with ObjectOutputStream instead; that makes the file very easy to read back.
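As a rough illustration of the ObjectOutputStream idea, here is a minimal sketch that writes a primitive short[] as one serialized object and reads it back with ObjectInputStream. The class name ShortArrayStore, the methods save and load, and the temporary file are hypothetical and exist only for this example; it assumes you control both the writer and the reader of the file.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper class; the names are illustrative, not from the original post.
final class ShortArrayStore {

    // Serializes the whole primitive array as a single object.
    static void save(short[] values, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            out.writeObject(values);
        }
    }

    // Reads the array back; the cast assumes the file was written by save().
    static short[] load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            return (short[]) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        short[] counts = {3, 0, 17, 42};
        Path file = Files.createTempFile("docsofword", ".bin");
        try {
            save(counts, file);
            short[] restored = load(file);
            System.out.println(Arrays.toString(restored)); // prints [3, 0, 17, 42]
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

With this approach the text file would only need to be parsed once; later runs could call load() directly. Whether that is worthwhile depends on how often the data changes.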

As @Durandal suggested, change the code accordingly. Here is some sample code:

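    // Primitive array: 2 bytes per element, no boxed Short objects.
    short[] docsofword = new short[8349328];

    br2 = new BufferedReader(new FileReader("TermOccurrenceinCollection.txt"));
    ss = br2.readLine();
    int strIndex, index;
    while(ss!=null)
    {
        // Find the separator once instead of calling split() twice per line.
        // This assumes the two numbers are separated by exactly one space.
        strIndex = ss.indexOf(' ');
        index = Integer.parseInt(ss.substring(0, strIndex));
        docsofword[index] = Short.parseShort(ss.substring(strIndex + 1));
        ss = br2.readLine();
    }
    br2.close();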

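You can optimise even further. Instead of calling indexOf(), write your own method that walks through the characters of the line: when it reaches the space, parse everything before it as the integer index. At that point you already know the position of the space, so you can parse the rest of the string as the value.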
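The hand-written scan described above might look roughly like the sketch below. The class LineParser and the method parseInto are hypothetical names made up for this example. It assumes every line holds two non-negative decimal numbers separated by spaces or tabs, that the value fits in a short, and that the index is within the array bounds; it does no validation.

```java
// Hypothetical helper; the name and signature are illustrative only.
final class LineParser {

    // Parses a line of the form "index value" and stores the value at
    // docsofword[index]. Digits are accumulated directly from the characters,
    // so no intermediate substrings are created.
    static void parseInto(String line, short[] docsofword) {
        int i = 0;
        int n = line.length();

        // Accumulate the index until the first non-digit (the separator).
        int index = 0;
        while (i < n && line.charAt(i) >= '0' && line.charAt(i) <= '9') {
            index = index * 10 + (line.charAt(i) - '0');
            i++;
        }

        // Skip the separator (assumed to be spaces or tabs).
        while (i < n && (line.charAt(i) == ' ' || line.charAt(i) == '\t')) {
            i++;
        }

        // Accumulate the value from the remaining digits.
        int value = 0;
        while (i < n && line.charAt(i) >= '0' && line.charAt(i) <= '9') {
            value = value * 10 + (line.charAt(i) - '0');
            i++;
        }

        docsofword[index] = (short) value; // assumes the value fits in a short
    }

    public static void main(String[] args) {
        short[] docsofword = new short[10];
        parseInto("7 123", docsofword);
        parseInto("2\t45", docsofword);
        System.out.println(docsofword[7] + " " + docsofword[2]); // prints "123 45"
    }
}
```

In the reading loop, the body would then shrink to a single call such as parseInto(ss, docsofword). Any gain over indexOf() plus substring() is likely to be small next to the switch from Short[] to short[], so it is worth measuring before adopting it.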
