Rama Sharma
Rama Sharma

Reputation: 116

most efficient way to read huge file

I am not familiar with JAVA NIO APIs. I need help to get the answer of commonly asked interview questions. If there is file which contains 50 gb data, what is most efficient way that we can read data from file and find most frequent word.

BufferedReader.readLine() is better API than scanner . do we have any other way also apart from creating multiple threads to read this file in batches using BufferedReader.readLine() API ?

Upvotes: 1

Views: 710

Answers (2)

Abhinav
Abhinav

Reputation: 540

Perhaps, using the below class, you may achieve fastest way of taking/reading input:

 import java.io.DataInputStream; 
 import java.io.FileInputStream; 
 import java.io.IOException; 
 import java.io.InputStreamReader; 
 import java.util.Scanner; 
 import java.util.StringTokenizer; 

public class Main 
{ 
static class Reader 
{ 
    final private int BUFFER_SIZE = 1 << 16; 
    private DataInputStream din; 
    private byte[] buffer; 
    private int bufferPointer, bytesRead; 

    public Reader() 
    { 
        din = new DataInputStream(System.in); 
        buffer = new byte[BUFFER_SIZE]; 
        bufferPointer = bytesRead = 0; 
    } 

    public Reader(String file_name) throws IOException 
    { 
        din = new DataInputStream(new FileInputStream(file_name)); 
        buffer = new byte[BUFFER_SIZE]; 
        bufferPointer = bytesRead = 0; 
    } 

    public String readLine() throws IOException 
    { 
        byte[] buf = new byte[64]; // line length 
        int cnt = 0, c; 
        while ((c = read()) != -1) 
        { 
            if (c == '\n') 
                break; 
            buf[cnt++] = (byte) c; 
        } 
        return new String(buf, 0, cnt); 
    } 

    public int nextInt() throws IOException 
    { 
        int ret = 0; 
        byte c = read(); 
        while (c <= ' ') 
            c = read(); 
        boolean neg = (c == '-'); 
        if (neg) 
            c = read(); 
        do
        { 
            ret = ret * 10 + c - '0'; 
        } while ((c = read()) >= '0' && c <= '9'); 

        if (neg) 
            return -ret; 
        return ret; 
    } 

    public long nextLong() throws IOException 
    { 
        long ret = 0; 
        byte c = read(); 
        while (c <= ' ') 
            c = read(); 
        boolean neg = (c == '-'); 
        if (neg) 
            c = read(); 
        do { 
            ret = ret * 10 + c - '0'; 
        } 
        while ((c = read()) >= '0' && c <= '9'); 
        if (neg) 
            return -ret; 
        return ret; 
    } 

    public double nextDouble() throws IOException 
    { 
        double ret = 0, div = 1; 
        byte c = read(); 
        while (c <= ' ') 
            c = read(); 
        boolean neg = (c == '-'); 
        if (neg) 
            c = read(); 

        do { 
            ret = ret * 10 + c - '0'; 
        } 
        while ((c = read()) >= '0' && c <= '9'); 

        if (c == '.') 
        { 
            while ((c = read()) >= '0' && c <= '9') 
            { 
                ret += (c - '0') / (div *= 10); 
            } 
        } 

        if (neg) 
            return -ret; 
        return ret; 
    } 

    private void fillBuffer() throws IOException 
    { 
        bytesRead = din.read(buffer, bufferPointer = 0, BUFFER_SIZE); 
        if (bytesRead == -1) 
            buffer[0] = -1; 
    } 

    private byte read() throws IOException 
    { 
        if (bufferPointer == bytesRead) 
            fillBuffer(); 
        return buffer[bufferPointer++]; 
    } 

    public void close() throws IOException 
    { 
        if (din == null) 
            return; 
        din.close(); 
    } 
} 

public static void main(String[] args) throws IOException 
{ 
    Reader s=new Reader(); 
    int n = s.nextInt(); 
    int k = s.nextInt(); 
    int count=0; 
    while (n-- > 0) 
    { 
        int x = s.nextInt(); 
        if (x%k == 0) 
        count++; 
    } 
    System.out.println(count); 
} 
} 

Upvotes: 0

Evgeniy Dorofeev
Evgeniy Dorofeev

Reputation: 135992

See java.nio.channels.FileChannel javadocs:

A region of a file may be mapped directly into memory; for large files this is often much more efficient than invoking the usual read or write methods.

Upvotes: 1

Related Questions