user1633823

Reputation: 357

Regarding reading a file and optimizing the performance

I was doing some research on IO and I read the following article, which talks about buffering techniques. To minimize disk accesses and work done by the underlying operating system, buffering techniques read data from the disk in chunks into a temporary buffer, instead of going to the disk on every read operation.

Examples were given without and with buffering.

Without buffering:

try
{
  File f = new File("Test.txt");
  FileInputStream fis = new FileInputStream(f);
  int b;
  int ctr = 0;

  while((b = fis.read()) != -1)
  {
    if((char)b == '\t')
    {
      ctr++;
    }
  }
  fis.close(); // not the ideal way
} catch(Exception e)
{}

With buffering:

try
{
  File f = new File("Test.txt");
  FileInputStream fis = new FileInputStream(f);
  BufferedInputStream bs = new BufferedInputStream(fis);
  int b;
  int ctr = 0;
  while((b = bs.read()) != -1)
  {
    if((char)b == '\t')
    {
      ctr++;
    }
  }
  bs.close(); // not the ideal way
}
catch(Exception e){}

The conclusion was:

Test.txt was a 3.5 MB file.
Scenario 1 executed in 5200 to 5950 milliseconds over 10 test runs.
Scenario 2 executed in 40 to 62 milliseconds over 10 test runs.

Is there any other way to do this in Java that is better? Or any other method or technique that gives better performance? Please advise!

Upvotes: 0

Views: 852

Answers (3)

Peter Lawrey

Reputation: 533492

You can read blocks of data at a time which can still be faster than using a buffered input.

FileInputStream fis = new FileInputStream(new File("Test.txt"));
int len, ctr = 0;
byte[] bytes = new byte[8192];

while ((len = fis.read(bytes)) > 0)
    for (int i = 0; i < len; i++)
        if (bytes[i] == '\t')
            ctr++;
fis.close();

You can also try memory mapping.

FileChannel fc = new FileInputStream(new File("Test.txt")).getChannel();
ByteBuffer bb = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
int ctr = 0;
for (int i = 0; i < bb.limit(); i++)
    if (bb.get(i) == '\t')
        ctr++;
fc.close();

I would expect both of these options to be about twice as fast.

Upvotes: 0

Gray

Reputation: 116858

Is there any other way to do this in Java that is better? Or any other method / technique to give better performance?

In terms of IO performance, that is probably going to be the best you can do without a lot of extra code. You are most likely going to be IO-bound anyway.

while ((b = bs.read()) != -1)

Reading byte-by-byte like this is very inefficient. If you are reading a text file then you should use a BufferedReader instead, which decodes the bytes into String lines.

BufferedReader reader = new BufferedReader(new InputStreamReader(fis));
...
String line;
while ((line = reader.readLine()) != null) {
   ...
}
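Putting that together, here is a minimal, self-contained sketch of the BufferedReader approach counting tabs; it writes a small sample Test.txt first (the original assumed the file already existed), so the file name and contents here are just illustrative:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class TabCount {
    public static void main(String[] args) throws IOException {
        // Create a small sample file so the sketch is runnable on its own.
        try (FileWriter w = new FileWriter("Test.txt")) {
            w.write("a\tb\tc\nd\te\n");
        }

        int ctr = 0;
        try (BufferedReader reader = new BufferedReader(new FileReader("Test.txt"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // readLine() strips line terminators but keeps tabs.
                for (int i = 0; i < line.length(); i++) {
                    if (line.charAt(i) == '\t') {
                        ctr++;
                    }
                }
            }
        }
        System.out.println(ctr); // 3 tabs in the sample file
    }
}
```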

Also, with any IO, you should always do it in a try/finally block to make sure you close it:

FileInputStream fis = new FileInputStream(f);
BufferedReader reader = null;
try {
    reader = new BufferedReader(new InputStreamReader(fis));
    // once we wrap the fis in a reader, closing the reader also closes the fis
} finally {
    if (reader != null) {
       reader.close();
    } else if (fis != null) {
       fis.close();
    }
}
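On Java 7 and later, try-with-resources does this cleanup automatically: each resource declared in the try header is closed in reverse order, even when an exception is thrown. A minimal sketch of the same tab count (the sample-file setup is only there to make the snippet self-contained):

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;

public class TryWithResourcesCount {
    public static void main(String[] args) throws IOException {
        // Sample file so the sketch runs on its own.
        try (FileWriter w = new FileWriter("Test.txt")) {
            w.write("x\ty\tz\n");
        }

        int ctr = 0;
        // The reader (and the stream it wraps) is closed automatically,
        // even if an exception is thrown in the loop body.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream("Test.txt")))) {
            int b;
            while ((b = reader.read()) != -1) {
                if (b == '\t') {
                    ctr++;
                }
            }
        }
        System.out.println(ctr); // 2 tabs in the sample file
    }
}
```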

Upvotes: 1

jdevelop

Reputation: 12296

The problem with your code is that you're reading the file byte by byte (one byte per read call). Read it into an array chunk by chunk and the performance will be comparable to the buffered version.

You may want to try out NIO and memory-mapped files as well; see http://www.linuxtopia.org/online_books/programming_books/thinking_in_java/TIJ314_029.htm
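As a rough sketch of the chunk-by-chunk NIO approach mentioned above (the 8192-byte chunk size and the sample-file setup are illustrative choices, not from the linked article):

```java
import java.io.FileWriter;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class NioChunkCount {
    public static void main(String[] args) throws IOException {
        // Sample file so the sketch runs on its own.
        try (FileWriter w = new FileWriter("Test.txt")) {
            w.write("a\tb\nc\td\te\n");
        }

        int ctr = 0;
        ByteBuffer buf = ByteBuffer.allocate(8192); // chunk size is tunable
        try (FileChannel fc = FileChannel.open(Paths.get("Test.txt"),
                                               StandardOpenOption.READ)) {
            while (fc.read(buf) > 0) {
                buf.flip(); // switch the buffer from writing to reading
                while (buf.hasRemaining()) {
                    if (buf.get() == '\t') {
                        ctr++;
                    }
                }
                buf.clear(); // make the buffer writable again for the next chunk
            }
        }
        System.out.println(ctr); // 3 tabs in the sample file
    }
}
```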

Upvotes: 1
