MBraiN

Reputation: 55

Java - Fastest Way to Read Text Files Char by Char

I have nearly 500 text files with 10 million words. I have to index those words. What is the fastest way to read from a text file character by character? Here is my initial attempt:

InputStream ist = new FileInputStream(this.path + "/" + doc);
BufferedReader in = new BufferedReader(new InputStreamReader(ist));

String line;

while ((line = in.readLine()) != null) {
    line = line.toUpperCase(Locale.ENGLISH);
    String word = "";

    // j must stay below line.length(); charAt(line.length()) would throw
    for (int j = 0; j < line.length(); j++) {
        char c = line.charAt(j);
        // OPERATIONS
    }
}

Upvotes: 4

Views: 9573

Answers (3)

user207421

Reputation: 311052

Don't read lines and then rescan the lines char by char. That way you are processing every character twice. Just read chars via BufferedReader.read().
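As a rough illustration of that suggestion, here is a minimal sketch that reads one character at a time with BufferedReader.read() and builds words as it goes. The class name CharByCharReader, the method readWords, and the rule that a word is a run of letters or digits are assumptions made for this example, not something from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named for illustration only.
class CharByCharReader {

    // Reads the source one character at a time and collects upper-cased words.
    // Assumption: a word is a maximal run of letters or digits.
    static List<String> readWords(Reader source) throws IOException {
        List<String> words = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        try (BufferedReader in = new BufferedReader(source)) {
            int c;
            // read() returns -1 at end of stream, otherwise the char as an int
            while ((c = in.read()) != -1) {
                if (Character.isLetterOrDigit(c)) {
                    word.append(Character.toUpperCase((char) c));
                } else if (word.length() > 0) {
                    // A non-word character ends the current word
                    words.add(word.toString());
                    word.setLength(0);
                }
            }
        }
        // Flush a word that runs up to the end of the input
        if (word.length() > 0) {
            words.add(word.toString());
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // For a real file, pass a FileReader or an InputStreamReader instead
        System.out.println(readWords(new StringReader("hello how are you?")));
    }
}
```

Each character is touched once here, and no intermediate line String is created. Whether that is measurably faster on your files is something to benchmark rather than assume.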

Upvotes: 1

zengr

Reputation: 38919

Switching to read() will not make a considerable difference in performance.

Read more: Peter Lawrey's comparison of read() and readLine()

Now, coming back to your original question:
Input string: hello how are you?
So you need to index the words of the line, i.e.:

BufferedReader r = new BufferedReader(new InputStreamReader(inputStream));
String line;
while ((line = r.readLine()) != null) {
   String[] splitString = line.split("\\s+");
   //Do stuff with the array here, i.e. construct the index.
}

Note: The pattern \\s+ treats any run of whitespace (spaces, tabs, etc.) as the delimiter, so the array contains only the words between them.
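To make the "construct the index" step a bit more concrete, here is a minimal sketch that counts how often each token occurs. The class name SplitIndexer, the method countWords, and the use of a plain word-count map as the "index" are assumptions for illustration; a real index would probably store file names or positions as well.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, named for illustration only.
class SplitIndexer {

    // Splits each line on whitespace and counts occurrences of every token.
    // Assumption: the "index" is simply token -> number of occurrences.
    static Map<String, Integer> countWords(Reader source) throws IOException {
        Map<String, Integer> counts = new HashMap<>();
        try (BufferedReader r = new BufferedReader(source)) {
            String line;
            while ((line = r.readLine()) != null) {
                // trim() avoids an empty first token when the line starts with whitespace
                for (String token : line.trim().split("\\s+")) {
                    if (!token.isEmpty()) { // a blank line still yields one empty token
                        counts.merge(token, 1, Integer::sum);
                    }
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Integer> index =
                countWords(new StringReader("hello how are you?\nhow are things"));
        System.out.println(index.get("how")); // prints 2
    }
}
```

Note that splitting on whitespace alone keeps punctuation attached, so "you?" is indexed as a token of its own. Strip punctuation first if that matters for your index.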

Upvotes: 1

Mechkov

Reputation: 4324

InputStreamReader's read() method can read a character at a time.

You can use a FileReader (which is a subclass of InputStreamReader) or wrap the reader in a BufferedReader, for example.

Hope this helps!

Upvotes: 0
