user755806

Reputation: 6825

Issue while reading tab delimited text file?

I have a tab-delimited file and I have to read the data from it.

Col1    Col2    Col3
data1   data2   data3
data1   data2   data3

If values are present for all columns, there are no issues. The problem is that sometimes a few columns may not contain values, as below.

Col1    Col2    Col3
data1           data3
data1   data2   

With the data above, I am able to read the first row, since col2's value comes back as an empty string. But the second row's col3 has no data, and there I get an ArrayIndexOutOfBoundsException. Why am I not getting an empty string for the second row's col3?

I am using the code below:

String dataFileName = "C:\\Documents and Settings\\User1\\some.txt";

// Create a buffered reader to read the file
BufferedReader bReader = new BufferedReader(
        new FileReader(dataFileName));

String line;

// Loop until all lines in the file are read
while ((line = bReader.readLine()) != null) {

    // Split the tab-separated line
    String datavalue[] = line.split("\t");
    String value1 = datavalue[0];
    String value2 = datavalue[1];
    String value3 = datavalue[2];
}

Thanks!

Upvotes: 3

Views: 4843

Answers (2)

OldCurmudgeon

Reputation: 65869

By default, String.split discards trailing empty strings, so a row that ends with a tab yields a shorter array. Pass a negative limit as the second parameter to keep them:

String datavalue[] = (line+"\t\t\t").split("\t",-1);

Also, in case the original file is missing trailing tabs, appending extra tabs to the line (as above) guarantees at least three fields and stops this breaking your code.
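
To make the effect of the limit argument concrete, here is a minimal sketch comparing the one-argument split with a negative limit on the rows from the question. The class name SplitLimitDemo and the helper splitKeepingEmpty are made up for illustration.

```java
import java.util.Arrays;

public class SplitLimitDemo {

    // Hypothetical helper: split a tab-delimited line and keep
    // trailing empty fields by passing a negative limit.
    static String[] splitKeepingEmpty(String line) {
        return line.split("\t", -1);
    }

    public static void main(String[] args) {
        String row = "data1\tdata2\t"; // third column is empty

        // One-argument split drops the trailing empty string.
        System.out.println(row.split("\t").length);        // prints 2

        // A negative limit keeps it, so index 2 is safe to read.
        System.out.println(splitKeepingEmpty(row).length); // prints 3

        // An empty field in the middle is kept either way.
        System.out.println(Arrays.toString(splitKeepingEmpty("data1\t\tdata3")));
    }
}
```

Note that a negative limit alone does not help if a line has fewer tabs than expected; that is the case the extra appended tabs are meant to cover.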

Upvotes: 2

Francisco Spaeth

Reputation: 23913

The lazy way would be something like:

...
String datavalue[] = Arrays.copyOf(line.split("\t"),3);
String value1 = datavalue[0];
String value2 = datavalue[1];
String value3 = datavalue[2];
...

Basically, you split the content and copy it into a new array of length 3. Any padded elements are null (not empty strings), as documented for Arrays.copyOf:

Copies the specified array, truncating or padding with nulls (if necessary) so the copy has the specified length. For all indices that are valid in both the original array and the copy, the two arrays will contain identical values. For any indices that are valid in the copy but not the original, the copy will contain null. Such indices will exist if and only if the specified length is greater than that of the original array. The resulting array is of exactly the same class as the original array.
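
Since the padded elements are null, code that expects empty strings needs one more step. The sketch below is one possible way to do that, assuming a fixed column count; the class PadFieldsDemo and the helper padFields are hypothetical names, not part of any library.

```java
import java.util.Arrays;

public class PadFieldsDemo {

    // Hypothetical helper: split on tabs, pad (or truncate) to the
    // expected column count, and turn padded nulls into empty strings.
    static String[] padFields(String line, int columns) {
        String[] fields = Arrays.copyOf(line.split("\t"), columns);
        for (int i = 0; i < fields.length; i++) {
            if (fields[i] == null) {
                fields[i] = "";
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String[] fields = padFields("data1\tdata2\t", 3);
        System.out.println(fields.length);       // prints 3
        System.out.println(fields[2].isEmpty()); // prints true
    }
}
```

Because Arrays.copyOf also truncates, any columns beyond the requested count are silently dropped with this approach.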

Upvotes: 3
