Reputation: 314
I'm working in Java and want to extract data by column from a text file.
"myfile.txt" contents:
ID SALARY RANK
065 12000 1
023 15000 2
035 25000 3
076 40000 4
I want to extract the data individually by column, i.e. ID, SALARY, RANK, etc.
Basically, I want to perform operations on the individual values in each column.
I've read the data from "myfile.txt" line by line using a while loop:
while ((line = b.readLine()) != null) {
    stringBuff.append(line + "\n");
}
link: Reading selective column data from a text file into a list in Java
The answer at the above link suggests using the following: String[] columns = line.split(" ");
But how do I use it correctly? Any hint or help would be appreciated.
Upvotes: 0
Views: 12040
Reputation: 2503
You can use a regex to split on one or more whitespace characters, for example:
// Requires: import java.util.Arrays; import java.util.Scanner;
String text = "ID SALARY RANK\n" +
        "065 12000 1\n" +
        "023 15000 2\n" +
        "035 25000 3\n" +
        "076 40000 4\n";
Scanner scanner = new Scanner(text);
// Read the first line; I assume the file always has a header.
String nextLine = scanner.nextLine();
// Regex that splits on any amount of whitespace.
String regex = "(\\s)+";
String[] header = nextLine.split(regex);
// This prints all the column names; you can access each one
// by array index, e.g. header[0], header[1], header[2]...
System.out.println(Arrays.toString(header));
// Read the rows.
while (scanner.hasNext()) {
    String[] row = scanner.nextLine().split(regex);
    // This prints all the columns of the row; you can access each
    // one by array index, e.g. row[0], row[1], row[2]...
    System.out.println(Arrays.toString(row));
    System.out.println(row[0]); // first column (ID)
}
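If you want to look values up by column name rather than by index, one option is to collect each column into its own list. This is only a sketch building on the snippet above: the class name ColumnTable and the method readColumns are made up for illustration, and it assumes the first line is the header and that cells never contain whitespace.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

// Hypothetical helper (not from the answer above): groups cell values by column name.
class ColumnTable {

    // Reads a whitespace-separated table whose first line is assumed to be the header.
    static Map<String, List<String>> readColumns(Scanner scanner) {
        Map<String, List<String>> columns = new LinkedHashMap<>();
        if (!scanner.hasNextLine()) {
            return columns; // empty input, nothing to read
        }
        String[] header = scanner.nextLine().trim().split("\\s+");
        for (String name : header) {
            columns.put(name, new ArrayList<>());
        }
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine().trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] row = line.split("\\s+");
            // Assumption: cells beyond the header width are ignored,
            // and short rows simply contribute fewer values.
            for (int i = 0; i < header.length && i < row.length; i++) {
                columns.get(header[i]).add(row[i]);
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        String text = "ID SALARY RANK\n"
                + "065 12000 1\n"
                + "023 15000 2\n"
                + "035 25000 3\n"
                + "076 40000 4\n";
        Map<String, List<String>> columns = readColumns(new Scanner(text));
        System.out.println(columns.get("SALARY")); // prints [12000, 15000, 25000, 40000]
    }
}
```

To read from the actual file, you could pass new Scanner(new File("myfile.txt")) instead of new Scanner(text); that constructor throws FileNotFoundException, so the caller has to handle it.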
Upvotes: 4
Reputation: 19798
while ((line = b.readLine()) != null) {
    String[] columns = line.split(" ");
    System.out.println("my first column : " + columns[0]);
    System.out.println("my second column : " + columns[1]);
    System.out.println("my third column : " + columns[2]);
}
Now, instead of System.out.println, do whatever you want with your columns.
But I think your columns are separated by tabs, so you might want to use split("\t") instead.
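As an example of "doing something" with a column, here is a rough sketch that totals the SALARY values. The names SalaryTotal and sumColumn are hypothetical, and it assumes the lines have already been read into a list with the header as the first entry. It splits on "\\s+" so it works whether the separator is spaces or tabs, and it skips the header because parsing "SALARY" as a number would throw a NumberFormatException.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example class: sums one numeric column of a whitespace-separated table.
class SalaryTotal {

    // Sums the integer values found at columnIndex, skipping the header line.
    static long sumColumn(List<String> lines, int columnIndex) {
        long total = 0;
        // Start at 1: assumption that line 0 is the "ID SALARY RANK" header.
        for (int i = 1; i < lines.size(); i++) {
            String line = lines.get(i).trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] columns = line.split("\\s+"); // matches spaces and tabs alike
            total += Long.parseLong(columns[columnIndex]);
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "ID SALARY RANK",
                "065 12000 1",
                "023 15000 2",
                "035 25000 3",
                "076 40000 4");
        System.out.println(sumColumn(lines, 1)); // prints 92000
    }
}
```

Note that the IDs have leading zeros (065, 023), so you probably want to keep that column as a String rather than parse it as a number.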
Upvotes: 3