Reputation: 4471
I need to deal with a CSV file that actually contains several tables, like this:
"-------------------- Section 1 --------------------"
"Identity:","ABC123"
"Initials:","XY"
"Full Name:","Roger"
"Street Address:","Foo St"
"-------------------- Section 2 --------------------"
"Line","Date","Time","Status",
"1","30/01/2013","10:49:00 PM","ON",
"2","31/01/2013","8:04:00 AM","OFF",
"3","31/01/2013","11:54:00 PM","OFF",
"-------------------- Section 3 --------------------"
I'd like to parse the blocks in each section with something like commons-csv, but it would be helpful to handle each section individually, stopping at the double newline as if it were the end of the file. Has anyone tackled this problem already?
NOTE: Files can be arbitrarily long and can contain any number of sections, so I'm after a single pass if possible. Each section appears to start with a titled heading (------- title ------\n\n) and end with two empty lines.
Upvotes: 0
Views: 1562
Reputation: 11551
How about using java.io.FilterReader? You can work out which Reader methods you need to override by trial and error. Your custom class will have to read ahead an entire line and check whether it is a 'Section' line. If it is, return EOF to stop the commons-csv parser. You can then read the next section from the same instance of your custom class. Not elegant, but it would probably work. For example:
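import java.io.BufferedReader;
import java.io.FilterReader;
import java.io.IOException;

class MyReader extends FilterReader {
    private String line;   // line currently being served, or null if none is buffered
    private int pos;       // index of the next character of line to return

    public MyReader(BufferedReader in) {
        super(in);
        line = null;
        pos = 0;
    }

    @Override
    public int read() throws IOException {
        if (line == null || pos >= line.length()) {
            // Fetch the next non-empty line.
            do {
                line = ((BufferedReader) in).readLine();
            } while (line != null && line.length() == 0);
            if (line == null) return -1;              // real end of file
            if (line.contains("-------------------- Section ")) {
                line = null;                          // drop the heading and report EOF for this section
                return -1;
            }
            line = line + "\r\n";
            pos = 0;
        }
        return line.charAt(pos++);
    }

    // FilterReader passes bulk reads straight to the wrapped reader, which would
    // bypass the section check, so serve them from the current line as well.
    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) return 0;
        int c = read();
        if (c == -1) return -1;
        cbuf[off] = (char) c;
        int n = 1;
        // Copy the rest of the current line, never crossing into the next one.
        while (n < len && pos < line.length()) {
            cbuf[off + n++] = line.charAt(pos++);
        }
        return n;
    }
}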
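You would use it like so:
public void run() throws Exception {
    BufferedReader in = new BufferedReader(new FileReader(ReadRecords.class.getResource("/records.txt").getFile()));
    MyReader reader = new MyReader(in);
    int c;
    // Each loop runs until read() returns -1, i.e. until the next section heading or the end of the file.
    // The sample file begins with a heading, so this first loop ends immediately without printing anything.
    while ((c = reader.read()) != -1) {
        System.out.print((char) c);
    }
    // Prints Section 1.
    while ((c = reader.read()) != -1) {
        System.out.print((char) c);
    }
    // Prints Section 2.
    while ((c = reader.read()) != -1) {
        System.out.print((char) c);
    }
    reader.close();
}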
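The three hard-coded loops only cover the sample file, and a -1 from read() does not say whether a section or the whole file has ended. Below is a minimal sketch of one way to handle any number of sections, assuming the same heading text as in the question. SectionReader, nextSection() and SectionReaderDemo are hypothetical names made up for this illustration, and the sketch assumes nothing useful appears before the first heading.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reader that presents one section at a time as if it were a whole file. */
class SectionReader extends Reader {
    private static final String MARKER = "-------------------- Section ";
    private final BufferedReader in;
    private String line;        // line currently being served
    private int pos;            // next character of line to return
    private boolean atHeading;  // the last line fetched was a section heading
    private boolean atEof;      // the underlying reader is exhausted

    SectionReader(Reader source) {
        this.in = new BufferedReader(source);
    }

    /** Moves to the next section; returns false once the input is exhausted. */
    boolean nextSection() throws IOException {
        while (!atHeading && !atEof) {
            fill();             // discard whatever is left of the current section
        }
        if (atEof) return false;
        atHeading = false;
        return true;
    }

    /** Loads the next non-empty line, noting headings and end of input. */
    private void fill() throws IOException {
        do {
            line = in.readLine();
        } while (line != null && line.isEmpty());
        pos = 0;
        if (line == null) {
            atEof = true;
        } else if (line.contains(MARKER)) {
            atHeading = true;
            line = null;
        } else {
            line = line + "\n";
        }
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (atHeading || atEof) return -1;
        if (line == null || pos >= line.length()) {
            fill();
            if (line == null) return -1;   // reached a heading or the end of input
        }
        int n = Math.min(len, line.length() - pos);
        line.getChars(pos, pos + n, cbuf, off);
        pos += n;
        return n;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}

class SectionReaderDemo {
    /** Collects the text of each section; a CSV parser could consume the reader instead. */
    static List<String> readAll(String content) {
        List<String> sections = new ArrayList<>();
        try (SectionReader reader = new SectionReader(new StringReader(content))) {
            char[] buf = new char[256];
            while (reader.nextSection()) {
                StringBuilder sb = new StringBuilder();
                int n;
                while ((n = reader.read(buf, 0, buf.length)) != -1) {
                    sb.append(buf, 0, n);
                }
                sections.add(sb.toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sections;
    }

    public static void main(String[] args) {
        String content = "\"-------------------- Section 1 --------------------\"\n"
                + "\"Identity:\",\"ABC123\"\n\n\n"
                + "\"-------------------- Section 2 --------------------\"\n"
                + "\"Line\",\"Date\",\n\"1\",\"30/01/2013\",\n";
        readAll(content).forEach(s -> System.out.println("[section]\n" + s));
    }
}
```

If the CSV parser closes the reader it is given when it finishes, later sections would be lost, so you may need to hand it a wrapper whose close() does nothing.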
Upvotes: 3
Reputation: 19895
Assuming that the file contains text in 2 sections, delineated as per the example, processing it is straightforward, e.g.:
1. Create a Java BufferedReader object to read the file line by line.
2. Read lines up to the CSV header (Section 2).
3. Set up a CSV parser (commons-csv or other) using the header and the other parameters (comma separator, quotes etc.).
The parser will provide some iterator-like API to read each line into a Java object, from which reading the fields will be trivial. This approach is vastly superior to pre-loading everything in memory, because it can accommodate any file size.
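The steps above could be sketched roughly as follows. This is only an illustration under some assumptions: splitRow() is a hypothetical stand-in for a real CSV parser such as commons-csv (it ignores escaped quotes and drops an empty last field), and TwoSectionParser, parseSection2() and parseString() are made-up names. Each row is handed to a callback as it is read, so nothing beyond the current row is kept in memory by the parsing loop itself.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class TwoSectionParser {
    static final String HEADING = "-------------------- Section ";

    /** Hypothetical minimal row splitter: quoted fields, no escaped quotes. */
    static List<String> splitRow(String row) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean quoted = false;
        for (char ch : row.toCharArray()) {
            if (ch == '"') {
                quoted = !quoted;
            } else if (ch == ',' && !quoted) {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(ch);
            }
        }
        if (field.length() > 0) fields.add(field.toString()); // ignores the trailing comma
        return fields;
    }

    /** Streams the rows of Section 2 to the handler, keyed by the header names. */
    static void parseSection2(BufferedReader in, Consumer<Map<String, String>> handler)
            throws IOException {
        String line;
        // Skip everything up to and including the Section 2 heading.
        do { line = in.readLine(); } while (line != null && !line.contains(HEADING + "2 "));
        // The first non-empty line after the heading is the CSV header.
        do { line = in.readLine(); } while (line != null && line.isEmpty());
        if (line == null) return;
        List<String> header = splitRow(line);
        // Read data rows until the next heading or the end of the file.
        while ((line = in.readLine()) != null && !line.contains(HEADING)) {
            if (line.isEmpty()) continue;
            List<String> values = splitRow(line);
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < header.size() && i < values.size(); i++) {
                row.put(header.get(i), values.get(i));
            }
            handler.accept(row);
        }
    }

    /** Convenience wrapper for small inputs: collects all rows into a list. */
    static List<Map<String, String>> parseString(String content) {
        List<Map<String, String>> rows = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new StringReader(content))) {
            parseSection2(in, rows::add);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        String content = "\"-------------------- Section 1 --------------------\"\n"
                + "\"Identity:\",\"ABC123\"\n\n\n"
                + "\"-------------------- Section 2 --------------------\"\n"
                + "\"Line\",\"Date\",\"Time\",\"Status\",\n"
                + "\"1\",\"30/01/2013\",\"10:49:00 PM\",\"ON\",\n";
        parseString(content).forEach(System.out::println);
    }
}
```

For a real file you would pass a BufferedReader over a FileReader to parseSection2() and process each row in the callback rather than collecting them.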
Upvotes: 0
Reputation: 220762
You can use String.split() to access the individual CSV sections (this assumes the whole file has already been read into the string content):
for (String csv : content.split("\"----+ Section \\d+ ----+\"")) {
    // Skip empty sections
    if (csv.length() == 0) continue;
    // parse and process each individual "csv" section here
}
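Since the question mentions arbitrarily long files, the same regular expression can also be used as a delimiter for java.util.Scanner, which reads from a Reader instead of needing the whole content as one string. A minimal sketch, where ScannerSections and sections() are hypothetical names. Note that each individual section is still held in memory as one string, so this helps with many sections rather than with one enormous section.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

class ScannerSections {
    // Same pattern as in the split() call: a quoted heading line.
    static final Pattern HEADING = Pattern.compile("\"----+ Section \\d+ ----+\"");

    /** Returns the non-blank sections found between headings, trimmed of surrounding newlines. */
    static List<String> sections(Reader source) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(source)) {
            scanner.useDelimiter(HEADING);
            while (scanner.hasNext()) {
                String csv = scanner.next().trim();
                // Skip empty sections
                if (csv.isEmpty()) continue;
                // A real program would parse and process csv here instead of collecting it.
                result.add(csv);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String content = "\"-------------------- Section 1 --------------------\"\n"
                + "\"Identity:\",\"ABC123\"\n\n\n"
                + "\"-------------------- Section 2 --------------------\"\n"
                + "\"Line\",\"Date\",\n\"1\",\"30/01/2013\",\n";
        sections(new StringReader(content)).forEach(s -> System.out.println("[section]\n" + s));
    }
}
```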
Upvotes: 1