Sebastien

Reputation: 1328

Why is CSVParser reading the next CSVRecord?

I am seeing strange behavior when using org.apache.commons.csv.CSVParser.

I am trying to read a CSV file delimited by ; line by line, but my parser is skipping lines for an unknown reason.

Here is my code:

public static void main(String[] args) throws IOException {
    try (
        Reader reader = new BufferedReader(new FileReader("myFile.csv"));
        CSVParser parser = new CSVParser(reader, CSVFormat.DEFAULT.withDelimiter(';'));
    ) {
        if (!parser.iterator().hasNext()) {
            throw new RuntimeException("The file is empty.");
        }
        while (parser.iterator().hasNext()) { // <----- This skips a line!
            System.out.println(parser.iterator().next().get(0).trim());
        }
    }
}

So my console looks like:

line2
line4
line6
line8
line10
line12

etc...

So my problem is that the CSVParser skips a line every time hasNext() is called, and it shouldn't.

Is my code wrong? I am pretty sure that if I replaced the parser with an ArrayList, the iterator would work as expected. Is this a known bug? If so, can you point me to a workaround or a better library?

Upvotes: 2

Views: 3313

Answers (2)

amanin

Reputation: 4139

Well, by default, the parser considers the first line as the header (column definition), so it is skipped in the returned records. To include this line, you must prepare your formatting accordingly, using withSkipHeaderRecord.

EDIT: Sorry, I read the question too quickly. I thought only the first line was skipped.

Upvotes: -1

Arnaud

Reputation: 17534

The problem is that each iteration of your loop calls iterator(), which returns a NEW Iterator.

Things get weird from this point on, since the iterator has a current field storing the current record, and of course the current record of a new iterator is null.

In that case the iterator calls getNextRecord() from CSVParser (source code), which reads another record from the input, thus skipping a line.

If you want to stick with the iterator, just reuse the same instance:

Iterator<CSVRecord> iterator = parser.iterator();

while (iterator.hasNext()) {
    System.out.println(iterator.next().get(0).trim());
}
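
If it helps to see the mechanism in isolation, here is a minimal sketch that uses only the JDK. It is a simplified model of the behavior described above, not the library's actual source: LineSource, readBuggy and readFixed are hypothetical names made up for this illustration. The assumption is that every iterator() call returns a new iterator with its own look-ahead buffer, all reading from one shared reader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class FreshIteratorDemo {

    /** Hypothetical stand-in for the parser: each iterator() call returns a NEW iterator over the SAME reader. */
    static final class LineSource implements Iterable<String> {
        private final BufferedReader reader;

        LineSource(String text) {
            this.reader = new BufferedReader(new StringReader(text));
        }

        @Override
        public Iterator<String> iterator() {
            return new Iterator<String>() {
                // Look-ahead buffer; it belongs to this iterator instance only.
                private String current;

                @Override
                public boolean hasNext() {
                    if (current == null) {
                        current = readLine(); // consumes a line from the shared reader
                    }
                    return current != null;
                }

                @Override
                public String next() {
                    String line = current;
                    current = null;
                    if (line == null) {
                        line = readLine(); // nothing buffered here, so read another line
                        if (line == null) {
                            throw new NoSuchElementException();
                        }
                    }
                    return line;
                }
            };
        }

        private String readLine() {
            try {
                return reader.readLine();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Mirrors the loop from the question: a fresh iterator for every call. */
    public static List<String> readBuggy(String text) {
        LineSource source = new LineSource(text);
        List<String> seen = new ArrayList<>();
        while (source.iterator().hasNext()) {   // buffers a line in an iterator that is then discarded
            seen.add(source.iterator().next()); // another new iterator reads the following line
        }
        return seen;
    }

    /** Mirrors the fix: one iterator instance reused for the whole loop. */
    public static List<String> readFixed(String text) {
        LineSource source = new LineSource(text);
        List<String> seen = new ArrayList<>();
        Iterator<String> iterator = source.iterator();
        while (iterator.hasNext()) {
            seen.add(iterator.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        String text = "line1\nline2\nline3\nline4\n";
        System.out.println(readBuggy(text)); // [line2, line4]
        System.out.println(readFixed(text)); // [line1, line2, line3, line4]
    }
}
```

In the buggy loop, the line buffered by hasNext() is lost together with its throwaway iterator, so only every second line reaches the output. Note that with an odd number of lines this model would throw NoSuchElementException on the last next() call.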

Upvotes: 2
