Gilles

Reputation: 167

Why do I have a trailing column when reading a CSV file?

I have a CSV file with the following structure:

"customer_id";"customer_name";"quantity";
"id1234";"Henry";"15";

Parsing with Ruby's standard CSV lib:

require 'csv'

csv_data = CSV.read(pathtofile,
    :headers => :first_row,
    :col_sep => ";",
    :quote_char => '"',
    :row_sep => "\r\n" # setting it to "\r" or "\n" results in MalformedCSVError
)

puts csv_data.headers.count #4

I don't understand why parsing yields four columns when the file contains only three. Is this not the right way to parse the file?

Upvotes: 3

Views: 546

Answers (2)

the Tin Man

Reputation: 160571

The trailing ; is the culprit.

You can preprocess the file, stripping the trailing ;, but that incurs unnecessary overhead.

You can post-process the returned array of data from CSV using something like this:

csv_data = CSV.read(...).each(&:pop)

That iterates over the sub-arrays, removing the last element from each. The problem is that read loads the entire file into memory, so it doesn't scale to large files. You might want to use CSV.foreach instead, which reads the file line by line, and pop the last value from each row as it is yielded to you.
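As a rough sketch of the CSV.foreach approach, assuming the file is read without the :headers option so that each row is a plain Array. The temporary file and its contents are only illustrative stand-ins for the real file:

```ruby
require 'csv'
require 'tempfile'

# Hypothetical sample file that mirrors the layout in the question.
sample = Tempfile.new(['customers', '.csv'])
sample.write(%("customer_id";"customer_name";"quantity";\n"id1234";"Henry";"15";\n))
sample.close

rows = []
CSV.foreach(sample.path, col_sep: ';', quote_char: '"') do |row|
  row.pop      # drop the empty field produced by the trailing ';'
  rows << row  # in real code, process the row here instead of collecting it
end
```

Collecting into rows is only for demonstration; handling each row inside the block is what keeps memory use flat.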

Upvotes: 0

Michael Durrant

Reputation: 96544

The ; at the end of each row implies another field, even though it has no value.
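A small check makes this visible. The sketch below parses a single line in the question's format with CSV.parse_line; the variable names are just for illustration:

```ruby
require 'csv'

line = '"id1234";"Henry";"15";'

# The separator after "15" opens a fourth, empty field, which CSV reports as nil.
fields = CSV.parse_line(line, col_sep: ';')
puts fields.size  # 4
```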

I would either remove the trailing semicolons or just ignore the fourth field once it is parsed.
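Both options could look roughly like this. It is a sketch, not a definitive implementation: the inline sample data is illustrative, and the regular expression assumes no quoted field contains a ';' immediately before an embedded line break:

```ruby
require 'csv'

# Sample data in the question's layout (contents are illustrative).
raw = %("customer_id";"customer_name";"quantity";\r\n"id1234";"Henry";"15";\r\n)

# Option 1: strip the separator that ends each line before parsing.
cleaned = raw.gsub(/;(?=\r?\n|\z)/, '')
table = CSV.parse(cleaned, headers: true, col_sep: ';')

# Option 2: parse as-is and drop the column whose header is nil.
records = CSV.parse(raw, headers: true, col_sep: ';').map do |row|
  row.to_h.reject { |header, _value| header.nil? }
end
```

Option 1 costs an extra pass over the text; option 2 leaves the input untouched but needs the filter wherever rows are consumed.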

Upvotes: 6
