Reputation: 103
I've got a CSV file with 3 million+ rows.
The format is supposed to be like so: date, name, num1, num2
e.g.
2019-05-07, New York, 10, 3
2019-05-08, New York, 15, 5
2019-05-09, New York, 12, 6
and so on...
The problem is every 5,000 rows or so, the "Name" column will have commas in its value.
e.g. 2019-05-09, Denver, Colorado, 10, 9
My script expects 4 columns, so it reads these rows incorrectly and fails. Some values in the name column even have 3 commas.
Note the Name column values are not enclosed in quotes, so that's why it's giving me the error.
Is there a way to detect these extra commas? I don't think there is, so I'm beginning to think this 3m+ row file is impossible to parse.
Upvotes: 0
Views: 95
Reputation: 5763
To parse, you can split into an array, then use shift and pop for the peripheral fields. Finally, you can just join on what's left:
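let line = '2019-05-09, Denver, Colorado, 10, 9';
let entries = line.split(',');
let parsed = {
  // The first field is always the date.
  date: entries.shift().trim(),
  // The last two fields are always the numbers; pop them in reverse order.
  num2: entries.pop().trim(),
  num1: entries.pop().trim(),
  // Whatever remains is the name, with its embedded commas restored.
  name: entries.join(',').trim()
};
console.log(parsed);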
So, to answer your question: No, your csv file is not unreadable, FOR NOW. If columns can be appended in the future, and such columns suffer the same issue as "name", you're in trouble. It's probably wiser to push back on the developer of the file and get them to properly quote it. You would not be out of line.
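If pushing back on whoever produces the file isn't an option, one possibility is to repair it yourself by re-emitting each line with the name quoted. This is only a sketch built on the same shift/pop idea: `toQuotedCsvLine` is a hypothetical helper name, and it assumes the date, num1 and num2 fields never contain commas themselves.

```javascript
// Hypothetical helper: re-emit one malformed line as properly quoted CSV.
// Assumes only the name field (the second column) can contain commas.
function toQuotedCsvLine(line) {
  const entries = line.split(',');
  if (entries.length < 4) {
    throw new Error(`Expected at least 4 fields, got ${entries.length}: ${line}`);
  }
  const date = entries.shift().trim();
  const num2 = entries.pop().trim();
  const num1 = entries.pop().trim();
  const name = entries.join(',').trim();
  // Wrap the name in double quotes and double any quotes already inside it.
  const quotedName = `"${name.replace(/"/g, '""')}"`;
  return [date, quotedName, num1, num2].join(',');
}

console.log(toQuotedCsvLine('2019-05-09, Denver, Colorado, 10, 9'));
// 2019-05-09,"Denver, Colorado",10,9
```

Once every line has been rewritten this way, an ordinary CSV parser should be able to read the result.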
Upvotes: 1
Reputation: 80
It's not very efficient, but if the column in question is always cities and states, you could do a find/replace for any states in the file before running your script (e.g. find ", Colorado" and replace it with " Colorado").
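A rough sketch of that find/replace in JavaScript, in case it helps. `STATES` is a made-up sample list (you'd need the full list of states that appear in your data) and `removeStateCommas` is a hypothetical name. It skips everything up to the first comma so that a name that is itself a state name, like "New York" right after the date, is left alone.

```javascript
// Hypothetical sample list; a real run would need every state in the data.
const STATES = ['Colorado', 'New York'];

// Replace ", <State>" with " <State>" everywhere after the date field.
function removeStateCommas(line, states = STATES) {
  const firstComma = line.indexOf(',');
  if (firstComma === -1) return line; // nothing to fix on this line
  const head = line.slice(0, firstComma + 1); // the date plus its comma
  let rest = line.slice(firstComma + 1);
  for (const state of states) {
    // split/join does a literal replace-all without regex escaping.
    rest = rest.split(`, ${state}`).join(` ${state}`);
  }
  return head + rest;
}

console.log(removeStateCommas('2019-05-09, Denver, Colorado, 10, 9'));
// 2019-05-09, Denver Colorado, 10, 9
```

Note this only helps when the extra commas always come right before a known state name; any other name with a comma in it will still break the 4-column split.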
Upvotes: 0
Reputation: 1135
Well, nothing is impossible per se... you can, for example, work backwards and look for the first column (delimited by the first comma), the last two columns (by looking for the last 2 commas) and treat everything in between as the name. But you'll need to implement your own parsing function as I doubt a library would deal with invalid CSV like the one you have.
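Something along these lines might do as that parsing function. It's a minimal sketch, assuming only the name can contain commas; `parseByCommaPositions` is just an illustrative name.

```javascript
// Locate the first comma and the last two commas; everything in between
// is treated as the name, however many commas it contains.
function parseByCommaPositions(line) {
  const first = line.indexOf(',');
  const last = line.lastIndexOf(',');
  const secondLast = line.lastIndexOf(',', last - 1);
  // A valid line needs at least three distinct commas.
  if (first === -1 || secondLast <= first) {
    throw new Error(`Line does not have at least 4 fields: ${line}`);
  }
  return {
    date: line.slice(0, first).trim(),
    name: line.slice(first + 1, secondLast).trim(),
    num1: line.slice(secondLast + 1, last).trim(),
    num2: line.slice(last + 1).trim(),
  };
}

console.log(parseByCommaPositions('2019-05-09, Denver, Colorado, 10, 9'));
// { date: '2019-05-09', name: 'Denver, Colorado', num1: '10', num2: '9' }
```

Because it never splits the whole line into an array, this should also stay cheap on lines where the name has many commas.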
Upvotes: 0