Reputation: 1543
I have a long string containing CSV data from a file. I want to store it in a JavaScript Array of Arrays. But one column has arbitrary text in it. That text could contain double-quotes and commas.
Splitting the CSV string into separate row strings is no problem:
var theRows = theCsv.split(/\r?\n/);
But then how would I best split each row?
Since it's CSV data I need to split on commas. But
var theArray = new Array();
for (var i = 0; i < theRows.length; i++) {
  theArray[i] = theRows[i].split(',');
}
won't work for elements containing quotes and commas, like this example:
512,"""Fake News"" and the ""Best Way"" to deal with A, B, and C", 1/18/2019,media
How can I make sure that 2nd element gets properly stored in a single array element as
"Fake News" and the "Best Way" to deal with A, B, and C
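To make the failure concrete, here is a minimal sketch of what the naive split does to the example row above. The variable names `row` and `naive` are only for illustration.

```javascript
// The example row from above, as a JavaScript string literal.
const row = '512,"""Fake News"" and the ""Best Way"" to deal with A, B, and C", 1/18/2019,media';

// Splitting on every comma ignores the quoting rules entirely.
const naive = row.split(',');

console.log(naive.length); // 6 pieces instead of the 4 fields in the row
console.log(naive[1]);     // """Fake News"" and the ""Best Way"" to deal with A
console.log(naive[2]);     // " B" (a fragment of the quoted field)
```

The quoted field is cut into three pieces and its doubled quotes are left unescaped.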
Thanks.
The suggested solution looked similar, but unfortunately the CSVtoArray function there did not work when I tried it. Instead of returning array elements, it returned null, as described in my comment below.
Upvotes: 6
Views: 1390
Reputation: 430
This should do it:
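let parseRow = function(row) {
  let isInQuotes = false;
  let values = [];
  let val = '';
  for (let i = 0; i < row.length; i++) {
    switch (row[i]) {
      case ',':
        if (isInQuotes) {
          // A comma inside quotes is part of the value.
          val += row[i];
        } else {
          // A comma outside quotes ends the current value.
          values.push(val);
          val = '';
        }
        break;
      case '"':
        if (isInQuotes && i + 1 < row.length && row[i + 1] === '"') {
          // A doubled quote inside quotes is an escaped literal quote.
          val += '"';
          i++;
        } else {
          // Any other quote opens or closes a quoted section.
          isInQuotes = !isInQuotes;
        }
        break;
      default:
        val += row[i];
        break;
    }
  }
  // Push the last value, which has no trailing comma.
  values.push(val);
  return values;
};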
It will return the values in an array:
parseRow('512,"""Fake News"" and the ""Best Way"" to deal with A, B, and C", 1/18/2019,media');
// => ['512', '"Fake News" and the "Best Way" to deal with A, B, and C', ' 1/18/2019', 'media']
To get the requested array of arrays you can do:
let parsedCsv = theCsv.split(/\r?\n/).map(parseRow);
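One caveat, assuming the free-text column may also contain line breaks inside quotes: splitting on newlines first would cut such a row in two. The following is a minimal sketch of a single-pass variant that applies the same quote tracking to the whole string. The name `parseCsv` is hypothetical, and treating a lone `\r` as a line break is an assumption of this sketch.

```javascript
// Hypothetical helper: parses a whole CSV string in one pass, so that
// line breaks inside quoted fields do not end the row.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let val = '';
  let isInQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (isInQuotes) {
      if (ch === '"') {
        if (text[i + 1] === '"') {
          val += '"'; // doubled quote: escaped literal quote
          i++;
        } else {
          isInQuotes = false; // closing quote
        }
      } else {
        val += ch; // commas and line breaks are literal inside quotes
      }
    } else if (ch === '"') {
      isInQuotes = true; // opening quote
    } else if (ch === ',') {
      row.push(val);
      val = '';
    } else if (ch === '\n' || ch === '\r') {
      if (ch === '\r' && text[i + 1] === '\n') i++; // treat \r\n as one break
      row.push(val);
      rows.push(row);
      row = [];
      val = '';
    } else {
      val += ch;
    }
  }
  // Flush the last row unless the input ended with a line break.
  if (val !== '' || row.length > 0) {
    row.push(val);
    rows.push(row);
  }
  return rows;
}

const parsed = parseCsv('id,text\r\n1,"line one\nline two, with comma"\r\n');
console.log(parsed.length); // 2
console.log(parsed[1][1]);  // the two-line text, kept in a single element
```

Unlike the split-based one-liner, this sketch does not produce an extra row for a trailing line break at the end of the input.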
The code might look a little obscure, but the basic idea is as follows: we parse the string character by character. When we encounter a " we set isInQuotes = true. This changes how , and "" are handled: inside quotes a comma is part of the value, and "" stands for one literal double quote. When we then encounter a single " we set isInQuotes = false again.
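The state changes described above can be made visible with a small tracing sketch. The function `traceQuotes` is a hypothetical helper written only for this illustration. It follows the same rules as parseRow but records what each comma and quote is interpreted as, instead of collecting values.

```javascript
// Hypothetical helper: records how each comma and double quote in a row
// is interpreted, using the same rules as parseRow.
function traceQuotes(row) {
  let isInQuotes = false;
  const trace = [];
  for (let i = 0; i < row.length; i++) {
    const ch = row[i];
    if (ch === '"') {
      if (isInQuotes && row[i + 1] === '"') {
        trace.push('escaped quote'); // "" inside quotes
        i++;
      } else {
        isInQuotes = !isInQuotes;
        trace.push(isInQuotes ? 'open' : 'close');
      }
    } else if (ch === ',') {
      trace.push(isInQuotes ? 'literal comma' : 'separator');
    }
  }
  return trace;
}

const trace = traceQuotes('a,"b,""c"');
console.log(trace);
// => ['separator', 'open', 'literal comma', 'escaped quote', 'close']
```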
Upvotes: 3