jotadepicas
jotadepicas

Reputation: 2493

MATLAB / Octave - how to parse CSV file with numbers and strings that contain commas

I have a CSV file that has 20 columns. Some of the columns have number values, others have text values, and the text ones may or may not contain commas.

CSV content example:

column1, column2, column3, column4
"text value 1", 123, "text, with a comma", 25
"another, comma", 456, "other text", 78

I'm using textscan function, but I'm getting the most buggy and weird behavior. With some arguments, it reads all the values in only one column, sometimgs it repeats columns, and most of the things I've tried lead to the commas being incorrectly interpreted as column separators (despite text being enclosed in double quotes). That is, I've tried specifying 'delimiter' argument, and also including literals in the format specification, to no avail.

What's the correct way of invoking textscan to deal with a CSV file as the example above? I'm looking for a solution that runs both on MATLAB and on Octave (or, if that's not possible, the equivalent solution in each one).

Upvotes: 3

Views: 8004

Answers (2)

Andy
Andy

Reputation: 8091

For GNU Octave, using io package

pkg load io
c = csv2cell ("jota.csv")

gives

c = 
{
  [1,1] = column1
  [2,1] = text value 1
  [3,1] = another, comma
  [1,2] =  column2
  [2,2] =  123
  [3,2] =  456
  [1,3] =  column3
  [2,3] =  text, with a comma
  [3,3] =  other text
  [1,4] =  column4
  [2,4] =  25
  [3,4] =  78
}

btw, you should explicitly mention if the solution should run on GNU Octave, Matlab or both

Upvotes: 9

ali14
ali14

Reputation: 124

First, read the column headers using the format '%s' four times:

fileID = fopen(filename);
C_text = textscan(fileID,'%s', 4,'Delimiter',',');

Then use the conversion specifier, %q to read the text enclosed by double quotation marks ("):

C = textscan(fileID,'%q %d %q %d','Delimiter',',');
fclose(fileID);

(This works for reading your sample data on Octave. It should work on MATLAB, too.)

Edit: removed redundant fopen.

Upvotes: 2

Related Questions