Momo

Reputation: 11

Why can't the MATLAB textscan function read + 22.24 as a float?

I'm having a problem with the MATLAB function textscan. I have a data file whose lines look like this:

     1,2018/08/14 17:06:15,  0,+ 22.24,+ 22.46,+ 18.18,+0.0000,+0.0005,LLLLLLLLLL,LLLLLLLLLL,LLLL

or, when a sensor isn't working properly, like this:

   1,2018/07/11 17:02:53,  0,+ 23.88,+ 24.78,+ 23.65,+++++++,+ 23.94,+ 23.01,+ 24.33,LLLLLLLLLL,LLLLLLLLLL,LLLL

Since the data varies from file to file, I build a matching formatSpec from the header line. In the first case it looks like

formatSpec = '%*u %s %*u%f%f%f%f%f%*[^\n]'

and in the second case like

formatSpec = '%*u %s %*u%f%f%f%f%f%f%f%*[^\n]'
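The two format strings above differ only in how many %f fields they contain, so they could be generated rather than typed out. A minimal sketch, assuming the number of numeric columns has already been counted from the header line (the variable name nChannels is hypothetical and not from the original code):

```matlab
% nChannels is a hypothetical variable: the number of numeric columns,
% assumed to have been counted from the header line beforehand.
nChannels = 5;

% Skip the index, keep the timestamp as text, skip the third column,
% read nChannels floats, then discard the rest of the line.
formatSpec = ['%*u %s %*u', repmat('%f', 1, nChannels), '%*[^\n]'];
```

With nChannels = 7 the same line yields the format string for the second case.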

I am using the textscan function like this:

textscan(fileID, formatSpec_data, data_rows, 'Delimiter', ',', 'TreatAsEmpty', {'+++++++'},'EmptyValue', NaN, 'ReturnOnError', 0 );

but it keeps throwing an error with the message

Error using textscan
Mismatch between file and format character vector.
Trouble reading 'Numeric' field from file (row number 1, field number 4) ==> + 23.88,+ 24.78,+ 23.65,+++++++,+ 23.94,+ 23.01,+ 24.33,LLLLLLLLLL,LLLLLLLLLL,LLLL\n

Error in data_logger (line 31)
dataArray = textscan(fileID, formatSpec_data, data_rows, 'Delimiter', delimiter, 'HeaderLines' ,startRow, 'TreatAsEmpty', {'+++++++'},'EmptyValue', NaN, 'ReturnOnError', 0 );

When I deactivate 'ReturnOnError', textscan reads only the first row, and everything except the date/time string is empty. I also tried textscan without 'TreatAsEmpty' and/or 'EmptyValue', but I get the same result. I really don't understand why textscan has trouble reading e.g. ,+ 22.24 as a float. When I set formatSpec to read all the data as strings it works, but then I have to use str2num afterwards, which I'd rather avoid.

I'm grateful for any help and look forward to understanding this behaviour.

Upvotes: 1

Views: 249

Answers (1)

Mike Scannell

Reputation: 378

Short answer: MATLAB doesn't like the space between the + and the number in those fields. I think the simplest solution is to tell MATLAB to ignore the + by treating it as white space. Add the arguments 'WhiteSpace','+' when you call textscan, like this:

textscan(fileID, formatSpec_data, data_rows, 'Delimiter', ',', 'EmptyValue', NaN, 'ReturnOnError', 0 , 'WhiteSpace', '+');

Note that I also removed the 'TreatAsEmpty' argument, because once every + is treated as white space, a +++++++ field is empty anyway.

Another option would be to pre-parse the file and remove the space between the + and the number. You could read the file using fileread, do a replacement using strrep or regexprep, then run textscan on the result.

datain = fileread('mydatafile.csv');
datain = strrep(datain,'+ ','+');
textscan(datain, formatSpec_data, data_rows, 'Delimiter', ',', 'TreatAsEmpty', {'+++++++'},'EmptyValue', NaN, 'ReturnOnError', 0 );
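If the number of spaces after the sign is not always exactly one, or negative readings are written the same way (both are assumptions, since the sample lines only show a single space after a +), a regexprep variant of the replacement above might look like this sketch:

```matlab
% Sample line taken from the question; in practice this would be the
% character vector returned by fileread.
raw = '   1,2018/07/11 17:02:53,  0,+ 23.88,+ 24.78,+ 23.65,+++++++,+ 23.94';

% Delete any run of spaces that sits between a sign and a digit.
% The lookahead (?=\d) leaves the +++++++ placeholder and the space
% inside the timestamp untouched.
cleaned = regexprep(raw, '([+-]) +(?=\d)', '$1');
```

The cleaned text can then be passed to textscan in place of datain.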

Finally, if you absolutely have to read the fields as text and then convert them to numeric values, try str2doubleq, available on the MATLAB File Exchange. It is much faster than str2double or str2num.
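For reference, the text-then-convert route can be sketched with the built-in str2double, which needs no extra download. The cell array below is made-up sample data standing in for fields that were read with %s:

```matlab
% Hypothetical fields as they would come back from a %s read.
fields = {'+ 22.24', '+ 22.46', '+++++++', '+0.0005'};

% Drop the space after the sign so each entry is a plain signed number.
fields = strrep(fields, ' ', '');

% str2double accepts a cell array and returns NaN for anything that is
% not a valid number, so the +++++++ placeholder becomes NaN.
values = str2double(fields);
```

Unlike str2num, str2double does not evaluate its input as code, which makes it the safer choice for data read from a file.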

Upvotes: 1
