Reputation: 11
I'm currently having a problem with the MATLAB function textscan. I have a data file which looks like this:
1,2018/08/14 17:06:15, 0,+ 22.24,+ 22.46,+ 18.18,+0.0000,+0.0005,LLLLLLLLLL,LLLLLLLLLL,LLLL
or sometimes when a sensor isn't working properly it looks like this:
1,2018/07/11 17:02:53, 0,+ 23.88,+ 24.78,+ 23.65,+++++++,+ 23.94,+ 23.01,+ 24.33,LLLLLLLLLL,LLLLLLLLLL,LLLL
Since the data varies from file to file, I build a matching formatSpec from the header line. In the first case it looks like
formatSpec = '%*u %s %*u%f%f%f%f%f%*[^\n]'
and in the 2nd case like
formatSpec = '%*u %s %*u%f%f%f%f%f%f%f%*[^\n]'
I am using the textscan function like this:
textscan(fileID, formatSpec_data, data_rows, 'Delimiter', ',', 'TreatAsEmpty', {'+++++++'},'EmptyValue', NaN, 'ReturnOnError', 0 );
but it keeps throwing an error with the message
Error using textscan
Mismatch between file and format character vector.
Trouble reading 'Numeric' field from file (row number 1, field number 4) ==> + 23.88,+ 24.78,+ 23.65,+++++++,+ 23.94,+ 23.01,+ 24.33,LLLLLLLLLL,LLLLLLLLLL,LLLL\n
Error in data_logger (line 31)
dataArray = textscan(fileID, formatSpec_data, data_rows, 'Delimiter', delimiter, 'HeaderLines' ,startRow, 'TreatAsEmpty', {'+++++++'},'EmptyValue', NaN, 'ReturnOnError', 0 );
When I deactivate 'ReturnOnError', textscan reads only the first row, and everything except the date/time string is empty. I also tried using textscan without TreatAsEmpty and/or EmptyValue, but I get the same result. I really don't understand why textscan has trouble reading e.g. ,+ 22.24 as a float. When I specify formatSpec to read all the data as strings it works, but then I have to use str2num afterwards, which I'd rather not do.
I'm thankful for any help and look forward to understanding this behaviour.
Upvotes: 1
Views: 249
Reputation: 378
Short answer: MATLAB doesn't like the space between the + and the number in those fields. I think the simplest solution may be to tell MATLAB to ignore the + by treating it as white space. Add the arguments 'WhiteSpace','+' when you call textscan, like this:
textscan(fileID, formatSpec_data, data_rows, 'Delimiter', ',', 'EmptyValue', NaN, 'ReturnOnError', 0 , 'WhiteSpace', '+');
Note that I also removed the 'TreatAsEmpty' argument, because once every + is treated as white space, a field made up only of + characters is empty anyway.
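To try the 'WhiteSpace' approach without touching the real file, textscan can also be run on a character vector. This is a minimal sketch, assuming the second sample row from the question and its seven-float formatSpec; the variable names sampleRow, C, timestamp and vals are only illustrative.

```matlab
% One sample row from the question, held in memory instead of a file.
sampleRow = '1,2018/07/11 17:02:53, 0,+ 23.88,+ 24.78,+ 23.65,+++++++,+ 23.94,+ 23.01,+ 24.33,LLLLLLLLLL,LLLLLLLLLL,LLLL';
formatSpec = '%*u %s %*u%f%f%f%f%f%f%f%*[^\n]';

% Treat '+' as white space so that '+ 23.88' is read as 23.88 and the
% all-plus field becomes an empty field, which EmptyValue maps to NaN.
C = textscan(sampleRow, formatSpec, 'Delimiter', ',', ...
    'WhiteSpace', '+', 'EmptyValue', NaN, 'ReturnOnError', 0);

timestamp = C{1}{1};   % date/time text from the %s field
vals = [C{2:end}];     % numeric readings gathered into one row
```

Checking vals with isnan shows which sensor fields were read as missing.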
Another option would be to pre-parse the file and remove the space between the + and the number. You could read the file using fileread, do a replacement using strrep or regexprep, then run textscan on the result.
datain = fileread('mydatafile.csv');
datain = strrep(datain,'+ ','+');
textscan(datain, formatSpec_data, data_rows, 'Delimiter', ',', 'TreatAsEmpty', {'+++++++'},'EmptyValue', NaN, 'ReturnOnError', 0 );
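If the gap between the + and the number is not always exactly one space, the regexprep variant of the same idea may be more forgiving than strrep. The sketch below assumes the second sample row from the question in place of the fileread result, and the pattern '\+[ \t]+' is my own choice, not something from the original code.

```matlab
% Stand-in for the text returned by fileread (one row from the question).
raw = '1,2018/07/11 17:02:53, 0,+ 23.88,+ 24.78,+ 23.65,+++++++,+ 23.94,+ 23.01,+ 24.33,LLLLLLLLLL,LLLLLLLLLL,LLLL';
formatSpec = '%*u %s %*u%f%f%f%f%f%f%f%*[^\n]';

% Collapse '+' followed by any run of spaces or tabs into a bare '+'.
% The '+++++++' marker is followed by a comma, so it is left unchanged.
cleaned = regexprep(raw, '\+[ \t]+', '+');

% With the spaces gone, '+23.88' parses as a float, and TreatAsEmpty
% turns the '+++++++' marker into NaN.
C = textscan(cleaned, formatSpec, 'Delimiter', ',', ...
    'TreatAsEmpty', {'+++++++'}, 'EmptyValue', NaN, 'ReturnOnError', 0);

vals = [C{2:end}];     % numeric readings gathered into one row
```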
Finally, if you are stuck with reading the fields as text and then converting them to numeric values, try str2doubleq, available on the MATLAB File Exchange. It is much faster than str2double or str2num.
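For the read-as-text route, here is a rough sketch of the conversion step. It uses the built-in str2double rather than str2doubleq, since the latter has to be downloaded separately, and the cell array fields is a hypothetical stand-in for what textscan returns when the columns are read with %s.

```matlab
% Hypothetical text fields, as they appear in the sample rows.
fields = {'+ 23.88', '+ 24.78', '+++++++', '+0.0005'};

% Strip the embedded spaces first; str2double then returns NaN for
% anything that is not a valid number, such as the '+++++++' marker.
vals = str2double(strrep(fields, ' ', ''));
```

Unlike str2num, str2double does not evaluate its input, so it is the safer of the two built-ins for file data.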
Upvotes: 1