Paul

Reputation: 99

Skip reading strings in MATLAB

Is there an easy command in MATLAB that prevents a program from crashing when it reads characters?

I use xlsread to read a 20-by-400 matrix of data. The first row and the first column are dropped because they contain headers, so that:

data = xlsread ('C:\file.xls') 

results in data of size 19-by-399.

My problem is that some cells have missing data, written as 'missing', and in some data sets the headers reappear in the middle of the sheet.

Is there a way to skip these characters without the program crashing, so that I don't have to open the file in Excel and delete those fields by hand?

Thanks


Sorry for the late update. Here is the code I am using:

[a,patha]=uigetfile({'*.csv'},'Select the file' ,'c:\Data\2010'); 
file1=[patha a]; 

%# get a file ID 
fid = fopen(file1,'rt'); 
newf= textscan(fid, ['%s' repmat('%f',1,27)], 'HeaderLines', 1, 'Delimiter', ','); 
fclose(fid) ;

%//Make time a datenum of the first column
time = datenum(newf{1} );

%//Find the difference in minutes from each row
timeDiff = round(diff(datenum(time)*(24*60)));

%//the rest of the data
newf = cell2mat(newf(2:28));

The error I get is:

??? Error using ==> cat
CAT arguments dimensions are not consistent.

Error in ==> cell2mat at 81
            m{n} = cat(2,c{n,:});

Error in ==> testprogram at 31
pwr = cell2mat(newf(2:28));

It is caused by the characters in the file I selected. It disappears when I delete them manually.

Upvotes: 0

Views: 2942

Answers (2)

Jonas

Reputation: 74930

textscan stops reading as soon as a field doesn't match the format it expects. The columns after that point come back empty, so the columns have different lengths and concatenating them fails.

textscan('bla,5.4,missing,3,3,3.4','%s%f%f%f%f%f','Delimiter',',')

ans = 

    {1x1 cell}    [5.4000]    [0x1 double]    [0x1 double]    [0x1 double]    [0x1 double]

However, you can use the 'TreatAsEmpty' option to treat 'missing' as an empty field (i.e. it is replaced by NaN):

textscan('bla,5.4,missing,3,3,3.4','%s%f%f%f%f%f','Delimiter',',','TreatAsEmpty','missing')

ans = 

    {1x1 cell}    [5.4000]    [NaN]    [3]    [3]    [3.4000]

This allows you to run cell2mat without problems.
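Applied to a file read, this might look like the sketch below. It is only an illustration: the sample file, its three numeric columns, and the names cols, numData and hasMissing are made up here, and the format string would need 27 numeric columns for the file in the question.

```matlab
% Write a small sample CSV (hypothetical data: one text column, three numeric columns)
fname = [tempname '.csv'];
fid = fopen(fname, 'wt');
fprintf(fid, 'id,a,b,c\n');
fprintf(fid, 'r1,1.5,missing,3\n');
fprintf(fid, 'r2,missing,2,4.5\n');
fclose(fid);

% Read it back, treating the literal text 'missing' as an empty numeric field
fid = fopen(fname, 'rt');
cols = textscan(fid, ['%s' repmat('%f', 1, 3)], ...
    'HeaderLines', 1, 'Delimiter', ',', 'TreatAsEmpty', 'missing');
fclose(fid);
delete(fname);

% Every numeric column now has the same length, so they concatenate cleanly
numData = cell2mat(cols(2:end));

% Flag rows that contain at least one missing value
hasMissing = any(isnan(numData), 2);
```

Note that this only covers the 'missing' entries. Header lines that reappear in the middle of the file would still need separate handling.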

Upvotes: 3

gnovice

Reputation: 125854

I don't know specifically what problem you are having that is causing your program to crash, because you haven't told us how you are trying to process the data you get from XLSREAD. However, maybe this will help...

You can have XLSREAD return numeric, text, and raw data from the file in the following way:

[numData,txtData,rawData] = xlsread('C:\file.xls');

For this calling syntax:

  • The variable numData will contain only numeric values from the spreadsheet. Any cells containing non-numeric data are set to NaN.
  • The variable txtData will contain only text data from the spreadsheet. Any cells containing numeric data are set to an empty string ('').
  • The variable rawData will contain all the raw unprocessed cell content from the spreadsheet, both numeric and text.

Perhaps you can use these different forms for the data to help you deal with your additional character fields. I'm guessing that part of your problem may result from the fact that the numeric data you are processing may have NaN values in it (in places where there was text in the file) and your processing steps aren't taking this into account.
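As a rough sketch of one way to use the raw output: the cell array below is a made-up stand-in for rawData (so the example runs without a spreadsheet), and the variable names body, isNum and headerRows are illustrative only. It keeps numeric cells, turns text cells such as 'missing' into NaN, and drops rows that contain no numbers at all, on the assumption that those are repeated headers.

```matlab
% Stand-in for the third output of xlsread (hypothetical contents)
rawData = {'id', 'a',  'b'; ...
           'r1', 1.5,  'missing'; ...
           'id', 'a',  'b'; ...          % header reappearing mid-sheet
           'r2', 2,    4};

% Skip the first row and first column, which hold the headers
body = rawData(2:end, 2:end);

% Mark the cells that hold a numeric scalar
isNum = cellfun(@(x) isnumeric(x) && isscalar(x), body);

% Start from NaN everywhere, then copy the numeric cells across
numData = nan(size(body));
numData(isNum) = [body{isNum}];

% Rows without a single numeric cell are assumed to be repeated headers
headerRows = ~any(isNum, 2);
numData(headerRows, :) = [];
```

A data row in which every value is 'missing' would also be dropped by this rule, so it may need adjusting for such files. Any later processing still has to tolerate the remaining NaN values, for example by masking them with isnan.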

Upvotes: 2
