user962808

Reputation: 115

Reading data using textscan in MATLAB

I need to read the following data from a text file in MATLAB. This is just a sample; the actual file contains data from 1218 stations (USH stands for US, and the number next to it is the station ID), with each column holding a monthly value. The 2s that follow the values are flags, which I don't want in the final data. How should I proceed?

USH00011084 1890 -9999    -9999    -9999    -9999    -9999    -9999    -9999    -9999    -9999    -9999    -9999      170  2 -9999   
USH00011084 1891   550  2   561  2   575  2   165  2   275  2   670  2   425  2   172  2   200  2     0  2   930  2   525  2  5048   
USH00011084 1892  1245  2   440  2   256  2   160  2   380  2   780  2  1226  2  1355  2   300  2   100  2   165  2 -9999    -9999   
USH00011084 1893   535  2   608  2   465  2   380  2   730  2   345  2   425  2   645  2   345  2   635  2   487  2   487  2  6087   
USH00011084 1894   240  2  1649  2   788  2   225  2   465  2   265  2 -9999    -9999    -9999       30  2 -9999       75  2 -9999   
USH00011084 1895 -9999      150  2   400  2   400  2   400  2   400  2   300  2 -9999    -9999      200  2 -9999      300  2 -9999   
USH00011084 1896   440  2  1340  2   590  2 -9999      320  2  1100  2   935  2   185  2   140  2   500  2   340  2   225  2 -9999   
USH00011084 1897   245  2  1230  2   475  2   435  2    50  2    25  4   735  2   980  2     0  2     0  2 -9999    -9999    -9999   
USH00011084 1900 -9999    -9999      731  2   704  2   225  2  1296  2   783  2   391  2   611  2   648  2   192  2   607  2 -9999   
USH00011084 1926 -9999      553     1031      356      503      678      846a    1161     1369      348      324      354    -9999   
USH00011084 1927    99      920      501       89      225      731      444      703      437      316      560      937     5963   
USH00011084 1928   111a     804      730     1409      351      779      535      607    -9999    -9999    -9999    -9999    -9999   

Upvotes: 0

Views: 190

Answers (2)

siliconwafer

Reputation: 732

% open the file
fid = fopen('C:\temp\foo.txt');
if fid == -1   % fopen returns -1 on failure, so ~fid would never trigger
    error('Unable to open file');
end

% define column formats
colFormats{1} = '%11s';
colFormats{2} = '%4s';
colFormats{3} = '%5s';
colFormats{4} = '%1s';
colFormats{5} = '%5s';
colFormats{6} = '%1s';
colFormats{7} = '%5s';
colFormats{8} = '%1s';
colFormats{9} = '%5s';
colFormats{10} = '%1s';
colFormats{11} = '%5s';
colFormats{12} = '%1s';
colFormats{13} = '%5s';
colFormats{14} = '%1s';
colFormats{15} = '%5s';
colFormats{16} = '%1s';
colFormats{17} = '%5s';
colFormats{18} = '%1s';
colFormats{19} = '%5s';
colFormats{20} = '%1s';
colFormats{21} = '%5s';
colFormats{22} = '%1s';
colFormats{23} = '%5s';
colFormats{24} = '%1s';
colFormats{25} = '%5s';
colFormats{26} = '%1s';
colFormats{27} = '%5s';

% read data
data = textscan(fid, cell2mat(colFormats),'MultipleDelimsAsOne',true);

% close file
fclose(fid);

% extract numbers from col 3...

cellfun(@(x) str2double(x(~isletter(x))), data{3}, 'uniformoutput', false)

ans =

[-9999]
[  550]
[ 1245]
[  535]
[  240]
[-9999]
[  440]
[  245]
[-9999]
[-9999]
[   99]
[  111]

Upvotes: 1

Buck Thorn

Reputation: 5073

It looks like your data has fixed column widths, so you might get away with something like this:

data = fileread('data.txt')';
data = strvcat(strsplit(data,char(10),1));
data(:,25:9:end) = [];
data(:,23:8:end) = [];

stations = strtrim(mat2cell(data(:,1:12), ones(size(data,1),1),12))
data = str2num(data(:,13:end))

This won't win any awards for elegance, but it will get the job done consistently as long as the columns always have the same width, and it should work even if the number of columns changes. stations is a cell column array and data is of type double. If you need to keep different parts, you can tinker with which column ranges are deleted.
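To make the fixed-width idea more concrete, here is a minimal sketch that slices a single row by character position. It assumes the layout inferred from the sample: an 11-character station ID, a space, a 4-digit year, then repeated 9-character fields made of 6 value characters followed by 3 flag characters. The names rowText, fields and values are illustrative only, and the row is shortened to two fields.

```matlab
% One shortened sample row: ID, year, then two 9-character fields.
rowText = 'USH00011084 1891   550  2   561  2';

station = rowText(1:11);               % station ID
year    = str2double(rowText(13:16));  % 4-digit year

% Everything after the year is a sequence of 9-character fields.
body    = rowText(17:end);
nFields = floor(numel(body) / 9);

% Put one field per row, then keep only the first 6 characters (the value)
% and drop the last 3 (the flags).
fields = reshape(body(1:9*nFields), 9, nFields).';
values = str2double(cellstr(fields(:, 1:6))).';

disp(station), disp(year), disp(values)
```

Because the flags are dropped by position rather than by content, letter flags such as the a in 846a are removed in the same way as the numeric ones.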

For your example, data is the following matrix:

data =

   1890  -9999  -9999  -9999  -9999  -9999  -9999  -9999  -9999  -9999  -9999  -9999    170  -9999
   1891    550    561    575    165    275    670    425    172    200      0    930    525   5048
   1892   1245    440    256    160    380    780   1226   1355    300    100    165  -9999  -9999
   1893    535    608    465    380    730    345    425    645    345    635    487    487   6087
   1894    240   1649    788    225    465    265  -9999  -9999  -9999     30  -9999     75  -9999
   1895  -9999    150    400    400    400    400    300  -9999  -9999    200  -9999    300  -9999
   1896    440   1340    590  -9999    320   1100    935    185    140    500    340    225  -9999
   1897    245   1230    475    435     50     25    735    980      0      0  -9999  -9999  -9999
   1900  -9999  -9999    731    704    225   1296    783    391    611    648    192    607  -9999
   1926  -9999    553   1031    356    503    678    846   1161   1369    348    324    354  -9999
   1927     99    920    501     89    225    731    444    703    437    316    560    937   5963
   1928    111    804    730   1409    351    779    535    607  -9999  -9999  -9999  -9999  -9999
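In the sample, the last column of complete rows (1891 and 1893) equals the sum of the twelve values before it, so it appears to be an annual total rather than a monthly value. Assuming that reading is right, the matrix could be split into years, monthly values and totals as sketched below. Here result is a hypothetical two-row copy of the matrix above, and the other names are illustrative.

```matlab
% Two complete rows copied from the matrix above (year, 12 values, total).
result = [1891 550  561 575 165 275 670 425 172 200   0 930 525 5048;
          1893 535  608 465 380 730 345 425 645 345 635 487 487 6087];

years   = result(:, 1);      % first column: year
monthly = result(:, 2:13);   % columns 2..13: the twelve monthly values
total   = result(:, 14);     % last column: assumed to be the annual total

% For these two rows the monthly values add up to the last column.
disp(isequal(sum(monthly, 2), total))
```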

Upvotes: 1
