Reputation: 115
I need to read the following data from a text file in MATLAB. This is just a sample; the actual file contains data from 1218 stations (USH denotes a US station, and the number following it is the station ID), with each column holding a monthly value. Each monthly value is followed by a flag column (mostly 2 in this sample), which I don't want in the final data. How should I proceed?
USH00011084 1890 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 170 2 -9999
USH00011084 1891 550 2 561 2 575 2 165 2 275 2 670 2 425 2 172 2 200 2 0 2 930 2 525 2 5048
USH00011084 1892 1245 2 440 2 256 2 160 2 380 2 780 2 1226 2 1355 2 300 2 100 2 165 2 -9999 -9999
USH00011084 1893 535 2 608 2 465 2 380 2 730 2 345 2 425 2 645 2 345 2 635 2 487 2 487 2 6087
USH00011084 1894 240 2 1649 2 788 2 225 2 465 2 265 2 -9999 -9999 -9999 30 2 -9999 75 2 -9999
USH00011084 1895 -9999 150 2 400 2 400 2 400 2 400 2 300 2 -9999 -9999 200 2 -9999 300 2 -9999
USH00011084 1896 440 2 1340 2 590 2 -9999 320 2 1100 2 935 2 185 2 140 2 500 2 340 2 225 2 -9999
USH00011084 1897 245 2 1230 2 475 2 435 2 50 2 25 4 735 2 980 2 0 2 0 2 -9999 -9999 -9999
USH00011084 1900 -9999 -9999 731 2 704 2 225 2 1296 2 783 2 391 2 611 2 648 2 192 2 607 2 -9999
USH00011084 1926 -9999 553 1031 356 503 678 846a 1161 1369 348 324 354 -9999
USH00011084 1927 99 920 501 89 225 731 444 703 437 316 560 937 5963
USH00011084 1928 111a 804 730 1409 351 779 535 607 -9999 -9999 -9999 -9999 -9999
Upvotes: 0
Views: 190
Reputation: 732
% open the file
fid = fopen('C:\temp\foo.txt');
if fid == -1 % fopen returns -1 on failure, so ~fid would never catch it
error('Unable to open file');
end
% define column formats
% station ID, year, then 12 pairs of value and flag fields, then one last value field
colFormats = [{'%11s', '%4s'}, repmat({'%5s', '%1s'}, 1, 12), {'%5s'}];
% read data
data = textscan(fid, cell2mat(colFormats),'MultipleDelimsAsOne',true);
% close file
fclose(fid);
% extract numbers from col 3...
cellfun(@(x) str2double(x(~isletter(x))), data{3}, 'uniformoutput', false)
ans =
[-9999]
[ 550]
[ 1245]
[ 535]
[ 240]
[-9999]
[ 440]
[ 245]
[-9999]
[-9999]
[ 99]
[ 111]
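The same extraction can be applied to every value column rather than just column 3. The following is a minimal sketch, not part of the original answer: it uses a small hypothetical stand-in for the cell array returned by textscan (two rows, two months), and it assumes that -9999 marks missing data, which the question does not state explicitly.

```matlab
% Hypothetical stand-in for the textscan output: entries 1 and 2 hold the
% station ID and year, then value columns and flag columns alternate.
data = { {'USH00011084'; 'USH00011084'}, {'1926'; '1928'}, ...
         {'-9999'; '111a'}, {'2'; '2'}, {'553'; '804'} };

valueCols = 3:2:numel(data);   % every other column from 3 on, skipping the flags
values = zeros(numel(data{1}), numel(valueCols));
for k = 1:numel(valueCols)
    % drop letter flags such as the 'a' in '111a', then convert to double
    values(:, k) = cellfun(@(x) str2double(x(~isletter(x))), data{valueCols(k)});
end
values(values == -9999) = NaN; % assumption: -9999 means "no data"
years = str2double(data{2});   % numeric year for each row
```

With the real textscan result, valueCols would cover columns 3, 5, ..., 27, so values would have one row per station-year and 13 columns.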
Upvotes: 1
Reputation: 5073
It looks like your data has fixed column widths, so you might get away with something like this:
data = fileread('data.txt')';
data = strvcat(strsplit(data,char(10),1));
data(:,25:9:end) = [];
data(:,23:8:end) = [];
stations = strtrim(mat2cell(data(:,1:12), ones(size(data,1),1),12))
data = str2num(data(:,13:end))
This won't win awards for elegance, but it will get the job done consistently if the columns are always the same width, and it should work even if the number of columns changes. stations is a cell column array and data is of type double. If you need to pick something else, you can tinker with the choice of sections that are deleted.
For your example, data is the following matrix:
data =
1890 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 170 -9999
1891 550 561 575 165 275 670 425 172 200 0 930 525 5048
1892 1245 440 256 160 380 780 1226 1355 300 100 165 -9999 -9999
1893 535 608 465 380 730 345 425 645 345 635 487 487 6087
1894 240 1649 788 225 465 265 -9999 -9999 -9999 30 -9999 75 -9999
1895 -9999 150 400 400 400 400 300 -9999 -9999 200 -9999 300 -9999
1896 440 1340 590 -9999 320 1100 935 185 140 500 340 225 -9999
1897 245 1230 475 435 50 25 735 980 0 0 -9999 -9999 -9999
1900 -9999 -9999 731 704 225 1296 783 391 611 648 192 607 -9999
1926 -9999 553 1031 356 503 678 846 1161 1369 348 324 354 -9999
1927 99 920 501 89 225 731 444 703 437 316 560 937 5963
1928 111 804 730 1409 351 779 535 607 -9999 -9999 -9999 -9999 -9999
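To see how the strided column deletion works, here is a toy sketch on a made-up fixed-width layout: a 4-character year followed by blocks of a 5-character value and a 1-character flag. The widths, the variable names rows and nums, and the sample values are all assumptions for illustration; the start index and stride would need to be adjusted to the real file.

```matlab
% Hypothetical fixed-width rows: year (4 chars), then value (5) + flag (1) blocks
rows = ['1891  550a  561b'; ...
        '1892 1245c  440d'];

% The flags sit at columns 10 and 16, i.e. every 6th column starting at 10,
% so a single strided index removes them from all rows at once
rows(:, 10:6:end) = [];

% What remains is purely numeric text, so str2num can parse the char matrix
nums = str2num(rows);
```

Here nums ends up as a 2-by-3 double matrix with the year in the first column and the two values after it.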
Upvotes: 1