Reputation: 727
I would like to read a file that stores its data in an odd format into MATLAB as a matrix.
The file data.txt has the data written as:
04001400 HI 34.50 118.27 19480701 08 LST
0 0 0 0 0 0 0 0 0 0 0 0
MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS
04001400 HI 34.50 118.27 19480801 08 LST
0 0 0 0 0 0 0 0 0 0 0 0
MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS
04001400 HI 34.50 118.27 19480901 08 LST
0 0 0 0 0 0 0 0 0 0 0 0
MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS
The first number is the station number, HI is a case, the third and fourth numbers are the latitude and longitude, the next number is the date (year, month, day), and the number after that (08) is the time zone, followed by LST, which denotes the time frame. The 24 values that follow (in the example above, the 0's and the MIS entries) are values for a particular region and time. I am trying to store the contents of the file as a matrix of dimension [n x 31], where 31 is the number of columns (7 header fields plus 24 values) and n is the number of rows, one per three-line record in the file:
04001400 HI 34.50 118.27 19480701 08 LST 0 0 0 0 0 0 0 0 0 0 0 0 MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS
04001400 HI 34.50 118.27 19480801 08 LST 0 0 0 0 0 0 0 0 0 0 0 0 MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS
04001400 HI 34.50 118.27 19480901 08 LST 0 0 0 0 0 0 0 0 0 0 0 0 MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS
I have tried coding it this way based on the function textscan():
fid = fopen('data.txt', 'rt');
data = textscan(fid, '%d %s %f %f %s %d %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s');
fclose(fid);
But it does not produce the layout I described above. Is there a way to do that? Thank you for your help.
Note: I want to read the date (19480701) as a string so I can later format it as a date type.
Upvotes: 0
Views: 247
Reputation: 16
Actually, the code you wrote should be pretty close to working. You just have to tell textscan() to consider newline characters as a normal whitespace character as well.
Try removing the spaces from your format string and using the 'whitespace' parameter to add '\n':
data=textscan(fid, '%d%s%f%f%s%d%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s','whitespace',' \b\t\n');
Also, make sure to double check that your input file does not end with any empty lines. This seems to mess up textscan().
Hope this helps!
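One way to apply the same idea (treating line breaks as ordinary whitespace) is to read every field as a string with a single %s and regroup the tokens afterwards. This is only a sketch: it assumes every record has exactly 31 whitespace-separated fields, and the sample text and the variable names raw, tokens, records, lat, lon and dateStrings are illustrative, not part of the original code. With a real file, pass the fid from fopen() to textscan() instead of raw.

```matlab
% Sample text standing in for the file contents (two records of three lines).
raw = sprintf('%s\n', ...
    '04001400 HI 34.50 118.27 19480701 08 LST', ...
    '0 0 0 0 0 0 0 0 0 0 0 0', ...
    'MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS', ...
    '04001400 HI 34.50 118.27 19480801 08 LST', ...
    '0 0 0 0 0 0 0 0 0 0 0 0', ...
    'MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS');

% A lone %s reads whitespace-separated tokens, regardless of line breaks.
C = textscan(raw, '%s');
tokens = C{1};                          % column cell array of strings

nFields = 31;                           % 7 header fields + 24 values
assert(mod(numel(tokens), nFields) == 0, ...
    'Token count is not a multiple of %d.', nFields);

% reshape fills column by column, so build 31-by-n and transpose to n-by-31.
records = reshape(tokens, nFields, []).';

% Convert individual columns as needed; the date stays a string.
lat = str2double(records(:, 3));
lon = str2double(records(:, 4));
dateStrings = records(:, 5);
```

Because the columns mix text and numbers, records is a cell array rather than a numeric matrix; numeric columns are converted one at a time with str2double().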
Upvotes: 0
Reputation: 12486
Your example code doesn't work because textscan() assumes that every row in the file will have the same format. That is, to use textscan(), every line must have the same number of columns, containing the same type of data.
I would instead treat the data as something like a comma-separated value format. Split each row into a list of tokens separated by the space delimiter, as suggested by Rob Henson here:
>> string = 'Need-to-split-this-string'
string =
Need-to-split-this-string
>> parts = strread(string,'%s','delimiter','-')
parts =
'Need'
'to'
'split'
'this'
'string'
You will need to loop through the file reading all the rows. Your data appears to come in stanzas of three lines, so process the data three rows at a time.
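A minimal sketch of such a loop is shown below. It assumes each stanza is exactly three lines holding 31 space-separated fields in total, and it uses strsplit() (available in newer MATLAB releases) in place of strread(). The names fname, rows, line1, line2 and line3 are illustrative. The sketch writes a small temporary sample file so that it runs on its own.

```matlab
% Write two sample records to a temporary file so the sketch runs on its own.
% With real data, skip this part and set fname = 'data.txt'.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '%s\n', ...
    '04001400 HI 34.50 118.27 19480701 08 LST', ...
    '0 0 0 0 0 0 0 0 0 0 0 0', ...
    'MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS', ...
    '04001400 HI 34.50 118.27 19480801 08 LST', ...
    '0 0 0 0 0 0 0 0 0 0 0 0', ...
    'MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS');
fclose(fid);

fid = fopen(fname, 'rt');
assert(fid ~= -1, 'Could not open %s', fname);

rows = cell(0, 31);                  % one 1-by-31 row of strings per stanza
line1 = fgetl(fid);                  % fgetl returns -1 (not char) at end of file
while ischar(line1)
    if isempty(strtrim(line1))       % skip blank lines between or after stanzas
        line1 = fgetl(fid);
        continue
    end
    line2 = fgetl(fid);              % second line of the stanza
    line3 = fgetl(fid);              % third line of the stanza
    if ~ischar(line2) || ~ischar(line3)
        fclose(fid);
        error('File ended in the middle of a three-line stanza.');
    end
    % Join the three lines and split on runs of whitespace.
    tokens = strsplit(strtrim([line1 ' ' line2 ' ' line3]));
    if numel(tokens) ~= 31
        fclose(fid);
        error('Expected 31 fields but found %d.', numel(tokens));
    end
    rows(end+1, :) = tokens;         %#ok<AGROW>
    line1 = fgetl(fid);
end
fclose(fid);
delete(fname);                       % only needed for the temporary sample file
```

The result rows is an n-by-31 cell array of strings, so the date in column 5 can be converted later, for example with datenum(rows(:, 5), 'yyyymmdd').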
Alternatively, you can do a pre-processing pass over the text file to put each stanza of three lines on a single line instead. (Just delete the first and second out of every three newlines.) Then use a regular expression to replace the runs of whitespace with commas. You will end up with something like:
04001400,HI,34.50,118.27,19480701,08,LST,0,0,0,0,0,0,0,0,0,0,0,0,MIS,MIS,MIS,MIS,MIS,MIS,MIS,MIS,MIS,MIS,MIS,MIS
04001400,HI,34.50,118.27,19480801,08,LST,0,0,0,0,0,0,0,0,0,0,0,0,MIS,MIS,MIS,MIS,MIS,MIS,MIS,MIS,MIS,MIS,MIS,MIS
04001400,HI,34.50,118.27,19480901,08,LST,0,0,0,0,0,0,0,0,0,0,0,0,MIS,MIS,MIS,MIS,MIS,MIS,MIS,MIS,MIS,MIS,MIS,MIS
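This is then in a format that textscan() can read with a comma delimiter. Note that csvread() only handles purely numeric files, so it would not work on the HI, LST, and MIS fields.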
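The pre-processing pass described above could be sketched in MATLAB itself as follows. This is an illustration under assumptions: the file splits cleanly into groups of three lines, and the names raw, textLines, stanzas, csvLines, csvText and fmt are made up for the example. The sample text stands in for the file; with real data, raw = fileread('data.txt') would supply it.

```matlab
% Sample text standing in for the file contents (two stanzas of three lines).
raw = sprintf('%s\n', ...
    '04001400 HI 34.50 118.27 19480701 08 LST', ...
    '0 0 0 0 0 0 0 0 0 0 0 0', ...
    'MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS', ...
    '04001400 HI 34.50 118.27 19480801 08 LST', ...
    '0 0 0 0 0 0 0 0 0 0 0 0', ...
    'MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS MIS');

% Split into lines (tolerating Windows line endings) and group them in threes.
textLines = regexp(strtrim(raw), '\r?\n', 'split');
assert(mod(numel(textLines), 3) == 0, 'Line count is not a multiple of 3.');
stanzas = reshape(textLines, 3, []);         % each column is one stanza

csvLines = cell(1, size(stanzas, 2));
for k = 1:size(stanzas, 2)
    % Joining with a space removes the two newlines inside the stanza.
    oneLine = strjoin(stanzas(:, k).', ' ');
    % Replace every run of whitespace with a single comma.
    csvLines{k} = regexprep(strtrim(oneLine), '\s+', ',');
end
csvText = strjoin(csvLines, sprintf('\n'));

% Every record now has the same layout, so one format string fits all rows.
% The field types follow the question; the date (field 5) is read as a string.
fmt = ['%d%s%f%f%s%d' repmat('%s', 1, 25)];
C = textscan(csvText, fmt, 'Delimiter', ',');
```

Here C is a 1-by-31 cell array with one entry per column, for example C{3} for the latitudes and C{5} for the date strings.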
Upvotes: 1