sfactor

Reputation: 13062

Parsing a file in C

I need to parse a file and do some processing on it. The file is a text file and the data is variable-length, of the form "PP1004181350D001002003..........". A timestamp follows each PP, so 1004181350 is 2010-04-18 13:50. Each D introduces a data point made of three separate values, each three digits long, so D001002003 has three coordinates: 001, 002 and 003.

Now I need to parse this data from a file, storing each timestamp in an array and the corresponding data in arrays that have as many rows as there are data points and three columns, one for each coordinate. The end result might look like

TimeStamp[1] = "135000", low[1] = "001", medium[1] = "002", high[1] = "003"
TimeStamp[2] = "135015", low[2] = "010", medium[2] = "012", high[2] = "013"
TimeStamp[3] = "135030", low[3] = "051", medium[3] = "052", high[3] = "043"
....

The question is how do I go about doing this in C? How do I go through this string looking for these patterns and storing the values in the corresponding arrays for further processing?

Note: the seconds value in the timestamp is added by us, since it is known that each data point comes 15 seconds after the previous one.

Upvotes: 2

Views: 2796

Answers (4)

TheCodeArtist

Reputation: 22487

Simply Parsing? Here it is!!


UPDATE: Check out KillianDS's answer. That's even better!!

  • [STEP 1] Search for \n (or CR+LF) to find the end of the current line.

  • [STEP 2] Starting from the first character on the line, you know the number of characters each data field occupies. Read that many characters from the file.

Repeat for all fields.
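
The two steps above could be sketched roughly as follows, assuming each line has already been read into a string (for example with fgets) and that it starts with "PP" plus ten timestamp digits, followed by repeated "D" plus nine digits. The constants and the helper names copy_field and read_point are made up for this illustration and are not from the question.

```c
#include <string.h>

/* Hypothetical layout constants, derived from the sample record in the question. */
#define TS_OFFSET   2   /* skip the leading "PP" */
#define TS_WIDTH    10  /* YYMMDDhhmm */
#define FIELD_WIDTH 3   /* each coordinate is three digits */
#define POINT_WIDTH 10  /* 'D' plus three 3-digit fields */

/* Copy `width` characters starting at `offset` into dst and terminate it.
   Returns 0 on success, -1 if the line is too short.
   dst must have room for width + 1 bytes. */
static int copy_field(const char *line, size_t offset, size_t width, char *dst)
{
    if (strlen(line) < offset + width)
        return -1;
    memcpy(dst, line + offset, width);
    dst[width] = '\0';
    return 0;
}

/* Extract the n-th (0-based) data point of a line as three strings.
   Returns 0 on success, -1 if the line holds no such point. */
static int read_point(const char *line, size_t n,
                      char low[4], char medium[4], char high[4])
{
    size_t base = TS_OFFSET + TS_WIDTH + n * POINT_WIDTH;

    if (strlen(line) < base + POINT_WIDTH || line[base] != 'D')
        return -1;
    copy_field(line, base + 1, FIELD_WIDTH, low);
    copy_field(line, base + 1 + FIELD_WIDTH, FIELD_WIDTH, medium);
    copy_field(line, base + 1 + 2 * FIELD_WIDTH, FIELD_WIDTH, high);
    return 0;
}
```

Calling copy_field(line, TS_OFFSET, TS_WIDTH, ts) gives the timestamp, and calling read_point with n = 0, 1, 2, ... until it returns -1 walks through all data points on the line. The sketch does not check that the copied characters are digits.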

Upvotes: 0

AndersK

Reputation: 36082

I wouldn't recommend using fscanf directly on the input data because it is very sensitive to the input: if one byte is wrong and the data suddenly doesn't match the format specifier, you could in the worst case get a memory overwrite.

It is better either to read using fgetc and parse the data as it comes in, or to read into a buffer (fread) and process it from there.
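
As a rough sketch of the buffer approach, assuming the data has already been read into memory (for example with fread) and is delimited by a begin and an end pointer, each field can be validated character by character. The helper names take_digits, take_header and take_point are invented for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Read exactly `count` decimal digits from [p, end) into *value.
   Returns a pointer just past the digits, or NULL if the input
   is too short or contains a non-digit. */
static const char *take_digits(const char *p, const char *end, int count, long *value)
{
    long v = 0;

    if (p == NULL || end - p < count)
        return NULL;
    for (int i = 0; i < count; i++) {
        if (!isdigit((unsigned char)p[i]))
            return NULL;
        v = v * 10 + (p[i] - '0');
    }
    *value = v;
    return p + count;
}

/* Parse "PP" + 6 date digits + 4 time digits.
   Returns the position after the header, or NULL on error. */
static const char *take_header(const char *p, const char *end, long *date, long *hhmm)
{
    if (p == NULL || end - p < 2 || p[0] != 'P' || p[1] != 'P')
        return NULL;
    p = take_digits(p + 2, end, 6, date);
    return take_digits(p, end, 4, hhmm);
}

/* Parse one 'D' + 3 + 3 + 3 digit data point.
   Returns the position after it, or NULL on error. */
static const char *take_point(const char *p, const char *end,
                              long *low, long *medium, long *high)
{
    if (p == NULL || p >= end || *p != 'D')
        return NULL;
    p = take_digits(p + 1, end, 3, low);
    p = take_digits(p, end, 3, medium);
    return take_digits(p, end, 3, high);
}
```

Because every helper checks the remaining length and each character before using it, a corrupted byte makes the parse stop with NULL instead of writing anywhere unexpected.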

Upvotes: 0

KillianDS

Reputation: 17176

edit: updated to follow your specs.

While your file seems to be variable length, your data fields aren't, so you could use fscanf and do something like this:

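/* The leading space in " PP" skips any newline left over from the previous record. */
while (fscanf(file, " PP%*6d%4d", &timestamp) == 1)
{
    for (int i = 0; fscanf(file, "D%3d%3d%3d", &low, &medium, &high) == 3; i++)
    {
        /* hhmmss; assumes at most four points (0, 15, 30, 45 s) per timestamp */
        int hhmmss = timestamp * 100 + i * 15;
        //Do something with the variables (e.g. convert to string, store in an array, ...)
    }
}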

Note that this reads the data into integers (timestamp, low, medium and high are ints). A string version looks like this (timestamp, low, medium and high are arrays of char buffers):

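/* Tens and units digits of the seconds: 00, 15, 30, 45 */
const char first[] = {'0', '1', '3', '4'};
const char second[] = {'0', '5'};
char hhmm[4];  /* %4c stores four characters and adds no '\0' */

/* timestamp is char[][7]; low, medium and high are char[][4]; all zero-initialized.
   Needs <string.h> for memcpy; bounds checking on n is omitted. */
for (int n = 0; fscanf(file, " PP%*6d%4c", hhmm) == 1; )
{
    for (int i = 0; fscanf(file, "D%3c%3c%3c", low[n], medium[n], high[n]) == 3; i++, n++)
    {
        memcpy(timestamp[n], hhmm, 4);
        timestamp[n][4] = first[i % 4];
        timestamp[n][5] = second[i % 2];
    }
}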

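edit: some more explanation about the format string. With %*6d I mean: look for 6 digits and discard them (* means: do not store the result in a variable). %4d and %4c both consume four characters in this context (as 1 digit is one char), and in both cases the result is saved in the corresponding variable. The difference is that %4d converts the digits to an int, while %4c copies the raw characters into a char array without adding a terminating '\0'.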
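
To try the format strings in isolation, here is a small sketch that uses sscanf on a string instead of fscanf on a file. The function names scan_header and scan_point are made up for this example, and the %n directive is an addition that is not in the code above: it stores the number of characters consumed so far and does not count towards the return value.

```c
#include <stdio.h>

/* Parse "PP" + 6 discarded digits + 4 digits (hhmm) from s.
   On success returns 1 and sets *consumed to the number of characters read. */
static int scan_header(const char *s, int *hhmm, int *consumed)
{
    *consumed = 0;
    /* %*6d: read up to six digits and discard them.
       %4d:  read up to four digits and store them as an int. */
    return sscanf(s, "PP%*6d%4d%n", hhmm, consumed) == 1;
}

/* Parse 'D' + three 3-digit integers from s.
   On success returns 1 and sets *consumed to the number of characters read. */
static int scan_point(const char *s, int *low, int *medium, int *high, int *consumed)
{
    *consumed = 0;
    return sscanf(s, "D%3d%3d%3d%n", low, medium, high, consumed) == 3;
}
```

Comparing the return value against the number of expected conversions (1 and 3 here) is what tells a successful match apart from a partial match or end of input.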

Upvotes: 2

splicer

Reputation: 5394

As long as the fields in your patterns aren't variable length, you could simply use fscanf (or sscanf on a line you have already read). If you need something more complex, you might try PCRE, but for this case I think the scanf family will suffice.

Upvotes: 0
