Reputation: 871
I am trying to read formatted content from a file. To do so, I read line by line using fgets() and sscanf().
The content of the file is supposed to be a table. One row would look like the following example:
456 2 39 chained_words 62.5 // comment with more than one word
To read it, I use:
fgets(temp,MAXLINELENGTH,file);
sscanf(temp,"%d %d %d %s %f // %s",&num1,&num2,&num3,word,&num4,comment);
It works fine for the first five elements plus the first word after the //, but the problem is that I need to store the whole comment in the comment char * variable. I have tried multiple solutions proposed in other posts, such as specifying a format that excludes certain characters, but nothing worked.
I'd appreciate any hint to solve the problem!
Upvotes: 0
Views: 2938
Reputation: 84561
Following on from your comment, if you were to add another number after the existing comment, that would complicate things a bit. The reason is that, with comment containing multiple words, you have no discrete end to search for.
However, C rarely lets you down. Whenever you need to parse data from a line or buffer, you look at the format of your data and ask "What am I going to use as my reference for the beginning or end of what I need?" Here, with nothing in comment to mark where it ends, we will need to use the end of the buffer as a reference and work backwards.
Doing so, we are going to assume that the value is the last thing on the line before the newline (no tabs or spaces follow). We could loop backwards until we find the last non-whitespace character to validate this, but for our purposes here we simply make the assumption.
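If you would rather not rely on that assumption, the backwards loop mentioned above can be sketched as follows. This is only a minimal sketch: rtrim is a hypothetical helper name, not something from the standard library, and it treats anything isspace() accepts as trailing whitespace.

```c
#include <ctype.h>
#include <string.h>

/* Strip trailing whitespace (spaces, tabs, '\n', '\r') from s in place.
 * Returns the new length of s.
 * NOTE: rtrim is a hypothetical helper name used for illustration. */
size_t rtrim (char *s)
{
    size_t len = strlen (s);

    /* walk backwards from the end, overwriting whitespace with nul */
    while (len > 0 && isspace ((unsigned char) s[len - 1]))
        s[--len] = '\0';

    return len;
}
```

Calling rtrim (line) right after fgets would remove the newline and any stray spaces or tabs, so the last character left in the buffer belongs to the final value.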
For this problem, we will break parsing the line into two parts. We can reliably read everything up to the comment with our original sscanf call. So we will consider everything in the first part of the line (up to and including the float) part 1, and everything after the comment characters // part 2. You read/parse part 1 as usual:
sscanf (line, "%d %d %d %s %f", &d1, &d2, &d3, word, &f1);
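The call above ignores the return value of sscanf, which is the number of successful conversions. A hedged sketch of the same read with that check added might look like the following; parse_part1 is a hypothetical name, and the 127 field width assumes the word buffer holds MAXS (128) bytes as in the full example.

```c
#include <stdio.h>

#define MAXS 128

/* Parse the five fixed fields at the start of line.
 * Returns 1 if all five conversions succeeded, 0 otherwise.
 * NOTE: parse_part1 is a hypothetical helper; word must hold MAXS bytes. */
int parse_part1 (const char *line, int *d1, int *d2, int *d3,
                 char *word, float *f1)
{
    /* %127s stores at most MAXS - 1 characters plus the terminating nul */
    return sscanf (line, "%d %d %d %127s %f", d1, d2, d3, word, f1) == 5;
}
```

A line for which this returns 0 does not match the expected layout and can be skipped or reported instead of being used with stale values.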
To search for a specific character in a line, we have manual char-by-char comparison (we always have that), and we have the strchr and strrchr functions in string.h, which search a string for the first (strchr) or last (strrchr) occurrence of the given character. Both functions return a pointer to that character within the string, or NULL if the character is not found.
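To make the difference between the two concrete, here is a small sketch that converts the returned pointer into an offset from the start of the string. char_offset is a hypothetical wrapper written only for this illustration.

```c
#include <string.h>

/* Return the offset of the first (last == 0) or last (last != 0)
 * occurrence of c in s, or -1 if c does not occur in s.
 * NOTE: char_offset is a hypothetical helper used for illustration. */
long char_offset (const char *s, int c, int last)
{
    const char *p = last ? strrchr (s, c) : strchr (s, c);

    return p ? (long) (p - s) : -1;   /* NULL means "not found" */
}
```

For the string "62.5 // note", the first '/' is at offset 5 and the last at offset 6, so the pointer from strrchr sits on the second slash, directly before the comment text.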
Working backwards from the end of our line, if we find /, we now have a pointer (the address within the string) to the last '/' before the beginning of the comment. (This assumes the comment itself contains no '/'.) We now read the entire remainder of the line into comment (value and all) using our pointer.
p = strrchr (line, '/'); /* find last '/' in line */
sscanf (p, "/ %[^\n]%*c", comment); /* read comment and value */
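For the original question (storing the whole comment when nothing follows it), the same %[^\n] scanset is enough on its own. The sketch below is a variation rather than the code above: it looks for the first "//" with strstr instead of the last '/' with strrchr, so a comment may itself contain a '/'. read_comment is a hypothetical name, and the 127 width assumes a 128-byte comment buffer.

```c
#include <stdio.h>
#include <string.h>

#define MAXS 128

/* Copy everything after the first "//" in line (up to, but not including,
 * the newline) into comment. Returns 1 on success, 0 if there is no "//"
 * or nothing but whitespace follows it.
 * NOTE: read_comment is a hypothetical helper; comment must hold MAXS bytes. */
int read_comment (const char *line, char *comment)
{
    const char *p = strstr (line, "//");

    if (!p)
        return 0;

    /* the space in the format skips whitespace after the slashes;
     * %127[^\n] then reads at most MAXS - 1 characters up to the newline */
    return sscanf (p, "// %127[^\n]", comment) == 1;
}
```

The trade-off is that this version would stop at a "//" appearing earlier in the line, for example inside the chained word, so which search to use depends on what your data can contain.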
Now we are working in only comment (instead of line). We know if we work backwards from the end of comment looking for a space ' ', we will be in a position to read our last value. After we read the last value, since we have our pointer pointing to the address right before the value, we know we can null-terminate comment at the pointer to finish our parse.
p = strrchr (comment, ' '); /* find last space in comment */
sscanf (p, " %d", &d4); /* read last value into d4 */
*p = 0; /* null-terminate comment */
(note: you can check for and remove any trailing spaces in comment if needed, but for our purposes that is omitted)
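The three lines above trust that a space exists and that a number follows it. A more defensive sketch of the same step is shown below. It swaps sscanf for strtol so that a last word such as "12abc" is rejected; split_trailing_int is a hypothetical name, the value is returned as a long rather than an int, and overflow is not checked.

```c
#include <stdlib.h>
#include <string.h>

/* Split a trailing integer off the end of comment, in place.
 * On success: stores the integer in *value, truncates comment at the
 * last space and returns 1. Otherwise returns 0 and leaves comment
 * unchanged (no space found, or the last word is not an integer).
 * NOTE: split_trailing_int is a hypothetical helper; overflow is not checked. */
int split_trailing_int (char *comment, long *value)
{
    char *p = strrchr (comment, ' ');   /* last space in comment */
    char *end = NULL;
    long v;

    if (!p)
        return 0;

    v = strtol (p + 1, &end, 10);
    if (end == p + 1 || *end != '\0')   /* no digits, or trailing junk */
        return 0;

    *value = v;
    *p = '\0';                          /* nul-terminate comment */
    return 1;
}
```

Because comment is left untouched on failure, a row whose comment simply has no trailing number still keeps its full text.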
Putting it all together you would have something that looked like this:
Quick Example
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXS 128

int main (int argc, char **argv) {

    if (argc < 2 ) { /* check for at least 1 argument */
        fprintf (stderr, "error: insufficient input, usage: %s filename\n",
                 argv[0]);
        return 1;
    }

    char line[MAXS] = {0};
    char word[MAXS] = {0};
    char comment[MAXS] = {0};
    char *p = NULL;
    size_t idx = 0;
    int d1, d2, d3, d4;
    float f1 = 0.0;
    FILE *fp = NULL;

    d1 = d2 = d3 = d4 = 0;

    if (!(fp = fopen (argv[1], "r"))) { /* open/validate file */
        fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
        return 1;
    }

    while (fgets (line, MAXS, fp) != NULL) /* read each line in file */
    {
        /* read buffer through first float (127 = MAXS - 1 limits word);
           skip any line that does not match the expected layout */
        if (sscanf (line, "%d %d %d %127s %f", &d1, &d2, &d3, word, &f1) != 5)
            continue;

        p = strrchr (line, '/'); /* find last '/' in line */
        if (!p || sscanf (p, "/ %127[^\n]%*c", comment) != 1)
            continue;            /* no comment and value found */

        p = strrchr (comment, ' '); /* find last space in comment */
        if (!p || sscanf (p, " %d", &d4) != 1)
            continue;            /* no trailing value found */
        *p = 0; /* null-terminate comment */

        printf ("\nline : %zu\n\n %s\n", idx, line);
        printf (" d1 : %d\n d2 : %d\n d3 : %d\n d4 : %d\n f1 : %.2f\n",
                d1, d2, d3, d4, f1);
        printf (" chained : %s\n comment : %s\n", word, comment);
        idx++;
    }

    fclose (fp);

    return 0;
}
Input
$ cat dat/strwcmt.txt
456 2 39 chained_words 62.5 // comment with more than one word 227
457 2 42 more_chained_w 64.5 // another comment 228
458 3 45 s_n_a_f_u 66.5 // this is still another comment 229
Output
$ ./bin/str_rd_mixed dat/strwcmt.txt
line : 0
456 2 39 chained_words 62.5 // comment with more than one word 227
d1 : 456
d2 : 2
d3 : 39
d4 : 227
f1 : 62.50
chained : chained_words
comment : comment with more than one word
line : 1
457 2 42 more_chained_w 64.5 // another comment 228
d1 : 457
d2 : 2
d3 : 42
d4 : 228
f1 : 64.50
chained : more_chained_w
comment : another comment
line : 2
458 3 45 s_n_a_f_u 66.5 // this is still another comment 229
d1 : 458
d2 : 3
d3 : 45
d4 : 229
f1 : 66.50
chained : s_n_a_f_u
comment : this is still another comment
Note: There is no limit to the different ways to approach this. This is simply one approach. Another would be to tokenize the entire line into separate words, check whether each word begins with a digit (and contains a '.' for a float) and then simply convert all numbers and concatenate all non-number words as needed. It's up to you. The bigger your toolbox, the more ways you will see to approach it.
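As a rough illustration of that tokenizing alternative, the sketch below splits a line on whitespace with strtok, stores every token that begins with a digit and converts completely with strtod as a number, and joins all other tokens with single spaces. Everything here is an assumption made for the example: split_tokens is a hypothetical name, all numbers (integer or float) are stored as double, negative numbers are not recognized, and the "//" marker is kept in the text so the caller can still tell where the comment starts.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Tokenize line on whitespace (line is modified by strtok).
 * Tokens that begin with a digit and convert completely with strtod are
 * stored in nums[] (at most maxnum of them); every other token is
 * appended to words, separated by single spaces (truncated if it does
 * not fit in wordsz bytes). Returns the count of numbers stored.
 * NOTE: split_tokens is a hypothetical helper used for illustration. */
size_t split_tokens (char *line, double *nums, size_t maxnum,
                     char *words, size_t wordsz)
{
    size_t n = 0, used = 0;
    char *tok;

    if (wordsz == 0)
        return 0;
    words[0] = '\0';

    for (tok = strtok (line, " \t\n"); tok; tok = strtok (NULL, " \t\n")) {
        char *end = tok;
        double v = 0.0;

        if (isdigit ((unsigned char) tok[0]))
            v = strtod (tok, &end);

        if (end != tok && *end == '\0') {      /* whole token is a number */
            if (n < maxnum)
                nums[n++] = v;
        }
        else {                                 /* non-number word */
            int len = snprintf (words + used, wordsz - used, "%s%s",
                                used ? " " : "", tok);
            if (len > 0) {
                used += (size_t) len;
                if (used >= wordsz)            /* output was truncated */
                    used = wordsz - 1;
            }
        }
    }

    return n;
}
```

For the first sample row this yields the numbers 456, 2, 39, 62.5 and 227, and the text "chained_words // comment with more than one word". The positions of the numbers then tell you which field is which.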
Upvotes: 3