Reputation: 361
I have a file that I'm trying to read and fill variables with. The file consists of this:
0\ttake a nap\n
1\tstudy heap-based priority queue\n
101\treview trees for Midterm 2\n
3\tdo assignment 7\n
This may be hard to read, but you can see that there is an integer to begin with, followed by a tab, a string after that, followed by a newline. I need to take the integer and put that into a variable, detect the tab, and put the string following the tab into a variable, detect the newline, take the two variables and create a node with the information, and then start over again on the next line. After hours of scouring the internet, this is what I've come up with:
char activity[SIZE];
char position[SIZE];
char line[100];
FILE *infile;
char *inname = "todo.txt";
int i = 0;
infile = fopen(inname, "r");
if (!infile) {
printf("Couldn't open %s for reading\n", inname);
return 0;
}
while(i < 100 && fgets(line, sizeof(line), infile) != NULL){
sscanf(line, "%s\t%s", position, activity);
printf("%s\n", position);
printf("%s\n", activity);
i++;
}
When running this test code on the txt file above, I get this as a result:
0
take
1
study
101
review
3
do
So, it looks to me like it's getting the first number alright (as a string) and putting it into the variable, seeing the tab, and grabbing the first sequence after the tab and stopping there after putting it into the other variable. How do I rectify this situation?
Upvotes: 4
Views: 35803
Reputation: 497
The following worked quite well for my use case. I wanted to read the first two fields of a TAB-delimited file into string vars, then read the remainder of each line into a final string var.
Here's the code:
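#include <stdlib.h>
#include <stdio.h>
int main(void)
{
    /* %[ stores into char arrays: 254 characters plus the terminating NUL */
    char string1[255];
    char string2[255];
    char string3[255];
    /* read from stdin until a record fails to parse or input ends;
       testing fscanf's result (rather than feof) also stops cleanly
       on empty or malformed input */
    while (fscanf(stdin, "%254[^\t]\t%254[^\t]\t%254[^\n]\n",
                  string1, string2, string3) == 3)
    {
        printf("%s\t%s\t%s\n", string1, string2, string3);
    }
    return 0;
}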
I'm reading from STDIN because I used this program to create a command line filter.
Explanation of the fscanf codes:
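%[^\t] - any run of characters that are not a TAB
\t - consumes the TAB (scanf treats any whitespace character in the format as "skip zero or more whitespace characters")
%[^\n] - any run of characters that are not a NEWLINE
\n - consumes the NEWLINE (and, for the same reason, any whitespace that follows it)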
Thus, my fscanf is reading all characters up to the first TAB (including spaces but not the TAB itself) and placing the string into var string1, all characters up to the second TAB (including spaces but not the TAB itself) and placing the string into var string2, then reading all remaining characters of the record (TABs, spaces, everything except the NEWLINE) up to the NEWLINE into string3.
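To see the three conversions in isolation, here is a minimal sketch that applies the same format to a string with sscanf instead of reading stdin. The helper name split_record and the 255-byte buffer size are assumptions made for illustration; they are not part of the program above.

```c
#include <stdio.h>

/* Hypothetical helper: splits one TAB-delimited record into three fields.
   Each output buffer must hold at least 255 bytes.
   Returns 1 if all three fields were filled, 0 otherwise. */
int split_record(const char *record, char *f1, char *f2, char *f3)
{
    /* %254[^\t] reads up to 254 non-TAB characters; the width keeps
       each field inside its 255-byte buffer. The third conversion
       stops at the NEWLINE, so it keeps any TABs left in the record. */
    int filled = sscanf(record, "%254[^\t]\t%254[^\t]\t%254[^\n]",
                        f1, f2, f3);
    return filled == 3;
}
```

One caveat with this format: because the \t directive skips any run of whitespace, a second or third field that begins with spaces loses them, and an empty field makes the scan fail.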
In my real program, I am doing specific processing on string1 and string2. My output is the result of that processing along with string3. In other words, my output is also TAB-delimited with the original contents of string3 unaltered.
If you have a TAB-delimited file with three or more fields, then the following (on Linux) should be true:
cat FILE | ABOVE_PROGRAM > OUT_FILE
diff FILE OUT_FILE # This should yield nothing (no differences)
Hopefully this will help others process TAB-delimited files.
Upvotes: 4
Reputation: 182639
You can try changing the sscanf call:
sscanf(line, "%s\t%[^\n]", position, activity);
The %s specifier stops when it encounters whitespace. That's why it only reads study instead of study heap-based priority queue. The %[^\n] specifier tells it: "read until newline". Another issue: you should test the value returned by sscanf to make sure it filled the required number of objects.
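As a rough sketch of that return-value check: sscanf returns the number of fields it stored, so a fully parsed line yields 2 here. The helper name parse_line and the buffer sizes (16 and 100 bytes) are assumptions, since the question does not show the value of SIZE.

```c
#include <stdio.h>

/* Hypothetical helper: parses "position<TAB>activity" from one line.
   position must hold at least 16 bytes, activity at least 100 bytes
   (both sizes are assumptions for this sketch).
   Returns 1 if both fields were filled, 0 otherwise. */
int parse_line(const char *line, char *position, char *activity)
{
    /* The widths (15 and 99) leave room for the terminating NUL. */
    int filled = sscanf(line, "%15s\t%99[^\n]", position, activity);

    /* sscanf returns the number of conversions stored, or EOF if the
       input ran out before the first one; anything but 2 is a bad line. */
    return filled == 2;
}
```

In the question's loop this would replace the bare sscanf call, printing the two strings only when parse_line returns 1.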
You could also read the first number as an integer, changing position to int and using %d instead of %s.
To make myself clear, what I was suggesting was:
int position;
sscanf(line, "%d\t%[^\n]", &position, activity);
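Putting it together with the node creation the question asks about, one possible sketch follows. The struct todo_node type, the function names make_node and read_todo, and the 100-byte and 128-byte buffer sizes are all assumptions, because the question does not show its node type. Lines longer than 127 characters would be split across fgets calls and are not handled here.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical node type; the question does not show its own. */
struct todo_node {
    int position;
    char activity[100];
    struct todo_node *next;
};

/* Builds a node from one "int<TAB>text" line.
   Returns NULL if the line does not match or memory runs out. */
struct todo_node *make_node(const char *line)
{
    struct todo_node *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    /* Both conversions must succeed for the line to count. */
    if (sscanf(line, "%d\t%99[^\n]", &node->position, node->activity) != 2) {
        free(node);
        return NULL;
    }
    node->next = NULL;
    return node;
}

/* Reads every line of in and returns the nodes as a list in file order.
   Lines that do not parse are skipped. */
struct todo_node *read_todo(FILE *in)
{
    char line[128];
    struct todo_node *head = NULL;
    struct todo_node **tail = &head;   /* where the next node gets linked */

    while (fgets(line, sizeof line, in) != NULL) {
        struct todo_node *node = make_node(line);
        if (node == NULL)
            continue;
        *tail = node;
        tail = &node->next;
    }
    return head;
}
```

The caller would open todo.txt with fopen as in the question, pass the FILE pointer to read_todo, and free each node when done.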
Upvotes: 5