Reputation: 343
How can I read a text file like this one:
Acqua Naturale 200
Coca Cola 100
Bibite 300
and store the names (Acqua Naturale, Coca Cola) in a string and their integer values in an int variable, using sscanf()?
The example code is this:
struct Test
{
char name[16];
int id;
};
char * buffer = malloc(sizeof(struct Test));
while(fgets(buffer, sizeof(struct Test), filep))
{
if(sscanf(buffer, "%s %d", p.name, &p.id) == 2)
{
//do something with data
}
}
Upvotes: 0
Views: 429
Reputation: 343
I found this solution using strtok(), strcpy(), strcat(), atoi() and isdigit(). I am using a linked list to store the data, so I think it is a specific solution. Ignore the parameter of the function Load() and the function CreateNewNodeOfList().
void Load(HeadNode *pp) // ignore parameter
{
FILE *f;
struct Test p = {0}; // zero-initialize so the first strcat() appends to an empty name
char * buffer;
char * token;
char name[32] = "";
if(!(f = fopen(PATH, "r")))
{
perror("Errore");
exit(-1);
}
buffer = malloc(sizeof(struct Test));
while(fgets(buffer, sizeof(struct Test), f))
{
for(token = strtok(buffer, " "); token != NULL; token = strtok(NULL, " "))
{
if(isdigit(token[0]))
{
p.id = atoi(token);
}
else
{
strcat(p.name, token);
strcat(p.name, " ");
}
}
CreateNewNodeOfList(p, pp); //ignore this function
strcpy(p.name, "");
}
free(buffer);
fclose(f);
}
Upvotes: 0
Reputation: 23208
Two quick observations:
strtok() is a better choice than sscanf() for this particular task.
Unless there is only one record (data line) in the input file, an array of struct (as opposed to a single instance) is needed to contain the data.
Rationale:
The more defined and predictable the syntax of a source file, the less complex it is to parse. Your file, as described, has predictable contents. With limited variability in syntax, tokenizing each record with the strtok() function is a good choice.
For what you are doing, the only variability in your file content would be the number of lines and the number of alpha strings preceding the numeric string at the end. The rest assumes space-separated sub-strings within each line, with only the last having numeric content. So one approach that would accommodate this type of file might allocate an array of struct at run time, based on the number of lines to process, and use the strtok() function to read through the elements, storing each based on the type of string it is (either alpha or numeric).
Example approach:
file: x.txt contains the following:
Acqua Naturale 200
Coca Cola 100
Bibite 300
Nesbits Gold 400
Fanta Iced Orange 500
Coca Cola Cherry Cream 600
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char filename[] = {".\\x.txt"};
typedef struct {
char name[200]; // add plenty of space
int id;
}TEST;
void PopulateTest(TEST *t, char *file);//populate struct with content of file.
int GetLines(char *name);//get line count
int main(int argc, char *argv[])
{
int lineCount = GetLines(filename);//get lines in file
int i;
TEST *test;//to create a variable number of instances of TEST
test = calloc(lineCount, sizeof(TEST));
if(test)
{
PopulateTest(test, filename);
}
for(i=0;i<lineCount;i++)
{
;//do something with results
}
free(test);
return 0;
}
void PopulateTest(TEST *t, char *file)
{
int num = 0;
int i = 0;
char *tok = NULL;
char line[200] = {0};
char accum[200] = {0};
FILE *fp = fopen(file, "r");//open the file passed in, not the global filename
if(fp)
{
while(fgets(line, sizeof(line), fp))
{
accum[0] = '\0';//reset the accumulated name so lines do not run together
tok = strtok(line, " ");
while(tok)// this loop accommodates a variable number of fields within each line
{
if(isdigit(tok[0]))//test for sub-string content
{
num = atoi(tok);
}
else //read string segments and reconstruct string,
{
strcat(accum, tok);
strcat(accum, " ");
}
tok = strtok(NULL, " ");
}
strcpy(t[i].name, accum);//populate struct element members with parsed data.
t[i].id = num;
i++;
}
fclose(fp);
}
return;
}
int GetLines(char *name)
{
int count = 0;
char line[200] = {0};
FILE *fp = fopen(name, "r");
if(fp)
{
while(fgets(line, sizeof(line), fp))
{
count++;
}
fclose(fp);
}
return count;
}
Upvotes: 1
Reputation: 47923
Before trying to write code to read this file, you should think a little more about how the file is defined -- precisely how it's defined.
Informally, the definition of the file is "the first column is a string possibly containing whitespace, and the second column is an integer". But what separates the columns?
If the columns are separated by whitespace, and if the first column can contain whitespace, then the first column isn't really the first column, it's potentially multiple columns. That is, the line
Coca Cola 100
really contains three columns.
So if we want to go down this road, we have to try to differentiate between a second column that's an integer, and a first column that (though it might contain whitespace) does not look like an integer.
But if we go down that road, we have two pretty significant problems:
It's hard to code. It's probably impossible to code satisfactorily using scanf or sscanf alone.
It's still ambiguous. What if Coca Cola comes out with a new product "Coca Cola 2020"? Then we'll have a line like
Coca Cola 2020 50
So my bottom line is, if it was me, I wouldn't even try to write code to parse this file format. I would come up with a cleaner, less ambiguous file format, perhaps
Coca Cola, 100
or
"Coca Cola",100
or
Coca Cola|100
and then write some clean and simple code to parse that. (I probably still wouldn't use scanf, though; I'd probably use something more like strtok. See also this chapter in my C Programming notes.)
Addendum: the other road to potentially go down is to count columns from the right-hand edge. In this case, you could write code to, in effect, say that the product name is in columns 1 to N-1, and the count is column N. This can work as long as there's at most one "column" containing whitespace.
Upvotes: 1
Reputation: 153338
To separate "Acqua Naturale 200" into "Acqua Naturale" and 200 is a problem of looking for an integer at the end of the line.
There are various approaches. Perhaps look for the last space separator.
OP nicely reads a line and then attempts to parse it, which is better than calling scanf() directly.
Note that OP's buffer size is too small. Consider "abcdefghijklmno -2000000000\n", valid input which needs 15 + 1 + 11 + 1 + 1 bytes. That is certainly more than sizeof(struct Test), as the text of an int may need more space than the binary-encoded int (e.g. 2, 4 or 8 bytes).
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
...
FILE *filep;
struct Test p;
// p.name sp int \n \0
#define LINE_SIZE (sizeof p.name + 1 + 11 + 1 + 1)
char buffer[LINE_SIZE *2]; // No need to be stingy with temp buffer space, go for x2
while(fgets(buffer, sizeof buffer, filep)) {
char *last_space = strrchr(buffer, ' ');
if (last_space == NULL || (size_t)(last_space - buffer) >= sizeof p.name ||
sscanf(last_space, "%d", &p.id) != 1) { // != 1 also rejects EOF, returned when nothing follows the space
fprintf(stderr, "Bad input '%s'\n", buffer);
break;
}
memcpy(p.name, buffer, last_space - buffer);
p.name[last_space - buffer] = '\0';
// Do something with `p`
}
More robust code would use strtol() and look for extra junk after the number, as in "xxx 122zzz". Excessively long lines should be detected too.
Upvotes: 1
Reputation: 213378
There are some misconceptions here.
Don't size the line buffer with sizeof a struct, since that would assume binary format, not text, which is longer. Instead of malloc, just allocate buffer as "large enough", for example char buffer[200];.
Read each line into buffer, then parse through it.
sscanf is a rather blunt tool for this unless you are certain of the buffer format. Instead you can search for the last space ' ' in the string and take anything before it as the name (make sure it is at most 15 characters + 1 null terminator), and pass everything after the space to strtol, which converts it to a long that you can then store in an int.
Upvotes: 0