J.Doe123

Reputation: 15

How to read variables separated by punctuation from a text file in C

I was wondering how to read data from a text file that has its data separated by a comma.

For example, each line of the text file has the form: (Integer,Name Surname,IntegerArray)

This is one line: 123456789,Jonh Brown,123456434-4325234-235234-42345234

typedef struct BST
{
    long long ID;
    char *name;
    char *surname;
    long long *friendsID;
    struct BST *left;
    struct BST *right;
}BST;

reading from file:

do
{
    c = fscanf(fp,"%I64d%*c%s%s",&ID,name,surname);
    if (c != EOF)
        root=insertNewUser(root,ID,name,surname);
} while (c != EOF);

newNodeTemp->ID = ID;
/* +1 leaves room for the terminating nul-character that strcpy writes */
newNodeTemp->name = (char*)calloc(strlen(name) + 1,sizeof(char));
newNodeTemp->surname = (char*)calloc(strlen(surname) + 1,sizeof(char));
strcpy(newNodeTemp->name,name);
strcpy(newNodeTemp->surname,surname);

But I do not know how to store the last field as an array in BST->friendsID without the '-' (hyphen) separators.

this part: 123456434-4325234-235234-42345234

I defined the friends array as a pointer because we don't know its size in advance. I will use dynamic memory allocation...

Upvotes: 0

Views: 226

Answers (1)

David C. Rankin

Reputation: 84579

If I understand your question, you have a CSV with user info, where each user's friends are encoded as a hyphen-separated list of friend IDs in the third field of the line. In that case you can use the re-entrant version of strtok (named strtok_r) to separate the comma-separated fields, and then call strtok within your outer loop to separate the hyphen-separated values.

Note, strtok_r requires an additional "save pointer" as its third argument so that you can resume calls to that instance of strtok_r after having made intermediate calls to a different instance of strtok or strtok_r for other separation purposes.
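To make the role of the save pointer concrete, here is a minimal sketch that nests two strtok_r instances, each with its own save pointer. The helper name count_pieces is hypothetical, and the feature-test macro assumes a POSIX system; the line is tokenized in place, so its contents are modified.

```c
#define _POSIX_C_SOURCE 200809L   /* assumption: POSIX system, exposes strtok_r */
#include <string.h>

/* Hypothetical helper: splits line on commas, then splits each field on
 * hyphens, and returns the total number of hyphen-separated pieces.
 * The line is modified in place. */
size_t count_pieces (char *line)
{
    size_t total = 0;
    char *outer_sp = NULL,      /* save pointer for the comma loop  */
         *inner_sp = NULL;      /* save pointer for the hyphen loop */

    for (char *field = strtok_r (line, ",\n", &outer_sp); field;
               field = strtok_r (NULL, ",\n", &outer_sp)) {
        /* the inner calls only touch inner_sp, so the outer loop can
         * resume from outer_sp once the field has been split */
        for (char *piece = strtok_r (field, "-", &inner_sp); piece;
                   piece = strtok_r (NULL, "-", &inner_sp))
            total++;
    }
    return total;
}
```

Because each loop keeps its position in its own save pointer, neither one disturbs the other, which is exactly what plain strtok cannot offer when both loops use it.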

Given your line of:

"123456789,Jonh Brown,123456434-4325234-235234-42345234"

where 123456789 is the ID, Jonh Brown is the name, and 123456434-4325234-235234-42345234 is a list of friend IDs, you could parse the line and the individual friends just by keeping a field count and calling a separate instance of strtok within your tokenization loop to separate the friends on hyphens.

A short example would be:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define FRIENDS 3

int main (void) {

    char line[] = "123456789,Jonh Brown,123456434-4325234-235234-42345234",
        *delims = ",\n",    /* strtok_r delimiters for tokens */
        *p      = line,     /* pointer to line */
        *sp     = p;        /* save pointer for reentrant strtok_r */
    int count = 0;

    /* must use reentrant version of strtok to nest strtok use for friends */
    p = strtok_r (line, delims, &sp);  /* 1st call uses name of buffer */
    count++;

    while (p) {             /* outer tokenization loop */
        printf ("token: '%s'\n", p);
        if (count == FRIENDS) {
            char *pf = calloc (strlen (p) + 1, 1),  /* pointer to friends   */
                *delim2 = "-\n",                    /* delims for friends   */
                *f;                                 /* pointer preserves pf */
            if (!pf) {
                perror ("calloc-pf");
                exit (EXIT_FAILURE);
            }
            strcpy (pf, p);                 /* copy friends token to pf */
            f = pf;                         /* set f, to pf, to preserve pf */
            f = strtok (f, delim2);         /* regular strtok OK for friends */
            if (f)
                printf ("friends:\n");
            while (f) {     /* friends tokenization loop */
                printf ("    %s\n", f);
                f = strtok (NULL, delim2);  /* subsequent calls use NULL */
            }
            free (pf);      /* free allocated memory at preserved address */
            count = 0;      /* reset count */
        }
        p = strtok_r (NULL, delims, &sp);  /* subsequent calls use NULL */
        count++;
    }

    return 0;
}

(note: since strtok writes nul-characters into the string it tokenizes and advances the pointer it returns, the example works on a copy of the friends token so the original token stays intact, and it preserves a pointer to the starting address of the allocated copy (pf) so that it can be freed after you are done separating friends)

(also note: if your system provides strdup, you can replace the calloc (strlen (p) + 1, 1) and strcpy (pf, p) calls with a single call, char *pf = strdup (p);. Since strdup allocates dynamically, you should still validate with if (!pf) after the call)

Example Use/Output

$ ./bin/strtok_csv
token: '123456789'
token: 'Jonh Brown'
token: '123456434-4325234-235234-42345234'
friends:
    123456434
    4325234
    235234
    42345234
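If you want to store the friends as numbers instead of printing them, one possible approach is to convert each hyphen-separated token with strtoll and grow a long long array with realloc as you go. The sketch below is only an illustration: the helper name parse_friends and the doubling growth strategy are my assumptions, not something from your code, and the token is modified in place by strtok.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts a hyphen-separated token such as
 * "123456434-4325234" into a newly allocated array of long long.
 * The number of IDs is stored in *n. Returns NULL if there are no IDs,
 * if a piece is not a valid number, or if allocation fails.
 * The caller frees the returned array. */
long long *parse_friends (char *token, size_t *n)
{
    long long *ids = NULL;
    size_t used = 0, avail = 0;

    *n = 0;
    for (char *f = strtok (token, "-\n"); f; f = strtok (NULL, "-\n")) {
        char *end;
        errno = 0;
        long long id = strtoll (f, &end, 10);
        if (end == f || *end != '\0' || errno == ERANGE) {
            free (ids);                 /* piece is not a valid number */
            return NULL;
        }
        if (used == avail) {            /* array full, grow by doubling */
            size_t newcap = avail ? avail * 2 : 4;
            long long *tmp = realloc (ids, newcap * sizeof *tmp);
            if (!tmp) {
                free (ids);
                return NULL;
            }
            ids = tmp;
            avail = newcap;
        }
        ids[used++] = id;
    }
    *n = used;
    return ids;
}
```

Inside the outer loop you would call it as newNodeTemp->friendsID = parse_friends (p, &nfriends); when count == FRIENDS. Since your struct has no field for the number of friends, you would also need to keep that count somewhere, for example in an extra member of the node.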

Look things over and let me know if you have further questions.

Upvotes: 1
