Evelyn

Reputation: 43

C: Read a file and store its lines in a data structure

So I have a .txt doc with the following information (it can be multiple lines, but in this case it is 3 lines)

Jane Smith    123 lala land    123-222-1231
Bob Fall    123 blue jay st    812-923-1111
Sally White    1 rose ave.    +1-231-2318

I want to create a 'read' function that reads the file, then a 'write' function that writes it into a data structure.

So far, I have this:

void read()
{
    FILE *file;
    file = fopen("fileName", "r");
    write(file);
}

void write(FILE *file)
 {

 }

I was wondering how I would store each line into a data structure since C does not support vectors. I want to be able to then create a print function where it can print:

  1 //line #1 info here
  2 //etc
  3 //etc

Upvotes: 0

Views: 9410

Answers (3)

David C. Rankin

Reputation: 84521

Since you are trying to parse strings that contain spaces, and the fields themselves (e.g. name and address) are separated by more spaces, you cannot simply read the line and split it with sscanf and %s. With scanf/sscanf a %s conversion stops at the first whitespace character; a field width only limits how many characters are stored, it does not carry the match past whitespace. That makes %s useless for fields of varying length that contain whitespace. E.g.:

Jane Smith    123 lala land    123-222-1231

Attempting to parse with %s reads Jane and no more. Unless you are guaranteed fixed-width columns, sscanf will not work in this case.
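As a quick check of that behavior, here is a minimal sketch showing that %s stores only the text up to the first whitespace, even when a field width is given. The name first_token is a hypothetical helper used only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: copy the first %s token of line into out
   (out must hold at least 32 bytes). Returns the sscanf match count. */
int first_token (const char *line, char *out)
{
    /* %31s stops at the first whitespace character; the width only
       caps how many characters are stored, it does not extend the match */
    return sscanf (line, "%31s", out);
}
```

Called on the first sample line, it returns 1 and leaves only Jane in out.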

Compounding the problem, not only do the strings contain spaces, but the delimiters are themselves runs of multiple spaces. So unfortunately, this is a situation where you must walk the string with pointers to parse it. How? Start with known information.

The only thing that makes this possible is the presumption that the phone number contains no whitespace. So using strrchr (or by setting a pointer at the end of the string and backing up) you can find the last space in the line, which sits just before the start of the phone number. Set an end pointer (ep) one character before this space, advance the original pointer by 1 and copy the phone number to the structure.

Starting at ep, work backwards until you find the first non-space character (the end of the address field) and set a null-terminating character immediately after it.
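Those two right-to-left steps (locate the phone number, then trim the spaces in front of it) can be sketched on their own. The name split_phone is hypothetical, and the sketch assumes the line has no trailing whitespace.

```c
#include <string.h>

/* Hypothetical helper: cut the last space-delimited field off line.
   Returns a pointer to that field (the phone number in the sample
   data) and terminates line right after the preceding field.
   Returns NULL if line contains no space. Modifies line in place. */
char *split_phone (char *line)
{
    char *sp = strrchr (line, ' ');         /* last space in the line   */
    if (!sp)
        return NULL;

    char *phone = sp + 1;                   /* field starts one past it */

    while (sp > line && *(sp - 1) == ' ')   /* back up over the run of  */
        sp--;                               /* delimiter spaces         */
    *sp = '\0';                             /* terminate after address  */

    return phone;
}
```

After the call, the original buffer holds the name and address only, and the returned pointer refers to the phone number inside the same buffer.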

The next known point is the beginning of the string. Start there and find the first double-space (the presumption being that the name, address and phone fields are all separated by at least 2 spaces). The first space of that pair marks the end of the name field, so set a null-terminating character there. (You can read/copy the name to the struct at this point by simply reading from the start of the string.)

Finally, work forward until you find the next non-space character. This is the start of the address. Copy the address to the struct and you are done. (Repeat the process for each line.)

Sometimes, when you have no sane delimiter, you have to fall back to simply stepping through the string with a pointer and processing it piecemeal. This is one of those cases. Look over the following and let me know if you have questions:
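The left-to-right half can be sketched in isolation too. This version uses strstr to find the first double space, which is a different technique from the hand-written pointer walk described above but lands on the same spot. The name split_name is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: terminate the name at the first double space
   and return a pointer to the start of the next field, or NULL if
   line contains no double space. Modifies line in place. */
char *split_name (char *line)
{
    char *gap = strstr (line, "  ");    /* first run of two spaces      */
    if (!gap)
        return NULL;

    *gap = '\0';                        /* line now holds only the name */
    gap++;
    while (*gap == ' ')                 /* skip the rest of the gap     */
        gap++;

    return gap;                         /* first char of the next field */
}
```

Returning NULL when no double space exists lets the caller skip a malformed line instead of producing an empty name.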

#define _POSIX_C_SOURCE 200809L     /* expose getline, strdup, ssize_t */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define EMAX 128

typedef struct entry {
    char name[32];
    char address[32];
    char phone[16];
} entry;

size_t readtxtfile (char *fn, entry *array);
void prn_entries (entry *array);

int main (int argc, char **argv) {

    /* validate number of arguments */
    if (argc < 2 ) {
        fprintf (stderr, "error: insufficient input, usage: %s <filename1>\n", argv[0]);
        return 1;
    }

    /* initialize all variables */
    size_t index = 0;
    entry contacts[EMAX] = {{{0}, {0}, {0}}};

    /* read file into an array of entries,
    number of entries, returned to index */
    index = readtxtfile (argv[1], contacts);

    /* simple print function */
    if (index > 0)
    {
        printf ("\nNumber of entries in contacts : %zu\n\n", index);
        prn_entries (contacts);
    }
    else
        fprintf (stderr, "error: no entries read from file '%s'.\n", argv[1]);

    return 0;
}

size_t readtxtfile (char *fn, entry *array)
{
    if (!fn) return 0;              /* validate filename provided       */

    char *ln = NULL;                /* NULL forces getline to allocate  */
    size_t n = 0;                   /* buffer size, managed by getline  */
    ssize_t nchr = 0;               /* number of chars actually read    */
    size_t idx = 0;                 /* counter for number of entries    */
    FILE *fp = NULL;                /* file pointer to open file fn     */

    /* open / validate file */
    if (!(fp = fopen (fn, "r"))) {
        fprintf (stderr, "%s() error: file open failed '%s'.\n", __func__, fn);
        return 0;
    }

    /* read each line from file */
    while ((nchr = getline (&ln, &n, fp)) != -1)
    {
        /* strip newline or carriage rtn    */
        while (nchr > 0 && (ln[nchr-1] == '\n' || ln[nchr-1] == '\r'))
            ln[--nchr] = 0;

        /* create a copy of ln to preserve start address */
        char *lcpy = strdup (ln);
        if (!lcpy) {
            fprintf (stderr, "%s() error: memory allocation failed.\n", __func__);
            continue;
        }

        char *p = strrchr (lcpy, ' ');                  /* find last space in line      */
        if (!p || p == lcpy) {                          /* no usable space: skip line   */
            free (lcpy);
            continue;
        }
        char *ep = p - 1;                               /* set end pointer 1 before     */

        p++;                                            /* advance to next char         */
        /* bounded, always null-terminated copy of p to phone */
        snprintf (array[idx].phone, sizeof array[idx].phone, "%s", p);

        while (ep > lcpy && *ep == ' ') ep--;           /* back up to last char of addr */
        *(++ep) = 0;                                    /* null-terminate right after   */

        p = lcpy;           /* start at beginning of string and find first double-space */
        while (*(p + 1) && !(*(p + 1) == ' ' && *p == ' ')) p++;

        *p = 0;                     /* null-terminate at first space    */

        while (*(++p) == ' ');      /* find first char in addr          */

        /* bounded copies: p to address, lcpy to name */
        snprintf (array[idx].address, sizeof array[idx].address, "%s", p);
        snprintf (array[idx].name, sizeof array[idx].name, "%s", lcpy);

        free (lcpy);                /* free memory allocated by strdup  */
        lcpy = NULL;                /* reset pointer NULL               */

        idx++;                      /* increment entry index            */
        if (idx == EMAX)            /* check if EMAX reached & return   */
        {
            fprintf (stderr, "%s() warning: maximum number of entries read\n", __func__);
            break;
        }
    }

    free (ln);                      /* free memory allocated by getline */
    fclose (fp);                    /* close the open file stream       */

    return idx;
}

/* print each entry until the first empty name or EMAX entries */
void prn_entries (entry *array)
{
    size_t n = 0;
    while (n < EMAX && array[n].name[0] != '\0')
    {
        printf (" (%2zu.)  %-32s %-32s %s\n", n, array[n].name, array[n].address, array[n].phone);
        n++;
    }
}

Output

$ ./bin/read_entries dat/nmaddph.txt

Number of entries in contacts : 3

 ( 0.)  Jane Smith                       123 lala land                    123-222-1231
 ( 1.)  Bob Fall                         123 blue jay st                  812-923-1111
 ( 2.)  Sally White                      1 rose ave.                      +1-231-2318

Note: when using getline, or any time you have allocated space for a string dynamically, keep a pointer to the original start of the block before you step through the string (e.g. by manually advancing the string variable), or work on a copy as done here. Functions like strtok also overwrite delimiters in place, so a copy additionally preserves the original line. getline allocates the memory for ln for you (if ln is originally NULL and n is 0), may realloc it on later calls, and leaves you responsible for freeing it. If you change ln so that it no longer holds the address getline gave you, then the next getline call or your final free invokes undefined behavior, typically a crash or heap corruption. Making a copy will save you a lot of headaches.

In the example above, a copy of the ln allocated by getline is made. Character pointers (p and ep) are assigned as needed so that lcpy keeps its starting address. If you had iterated over the string by advancing lcpy itself (e.g. lcpy++;) instead of using a second pointer, the original start address would be lost, and passing the advanced pointer to free is undefined behavior (often a segmentation fault or heap corruption).
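To make the second-pointer idea concrete, here is a minimal sketch, deliberately simplified: it walks a heap copy with a separate pointer and then frees the untouched start address. The function count_spaces is a hypothetical example, not part of the program above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: count the spaces in s by walking a heap copy.
   Returns -1 if the copy cannot be allocated. */
int count_spaces (const char *s)
{
    size_t len = strlen (s);
    char *copy = malloc (len + 1);      /* start address we must keep   */
    if (!copy)
        return -1;
    memcpy (copy, s, len + 1);          /* includes the terminator      */

    int count = 0;
    for (char *p = copy; *p; p++)       /* advance p, never copy itself */
        if (*p == ' ')
            count++;

    free (copy);                        /* still the address malloc     */
    return count;                       /* returned, so this is valid   */
}
```

Had the loop advanced copy instead of p, the final free would receive a pointer that malloc never returned.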

Upvotes: 1

whalesf

Reputation: 49

#include<stdio.h>
#include<string.h>
#define MAXLENGTH 200 
typedef struct node{
    char query[MAXLENGTH];
}DATA;

typedef struct stack{
    DATA data[MAXLENGTH];
    int top;
}Stack;

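void write(FILE *file,Stack* st)
{
    //Test fgets itself: feof() only turns true after a read has already failed,
    //so looping on !feof(file) stores one bogus entry at the end
    while(st->top + 1 < MAXLENGTH &&
          fgets(st->data[st->top + 1].query,MAXLENGTH,file) != NULL){ //Scan data line by line and put into data structure
        st->top++;
        st->data[st->top].query[strcspn(st->data[st->top].query,"\n")] = '\0'; //Strip the trailing newline
        //printf("%s",st->data[st->top].query);
    }
}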

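void read(Stack* st)
{
    FILE *file;
    file = fopen("h.txt", "r");
    if(file == NULL){  //Nothing to read if the file cannot be opened
        perror("h.txt");
        return;
    }
    write(file,st);
    fclose(file);
}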

int main(){
    int i;
    Stack st;
    st.top = -1;

    read(&st);

    for(i = 0; i<= st.top; i++){  //DISPLAY DATA
        printf("%s\n",st.data[i].query); 
    } 
    getchar();  //Wait for a key press (fflush on stdin is undefined behavior, so it is dropped)
    return 0;
}

Upvotes: 0

sharon

Reputation: 734

First, open the file using fopen (the low-level open returns a file descriptor, which fgets cannot read from).

e.g.: fp=fopen (filename,"r");

Then read line by line using fgets.

e.g.: while(fgets(array,BUFSIZ,fp) != NULL)

After reading each line, store the data in the structure using the sscanf function.

e.g.: sscanf(array," %d %s", &var[i].id,var[i].name);

The data in the file will be loaded in the structure.
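Put together, the three steps might look like the sketch below. The struct record layout and the load_records name are assumptions made for illustration, chosen to match the " %d %s" format above. Since %s stops at whitespace, this fits single-word fields rather than the multi-word names and addresses in the question.

```c
#include <stdio.h>

/* Hypothetical record layout matching the " %d %s" format */
struct record {
    int  id;
    char name[32];
};

/* Read up to max records from filename into var.
   Returns the number stored, or -1 if the file cannot be opened. */
int load_records (const char *filename, struct record *var, int max)
{
    char array[BUFSIZ];                 /* one line of input            */
    int i = 0;

    FILE *fp = fopen (filename, "r");
    if (!fp)
        return -1;

    while (i < max && fgets (array, sizeof array, fp) != NULL) {
        /* keep the line only if both fields converted;
           %31s leaves room for the terminator in name[32] */
        if (sscanf (array, " %d %31s", &var[i].id, var[i].name) == 2)
            i++;
    }

    fclose (fp);
    return i;
}
```

Checking the sscanf return value means a malformed line is skipped rather than leaving a half-filled record in the array.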

Upvotes: 2
