user3691838

Reputation: 189

Understanding sscanf formatting

First things first, yes this is homework, yes I've been trying to figure it out on my own, yes I read similar questions on SO but haven't found the help I need.

I have a .txt file that I am reading into a struct and I am really struggling to understand the formatting for sscanf. I'm at a point now where I'm just trying any and everything, so I worry that if I do get it right it will be because I got lucky and not because I actually understand what I'm doing, which is where, hopefully, you fine folks could help.

Here is sample data from the .txt

4, Ben Appleseed, 1587 Apple Street, Salt Lake City, UT, 80514
2, Terri Lynn Smith, 1234 Slate Street, Cincinnati, OH, 45242

Note: Each 'field' is separated by a space, comma, or tab. Some of the entries don't have a middle name; otherwise each entry follows the same pattern. (If anyone has advice on how to handle the lines without all 8 fields, I'm open to help.)

This is my struct:

typedef struct
{
  long lngRecordID;
  char strFirstName[50];
  char strMiddleName[50];
  char strLastName[50];
  char strStreet[100];
  char strCity[50];
  char strState[50];
  char strZipCode[50];
} udtAddressType;

This is my routine to fill the structure

void AddAddressToArray(char strBuffer[], udtAddressType audtAddressList[])
{
  int intIndex = 0;

  for (intIndex = 0; intIndex < strBuffer[intIndex]; intIndex += 1)
  {

    if(sscanf(strBuffer, "%d, %s %s %s %s %s %s %s ",
        &audtAddressList[intIndex].lngRecordID,
        &audtAddressList[intIndex].strFirstName,
        &audtAddressList[intIndex].strMiddleName,
        &audtAddressList[intIndex].strLastName,
        &audtAddressList[intIndex].strStreet,
        &audtAddressList[intIndex].strCity,
        &audtAddressList[intIndex].strState,
        &audtAddressList[intIndex].strZipCode) != 8)
       {
          break;
       }
  }

}

That gives me an output of:

  Address #50 -------------------------
     Address ID:            4
     First Name:            Ben
     Middle Name:           Appleseed,
     Last Name:             1587
     Street Address:        Apple
     City:                  Street,
     State:                 Salt
     Zip Code:              Lake

And that's not right.

I don't understand how to specify that I want the three address fields to be on one line. And a lot of what I've been reading is just confusing me further.

Function to load the file into the array:

void PopulateAddressList( udtAddressType audtAddressList[])
{
  // Declare a file pointer
  FILE* pfilInput = 0;
  int intResultFlag = 0;
  char strBuffer[50] = "";
  char chrLetter = 0;
  int intIndex = 0;



  // Try to open the file for reading
  intResultFlag = OpenInputFile("c:\\temp\\Addresses1.txt", &pfilInput);

  // Was the file opened?
  if (intResultFlag == 1)
  {

      // Yes, read in records until end of file( EOF )
      while (feof(pfilInput) == 0)
      {
        // Read next line from file
        fgets(strBuffer, sizeof(strBuffer), pfilInput);

        AddAddressToArray(strBuffer, &audtAddressList[intIndex]);

        intIndex += 1;      
      }

    // Clean up
   fclose(pfilInput);
  }


}

Any help is much appreciated!

Upvotes: 1

Views: 1325

Answers (2)

ad absurdum

Reputation: 21314

Using feof() to control a file-reading loop is generally bad practice. feof() returns a true value only after a previous I/O operation has hit end-of-file and set the end-of-file indicator. A loop that tests feof() before reading can therefore run one more time after the last successful read, processing stale data, because the indicator is not set until a read attempts to go past the end of the file. Read more about this issue here.
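
To make the extra iteration concrete, here is a minimal sketch that is not part of the original program. The function names count_lines_feof and count_lines_fgets are made up for illustration, and the comparison assumes a text file whose last line ends with a newline.

```c
#include <assert.h>
#include <stdio.h>

/* Anti-pattern: feof() is tested before reading, so the body runs once
   more after the last line, when fgets() fails and leaves buf unchanged. */
size_t count_lines_feof(FILE *fp)
{
    char buf[256];
    size_t count = 0;

    assert(fp != NULL);
    while (feof(fp) == 0) {
        char *got = fgets(buf, sizeof buf, fp);
        (void)got;              /* result never checked: this is the bug */
        ++count;
    }
    return count;
}

/* Preferred: the read itself is the loop condition. */
size_t count_lines_fgets(FILE *fp)
{
    char buf[256];
    size_t count = 0;

    assert(fp != NULL);
    while (fgets(buf, sizeof buf, fp) != NULL) {
        ++count;
    }
    return count;
}
```

For a two-line file ending in a newline, the first function reports three "lines" and the second reports two. In the question's loop, that extra pass would hand the previous buffer contents to AddAddressToArray() a second time.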

You can accomplish your goal by using scansets in sscanf() format strings. For example, the scanset directive %[^,] will cause sscanf() to match any characters, storing them in the location indicated by the corresponding argument, until a , is reached. When this directive has completed, scanning of the input will resume with the comma, so a comma may need to be placed in the format string following this scanset directive to instruct sscanf() to match and ignore this comma in the input before the next assignment is attempted. Note that it is important to specify a maximum width when using %s or %[] directives with functions from the scanf() family to avoid buffer overflow.
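
As a small illustration of why the comma belongs in the format string, here is a sketch with two hypothetical helpers (scan_with_comma and scan_without_comma are names invented for this example, as is FIELD_SZ). Both return the number of assignments sscanf() made.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FIELD_SZ 50

/* "%49[^,]" stops at the comma; the literal ',' in the format consumes it,
   and the space after it skips the blank before the next field. */
int scan_with_comma(const char *line, char first[FIELD_SZ], char second[FIELD_SZ])
{
    assert(line != NULL);
    return sscanf(line, "%49[^,], %49[^,]", first, second);
}

/* Without the ',' in the format, the second scanset meets the comma
   immediately, matches nothing, and the scan stops after one assignment. */
int scan_without_comma(const char *line, char first[FIELD_SZ], char second[FIELD_SZ])
{
    assert(line != NULL);
    return sscanf(line, "%49[^,] %49[^,]", first, second);
}
```

Given the input "Salt Lake City, UT", scan_with_comma() returns 2 and stores "Salt Lake City" and "UT", while scan_without_comma() returns 1 and leaves the second buffer untouched. The width 49 leaves room for the terminating null in a 50-byte array.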

After obtaining the name as a string including (possibly three) components, this string can be further subdivided into first, middle, and last names if they are present.

Here is an example using this idea. Note that instead of feof(), the return value of fgets() is used to determine when all lines of the file have been read. The loop that reads the file may also be terminated if there are more than MAX_RECORDS entries.

When the line from the file is first scanned with sscanf(), the return value is checked. If there were not six assignments made, then input was not as expected. In such a case, the record counter is not incremented and if the line is empty (a newline as first character) it is simply skipped, otherwise an error message is printed before continuing.

After a successful scan of the line input buffer, name[] contains the full name from the record. Again sscanf() is used, this time with name[] as the input string. The return value is stored and used to determine how to store the strings contained in fname[], mname[], and lname[] (if applicable).

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ADDR_FILE    "addresses.txt"
#define MAX_RECORDS  1000
#define BUF_SZ       1000
#define NAME_SZ      1000

struct UdtAddress_t
{
  long lngRecordID;
  char strFirstName[50];
  char strMiddleName[50];
  char strLastName[50];
  char strStreet[100];
  char strCity[50];
  char strState[50];
  char strZipCode[50];
};

int main(void)
{
    FILE *fp = fopen(ADDR_FILE, "r");
    if (fp == NULL) {
        perror("Unable to open file");
        exit(EXIT_FAILURE);
    }

    struct UdtAddress_t records[MAX_RECORDS];

    /* Populate structure */
    size_t record_ndx = 0;
    char buffer[BUF_SZ];
    while (record_ndx < MAX_RECORDS &&
           fgets(buffer, sizeof buffer, fp) != NULL) {

        char name[NAME_SZ];
        if (sscanf(buffer, "%ld, %999[^,], %99[^,], %49[^,], %49[^,], %49s",
                   &records[record_ndx].lngRecordID,
                   name,
                   records[record_ndx].strStreet,
                   records[record_ndx].strCity,
                   records[record_ndx].strState,
                   records[record_ndx].strZipCode) != 6) {

            /* Skip empty lines and bad input */
            if (buffer[0] != '\n') {
                fprintf(stderr, "bad input line\n");
            }
            continue;
        }

        /* Break name into parts */
        char fname[50];
        char mname[50];
        char lname[50];
        int scan_ret = sscanf(name, "%49s %49s %49s", fname, mname, lname);

        strcpy(records[record_ndx].strFirstName, fname);

        switch(scan_ret) {
        case 2:
            strcpy(records[record_ndx].strMiddleName, "None");
            strcpy(records[record_ndx].strLastName, mname);
            break;
        case 3:
            strcpy(records[record_ndx].strMiddleName, mname);
            strcpy(records[record_ndx].strLastName, lname);
            break;
        default:
            strcpy(records[record_ndx].strMiddleName, "None");
            strcpy(records[record_ndx].strLastName, "None");            
        }

        ++record_ndx;
    }

    /* Finished with file */
    fclose(fp);

    /* Show address information */
    for (size_t i = 0; i < record_ndx; i++) {
        printf("Address %zu -----------------------\n", i+1);
        printf("\tAddress ID:            %ld\n", records[i].lngRecordID);
        printf("\tFirst Name:            %s\n", records[i].strFirstName);
        printf("\tMiddle Name:           %s\n", records[i].strMiddleName);
        printf("\tLast Name:             %s\n", records[i].strLastName);
        printf("\tStreet Address:        %s\n", records[i].strStreet);
        printf("\tCity:                  %s\n", records[i].strCity);
        printf("\tState:                 %s\n", records[i].strState);
        printf("\tZip Code:              %s\n", records[i].strZipCode);
        putchar('\n');
    }

    return 0;
}

Here is a test file and an output example:

4, Ben Appleseed, 1587 Apple Street, Salt Lake City, UT, 80514
2, Terri Lynn Smith, 1234 Slate Street, Cincinnati, OH, 45242
42, Cher, 4 Positive Street, Hollywood, CA, 99999

Address 1 -----------------------
    Address ID:            4
    First Name:            Ben
    Middle Name:           None
    Last Name:             Appleseed
    Street Address:        1587 Apple Street
    City:                  Salt Lake City
    State:                 UT
    Zip Code:              80514

Address 2 -----------------------
    Address ID:            2
    First Name:            Terri
    Middle Name:           Lynn
    Last Name:             Smith
    Street Address:        1234 Slate Street
    City:                  Cincinnati
    State:                 OH
    Zip Code:              45242

Address 3 -----------------------
    Address ID:            42
    First Name:            Cher
    Middle Name:           None
    Last Name:             None
    Street Address:        4 Positive Street
    City:                  Hollywood
    State:                 CA
    Zip Code:              99999

Upvotes: 2

xing

Reputation: 2508

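Use scansets to capture substrings. The directive %200[^,] scans everything that is not a comma, up to 200 characters, into char strSub[201]; (one extra byte for the terminating null). Then sscanf strSub again to split it into the name fields. The snippet assumes strSub and an int fields are declared earlier in the function.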

if(sscanf(strBuffer, "%ld, %200[^,], %99[^,], %49[^,], %49[^,],%49s",
    &audtAddressList[intIndex].lngRecordID,
    strSub,
    audtAddressList[intIndex].strStreet,
    audtAddressList[intIndex].strCity,
    audtAddressList[intIndex].strState,
    audtAddressList[intIndex].strZipCode) == 6)
{
    //sscanf the name parts; %49s keeps each one within its char[50] member
    fields = sscanf ( strSub, "%49s %49s %49s",
        audtAddressList[intIndex].strFirstName,
        audtAddressList[intIndex].strMiddleName,
        audtAddressList[intIndex].strLastName);
    if ( fields == 2) {//only two fields
        //assume that the second field was for last name so copy middle to last
        strcpy (
            audtAddressList[intIndex].strLastName,
            audtAddressList[intIndex].strMiddleName);
        //set middle as blank
        audtAddressList[intIndex].strMiddleName[0] = '\0';
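    }
    else if ( fields == 1) {//only a first name
        //nothing was stored in middle or last, so set both as blank
        audtAddressList[intIndex].strMiddleName[0] = '\0';
        audtAddressList[intIndex].strLastName[0] = '\0';
    }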
}
else {
    break;
}
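
For reference, the fragment above can be wrapped into a self-contained function. This is only a sketch: ParseAddressLine and its parameter names are hypothetical, the struct is copied from the question, and %ld is used because lngRecordID is a long. It returns 1 when the line parses and 0 otherwise, leaving the decision to skip or stop to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct
{
  long lngRecordID;
  char strFirstName[50];
  char strMiddleName[50];
  char strLastName[50];
  char strStreet[100];
  char strCity[50];
  char strState[50];
  char strZipCode[50];
} udtAddressType;

/* Hypothetical helper: parse one comma-separated line into *pudtAddress. */
int ParseAddressLine(const char strLine[], udtAddressType *pudtAddress)
{
  char strSub[201];   /* whole name field, split in a second pass */
  int fields;

  assert(strLine != NULL && pudtAddress != NULL);

  if (sscanf(strLine, "%ld, %200[^,], %99[^,], %49[^,], %49[^,], %49s",
        &pudtAddress->lngRecordID,
        strSub,
        pudtAddress->strStreet,
        pudtAddress->strCity,
        pudtAddress->strState,
        pudtAddress->strZipCode) != 6)
  {
    return 0;   /* blank or malformed line */
  }

  /* Start with blank middle and last names; sscanf fills what it finds. */
  pudtAddress->strMiddleName[0] = '\0';
  pudtAddress->strLastName[0] = '\0';

  fields = sscanf(strSub, "%49s %49s %49s",
        pudtAddress->strFirstName,
        pudtAddress->strMiddleName,
        pudtAddress->strLastName);

  if (fields == 2)
  {
    /* Two names: treat the second as the last name, not the middle. */
    strcpy(pudtAddress->strLastName, pudtAddress->strMiddleName);
    pudtAddress->strMiddleName[0] = '\0';
  }

  return 1;
}
```

A caller would read each line with fgets() and pass it in. The line buffer needs to be larger than the question's char strBuffer[50]: the first sample line alone is 62 characters, so a 50-byte buffer makes fgets() return it in two pieces.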

Upvotes: 4
