Electric Coffee

Reputation: 12104

sscanf fails on identically formatted input

I have a bit of a mystery on my hands. I have a file with 230 lines of data, split by blank lines into chunks of 6. Every nonempty line is formatted identically, yet sscanf fails on about half of them.

Here's the bit of code that fails:

void extract_data(const size_t match_count, FILE *league_file, match *matches) {
  int thousands, hundreds, i, current_round = 1, current_match = 0, scanned_items;
  char buffer[BUFFERSIZE];
  match temp_match;
  rewind(league_file);

  for (i = 0; fgets(buffer, BUFFERSIZE, league_file) != NULL; i++) {
    /* printf("read: %s\n", buffer); */
    scanned_items = sscanf(buffer, "%s %i/%i %i.%i %s - %s %i - %i %i.%i",
               temp_match.weekday, &temp_match.day, &temp_match.month,
               &temp_match.hour, &temp_match.minute,
               temp_match.home_team, temp_match.away_team,
               &temp_match.home_score, &temp_match.away_score,
               &thousands, &hundreds);
    temp_match.spectator_count = (thousands * 1000) + hundreds;
    temp_match.year =  temp_match.month >= JUL ? 2013 : 2014;
    temp_match.round = current_round;

    if (scanned_items != -1) /* for debugging purposes */
      printf("%3i round %2i %s %02i/%02i %02i.%02i %3s - %3s %i - %i %i\n",
         i + 1, current_round,
         temp_match.weekday, temp_match.day, temp_match.month,
         temp_match.hour, temp_match.minute,
         temp_match.home_team, temp_match.away_team,
         temp_match.home_score, temp_match.away_score,
         temp_match.spectator_count);

    /* if everything was successfully read, copy the temp onto the output array */
    if(scanned_items == 11) {
      matches[current_match] = temp_match;
      current_match++;
    }
    else if (scanned_items == -1) { /* if empty line is met */
      current_round++;
      printf("%3i\n", i + 1);
    }
    else { /* report how many items were successfully scanned */
      printf("    scanned items: %i\n", scanned_items);
    }
  }
}

Here's an excerpt from the file that's being read:

Fre     19/07 18.30     AGF - FCM      0 - 2     9.364   
Lor     20/07 17.00     VFF - RFC      2 - 2     4.771   
Son     21/07 14.00     OB - SDR       1 - 1     7.114   
Son     21/07 17.00     BIF - FCV      1 - 1     18.770  
Son     21/07 19.00     AAB - FCK      2 - 1     7.062   
Man     22/07 19.00     EFB - FCN      4 - 0     7.594   

Fre     26/07 18.30     FCN - VFF      1 - 1     5.067  
Lor     27/07 17.00     FCV - AGF      0 - 2     3.859   
Son     28/07 14.00     RFC - OB       1 - 1     4.852   
Son     28/07 17.00     SDR - BIF      1 - 0     5.700   
Son     28/07 19.00     FCM - FCK      1 - 0     8.759   
Man     29/07 19.00     EFB - AAB      1 - 2     9.517   

Fre     02/08 18.30     FCM - SDR      2 - 1     5.145  
Lor     03/08 17.00     AGF - FCN      2 - 1     6.997   
Son     04/08 14.00     OB - VFF       4 - 2     7.889   
Son     04/08 17.00     FCK - RFC      1 - 3     12.956  
Son     04/08 19.00     BIF - EFB      0 - 2     14.771  
Man     05/08 19.00     FCV - AAB      2 - 1     4.688  

And here's the debug output printed to the console (the first number is the line number, for cross-referencing with the file):

  1 round  1 Fre 19/07 18.30 AGF - FCM 0 - 2 9364
  2 round  1 Lor 20/07 17.00 VFF - RFC 2 - 2 4771
  3 round  1 Son 21/07 14.00  OB - SDR 1 - 1 7114
  4 round  1 Son 21/07 17.00 BIF - FCV 1 - 1 18770
  5 round  1 Son 21/07 19.00 AAB - FCK 2 - 1 7050
  6 round  1 Man 22/07 19.00 EFB - FCN 4 - 0 7594
  7
  8 round  2 Fre 26/07 18.30 FCN - VFF 1 - 1 5055
  9 round  2 Lor 27/07 17.00 FCV - AGF 0 - 2 3859
 10 round  2 Son 28/07 14.00 RFC -  OB 1 - 1 4852
 11 round  2 Son 28/07 17.00 SDR - BIF 1 - 0 5700
 12 round  2 Son 28/07 19.00 FCM - FCK 1 - 0 8759
 13 round  2 Man 29/07 19.00 EFB - AAB 1 - 2 9517
 14
 15 round  3 Fre 02/00 08.00 EFB - AAB 1 - 2 9517
    scanned items: 4
 16 round  3 Lor 03/00 08.00 EFB - AAB 1 - 2 9517
    scanned items: 4
 17 round  3 Son 04/00 08.00 EFB - AAB 1 - 2 9517
    scanned items: 4
 18 round  3 Son 04/00 08.00 EFB - AAB 1 - 2 9517
    scanned items: 4
 19 round  3 Son 04/00 08.00 EFB - AAB 1 - 2 9517
    scanned items: 4
 20 round  3 Man 05/00 08.00 EFB - AAB 1 - 2 9517
    scanned items: 4
 21

Despite everything being formatted identically, it still somehow manages to fail. Why is that? And how do I fix it?

Upvotes: 1

Views: 69

Answers (1)

user3121023

Reputation: 8286

With the %i specifier, the base is detected from the input: digits preceded by a zero, '0', are scanned as octal, values preceded by 0x or 0X are scanned as hexadecimal, and everything else is scanned as decimal.
For 01 to 07 the octal and decimal values are the same, so those months scanned without a problem.
When scanning 08, the zero was consumed, then the 8 was rejected because it is not a valid octal digit.
Zero was therefore assigned to the .month member, and the '8' was left in the input.
The next conversion was another %i. This time the '8' was scanned by itself as a decimal 8 and assigned to the .hour member.
The format then expected a '.', but the character following the '8' was a space, so sscanf stopped and returned 4, the number of items assigned up to that point.
Using the %d specifier instead removes the ambiguity: %d always scans decimal, so a leading zero does no harm.

Upvotes: 1
