Reputation: 12104
I'm sitting here with quite a bit of a mystery. I have a file with 230 lines of data, separated by blank lines into chunks of 6. Every nonempty line is formatted identically, yet the sscanf function fails half the time.
Here's the bit of code that fails:
void extract_data(const size_t match_count, FILE *league_file, match *matches) {
    int thousands, hundreds, i, current_round = 1, current_match = 0, scanned_items;
    char buffer[BUFFERSIZE];
    match temp_match;
    rewind(league_file);
    for (i = 0; fgets(buffer, BUFFERSIZE, league_file) != NULL; i++) {
        /* printf("read: %s\n", buffer); */
        scanned_items = sscanf(buffer, "%s %i/%i %i.%i %s - %s %i - %i %i.%i",
                               temp_match.weekday, &temp_match.day, &temp_match.month,
                               &temp_match.hour, &temp_match.minute,
                               temp_match.home_team, temp_match.away_team,
                               &temp_match.home_score, &temp_match.away_score,
                               &thousands, &hundreds);
        temp_match.spectator_count = (thousands * 1000) + hundreds;
        temp_match.year = temp_match.month >= JUL ? 2013 : 2014;
        temp_match.round = current_round;
        if (scanned_items != -1) /* for debugging purposes */
            printf("%3i round %2i %s %02i/%02i %02i.%02i %3s - %3s %i - %i %i\n",
                   i + 1, current_round,
                   temp_match.weekday, temp_match.day, temp_match.month,
                   temp_match.hour, temp_match.minute,
                   temp_match.home_team, temp_match.away_team,
                   temp_match.home_score, temp_match.away_score,
                   temp_match.spectator_count);
        /* if everything was successfully read, copy the temp onto the output array */
        if (scanned_items == 11) {
            matches[current_match] = temp_match;
            current_match++;
        }
        else if (scanned_items == -1) { /* if empty line is met */
            current_round++;
            printf("%3i\n", i + 1);
        }
        else { /* report how many items were successfully scanned */
            printf(" scanned items: %i\n", scanned_items);
        }
    }
}
Here's an excerpt from the file that's being read:
Fre 19/07 18.30 AGF - FCM 0 - 2 9.364
Lor 20/07 17.00 VFF - RFC 2 - 2 4.771
Son 21/07 14.00 OB - SDR 1 - 1 7.114
Son 21/07 17.00 BIF - FCV 1 - 1 18.770
Son 21/07 19.00 AAB - FCK 2 - 1 7.062
Man 22/07 19.00 EFB - FCN 4 - 0 7.594

Fre 26/07 18.30 FCN - VFF 1 - 1 5.067
Lor 27/07 17.00 FCV - AGF 0 - 2 3.859
Son 28/07 14.00 RFC - OB 1 - 1 4.852
Son 28/07 17.00 SDR - BIF 1 - 0 5.700
Son 28/07 19.00 FCM - FCK 1 - 0 8.759
Man 29/07 19.00 EFB - AAB 1 - 2 9.517

Fre 02/08 18.30 FCM - SDR 2 - 1 5.145
Lor 03/08 17.00 AGF - FCN 2 - 1 6.997
Son 04/08 14.00 OB - VFF 4 - 2 7.889
Son 04/08 17.00 FCK - RFC 1 - 3 12.956
Son 04/08 19.00 BIF - EFB 0 - 2 14.771
Man 05/08 19.00 FCV - AAB 2 - 1 4.688
And here's the debug output printed to the console (the first number is the line number, for cross-referencing with the file):
1 round 1 Fre 19/07 18.30 AGF - FCM 0 - 2 9364
2 round 1 Lor 20/07 17.00 VFF - RFC 2 - 2 4771
3 round 1 Son 21/07 14.00 OB - SDR 1 - 1 7114
4 round 1 Son 21/07 17.00 BIF - FCV 1 - 1 18770
5 round 1 Son 21/07 19.00 AAB - FCK 2 - 1 7050
6 round 1 Man 22/07 19.00 EFB - FCN 4 - 0 7594
7
8 round 2 Fre 26/07 18.30 FCN - VFF 1 - 1 5055
9 round 2 Lor 27/07 17.00 FCV - AGF 0 - 2 3859
10 round 2 Son 28/07 14.00 RFC - OB 1 - 1 4852
11 round 2 Son 28/07 17.00 SDR - BIF 1 - 0 5700
12 round 2 Son 28/07 19.00 FCM - FCK 1 - 0 8759
13 round 2 Man 29/07 19.00 EFB - AAB 1 - 2 9517
14
15 round 3 Fre 02/00 08.00 EFB - AAB 1 - 2 9517
scanned items: 4
16 round 3 Lor 03/00 08.00 EFB - AAB 1 - 2 9517
scanned items: 4
17 round 3 Son 04/00 08.00 EFB - AAB 1 - 2 9517
scanned items: 4
18 round 3 Son 04/00 08.00 EFB - AAB 1 - 2 9517
scanned items: 4
19 round 3 Son 04/00 08.00 EFB - AAB 1 - 2 9517
scanned items: 4
20 round 3 Man 05/00 08.00 EFB - AAB 1 - 2 9517
scanned items: 4
21
Despite everything being formatted identically, it still somehow manages to fail. Why is that? And how do I fix it?
Upvotes: 1
Views: 69
Reputation: 8286
With the %i specifier, digits preceded by a zero, '0', are scanned as octal values. Values preceded by 0x will be scanned as hexadecimal. Other values are scanned as decimal.
For 01 to 07 the octal and decimal values are the same, so those were scanned without a problem.
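The base rules above can be checked with a small sketch. The helper names scan_with_i and scan_with_d are hypothetical, made up for this illustration; only sscanf itself comes from the question.

```c
#include <stdio.h>

/* Hypothetical helper: parse one integer with %i, which picks the base from
 * the prefix (0 means octal, 0x means hexadecimal, anything else decimal).
 * Returns -1 if no integer could be read. */
int scan_with_i(const char *text) {
    int value = -1;
    if (sscanf(text, "%i", &value) != 1)
        return -1;
    return value;
}

/* Hypothetical helper: parse one integer with %d, which is always decimal. */
int scan_with_d(const char *text) {
    int value = -1;
    if (sscanf(text, "%d", &value) != 1)
        return -1;
    return value;
}
```

With these helpers, scan_with_i("07") and scan_with_d("07") both give 7, while scan_with_i("010") gives 8 and scan_with_i("0x10") gives 16.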
When scanning 08, the zero was scanned, then the 8 was rejected as not an octal character. Zero was assigned to the .month member.
The next scan was for another integer using %i. This time the '8' was scanned by itself as a decimal 8 and assigned to the .hour member.
The format indicated the next thing to scan was a '.', but the character following the '8' was a space, so sscanf stopped there and returned 4, the number of items it had assigned.
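To reproduce the count of four in isolation, here is a minimal sketch that applies the question's format string to a single line. The function name count_items_with_i is hypothetical, and the 16-byte buffers with %15s width limits are assumptions, since the question does not show the sizes of the string fields.

```c
#include <stdio.h>

/* Hypothetical helper: scan one line with the question's %i-based format and
 * return whatever sscanf returns. Buffer sizes and the %15s width limits are
 * assumptions added for safety; they are not in the original code. */
int count_items_with_i(const char *line) {
    char weekday[16], home_team[16], away_team[16];
    int day, month, hour, minute, home_score, away_score, thousands, hundreds;

    return sscanf(line, "%15s %i/%i %i.%i %15s - %15s %i - %i %i.%i",
                  weekday, &day, &month, &hour, &minute,
                  home_team, away_team, &home_score, &away_score,
                  &thousands, &hundreds);
}
```

Under these assumptions, the July line "Fre 19/07 18.30 AGF - FCM 0 - 2 9.364" yields 11, while the August line "Fre 02/08 18.30 FCM - SDR 2 - 1 5.145" yields 4, matching the debug output in the question.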
Using the %d specifier removes the ambiguity of the %i specifier when there is a chance that decimal values may be preceded by a zero.
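A possible version of the scan with %d in place of %i might look like the sketch below. The match_row type and parse_match_line function are hypothetical stand-ins for the question's match struct, whose definition is not shown, and the field sizes are assumed. The same octal rule also explains why the question's output shows 7050 for 7.062 and 5055 for 5.067: with %i, 062 and 067 were read as octal 50 and 55.

```c
#include <stdio.h>

/* Hypothetical record type; field names follow the question, sizes are assumed. */
typedef struct {
    char weekday[16], home_team[16], away_team[16];
    int day, month, hour, minute;
    int home_score, away_score, spectator_count;
} match_row;

/* Hypothetical helper: parse one line using %d so that leading zeros are
 * read as decimal. Returns 1 if all 11 items were scanned, 0 otherwise
 * (including for a blank line, where sscanf returns EOF). */
int parse_match_line(const char *line, match_row *out) {
    int thousands, hundreds;
    int scanned = sscanf(line, "%15s %d/%d %d.%d %15s - %15s %d - %d %d.%d",
                         out->weekday, &out->day, &out->month,
                         &out->hour, &out->minute,
                         out->home_team, out->away_team,
                         &out->home_score, &out->away_score,
                         &thousands, &hundreds);
    if (scanned != 11)
        return 0;
    /* Only compute the count once both parts are known to be valid. */
    out->spectator_count = thousands * 1000 + hundreds;
    return 1;
}
```

Computing spectator_count only after the item count has been checked also avoids using thousands and hundreds when sscanf did not assign them.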
Upvotes: 1