Reputation: 409
On a microcontroller, in order to avoid loading settings from a previous firmware build, I also store the compilation time, which is checked at loading.
The microcontroller project is built with 'mikroC PRO for ARM' from MikroElektronika.
Because it is easier to debug there, I wrote the code with MinGW on my PC and, after checking it left and right, put it into mikroC.
The code using that check failed to work properly. After an evening of frustrating debugging, I found sizeof("...")
yielding different values on the two platforms and causing a buffer overflow as a consequence.
But now I don't know whose fault it is.
To re-create the problem, use the following code:
#define SAVEFILECHECK_COMPILE_DATE __DATE__ " " __TIME__
char strA[sizeof(SAVEFILECHECK_COMPILE_DATE)];
char strB[] = SAVEFILECHECK_COMPILE_DATE;
printf("sizeof(#def): %d\n", (int)sizeof(SAVEFILECHECK_COMPILE_DATE));
printf("sizeof(strA): %d\n", (int)sizeof(strA));
printf("sizeof(strB): %d\n", (int)sizeof(strB));
On MinGW it returns (as expected):
sizeof(#def): 21
sizeof(strA): 21
sizeof(strB): 21
However, on 'mikroC PRO for ARM' it returns:
sizeof(#def): 20
sizeof(strA): 20
sizeof(strB): 21
This difference caused a buffer overflow down the line (overwriting byte zero of a pointer – ouch).
21 is the answer I expect: 20 chars and the '\0' terminator.
Is this one of the 'it depends' things in C or is there a violation of the sizeof
operator behavior?
Upvotes: 39
Views: 5299
Reputation: 213960
This is all 100% standardized. C17 6.10.8.1:
__DATE__
The date of translation of the preprocessing translation unit: a character string literal of the form "Mmm dd yyyy"
... and the first character of dd is a space character if the value is less than 10.
...
__TIME__
The time of translation of the preprocessing translation unit: a character string literal of the form "hh:mm:ss"
" " (the space you used for string literal concatenation)
Total: 11 + 8 + 1 + 1 (null terminator) = 21
As for sizeof, a string literal is an array. Whenever you pass a declared array to sizeof, the array does not "decay" into a pointer to the first element, so sizeof will report the size of the array in bytes. In case of string literals, this includes the null termination, C17 6.4.5:
In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence.
(Translation phase 6 is also mentioned, which is the string literal concatenation phase. I.e., string literal concatenation is guaranteed to happen before null termination is added.)
So it would appear that mikroC PRO is non-conforming/buggy. There are lots of questionable embedded systems compilers out there for sure.
Upvotes: 43
Reputation: 81179
As others have noted, the behavior of sizeof on a string literal has long been standardized as yielding a value one larger than the length of the string represented thereby, rather than the size of the smallest character array that could be initialized using that string literal. That having been said, if one wishes to make code compatible even with compilers that adopt the latter interpretation, I'd suggest using an expression like (1-(sizeof "")+(sizeof "stringLiteral of interest")), which would allow code to operate correctly with the quirky compilers, but avoid sacrificing compatibility with standard ones.
Upvotes: 11
Reputation: 224082
This is a compiler bug. String literals, whether they consist of a single quoted sequence or multiple adjacent quoted sequences, are stored as static arrays which always contain a terminating null byte. That's not happening here, where it should.
This is specified in section 6.4.5p6 of the C standard regarding string literals:
In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence.
This means that sizeof(SAVEFILECHECK_COMPILE_DATE) should count both the characters in the string and the terminating null byte, but the compiler for some reason isn't including the null byte.
Upvotes: 5
Reputation: 144780
Is this one of the 'it depends' things in C or is there a violation of the sizeof operator behavior?
The behavior is fully defined in the C Standard. Below are the relevant quotes from the C99 published standard, which were identical except for the section numbers in the C90 (ANSI C) version and have not been modified in essence in more recent versions up to and including the upcoming C23 version:
The __DATE__ and __TIME__ macros are specified by
6.10.8 Mandatory macros
__DATE__
The date of translation of the preprocessing translation unit: a character string literal of the form "Mmm dd yyyy", where the names of the months are the same as those generated by the asctime function, and the first character of dd is a space character if the value is less than 10. If the date of translation is not available, an implementation-defined valid date shall be supplied.
__TIME__
The time of translation of the preprocessing translation unit: a character string literal of the form "hh:mm:ss" as in the time generated by the asctime function. If the time of translation is not available, an implementation-defined valid time shall be supplied.
From the above, if the time of translation is available, the macro SAVEFILECHECK_COMPILE_DATE expands to 3 string literals for a total of 11+1+8 = 20 characters, hence 21 bytes including the null terminator. If the time of translation is not available, implementation-defined valid dates and times must be used, hence the behavior must be the same.
5.1.1.2 Translation phases
6. Adjacent string literal tokens are concatenated.
7. White-space characters separating tokens are no longer significant. Each preprocessing token is converted into a token. The resulting tokens are syntactically and semantically analyzed and translated as a translation unit.
Hence the fact that the argument to sizeof is made of 3 adjacent string literals is irrelevant: all occurrences of the sizeof operator in your examples get a single string literal argument in phase 7. Then:
6.5.3.4 The sizeof operator
4 When sizeof is applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1. When applied to an operand that has array type, the result is the total number of bytes in the array.
Therefore all 3 outputs in your example must show 21 bytes. You have found a bug in the mikroC compiler: you should report it and find a workaround for your current projects.
Upvotes: 16
Reputation: 10539
#include <stdio.h>
int main(){
printf("%zu\n", sizeof("aa"));
}
Interestingly, in this case "aa" does not decay to a pointer but acts as a char array. Since the array has 3 elements (including the zero terminator), the output is 3.
The following macro expands to a string literal (an array of char):
#include <stdio.h>
#define SAVEFILECHECK_COMPILE_DATE __DATE__ " " __TIME__
int main(){
printf("%zu\n", sizeof(SAVEFILECHECK_COMPILE_DATE));
}
The contents of the string are different every time you compile, because of __DATE__ and __TIME__. Both macros have a fixed-width format, however, so the result is 21 on every build.
The same is valid for C++.
Upvotes: -4