Reputation: 61
I was reading about the vulnerabilities in strings in C and then I came across this code. Could anyone give me an explanation why this is happening? Thanks in advance.
int main (int argc, char* argv[]) {
char a[16];
char b[16];
char c[32];
strncpy(a, "0123456789abcdef", sizeof(a));
strncpy(b, "0123456789abcdef", sizeof(b));
strncpy(c, a, sizeof(c));
printf("a = %s\n", a);
printf("b = %s\n", b);
printf("c = %s\n", c);
}
output:
a = 0123456789abcdef0123456789abcdef
b = 0123456789abcdef
c = 0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
Upvotes: 1
Views: 805
Reputation: 214168
strncpy is a dangerous function, because it only adds null termination if there is room left. This is what happens in your code: you copy exactly 16 bytes into a 16-byte array, so there is no room left for the terminator.
The strncpy function was actually never intended to be used for C strings, but for an ancient Unix string format that didn't use null termination. It is a function that should be avoided for most purposes. In particular, it is not a "safe version of strcpy" but a more dangerous function than strcpy, as we can see from the bugs here.
The solution is to check the size to copy in advance, before you copy. And then use strcpy. For example:
char a[16];
const char to_copy[] = "0123456789abcdef";
_Static_assert(sizeof(to_copy) <= sizeof(a), "to_copy is too big");
strcpy(a, to_copy);
To fix your current program, you need to allocate room for the null terminator, like this:
#include <string.h>
#include <stdio.h>
int main (void)
{
char a[16+1];
char b[16+1];
char c[32+1];
const char to_copy[] = "0123456789abcdef";
_Static_assert(sizeof(to_copy) <= sizeof(a), "to_copy is too big");
_Static_assert(sizeof(to_copy) <= sizeof(b), "to_copy is too big");
strcpy(a, to_copy);
strcpy(b, to_copy);
strcpy(c, a);
printf("a = %s\n", a);
printf("b = %s\n", b);
printf("c = %s\n", c);
}
Upvotes: 1
Reputation: 126857
The n in strncpy does not mean the same as the n in strncat or snprintf; strncpy was born to manipulate fixed-size buffer strings such as directory entries, so it copies at most n characters, filling the unused ones with NULs (= byte 0 = '\0' = null character = ...), but if there are no spare ones it does not add any NUL. Hence, the target of strncpy is not necessarily going to be NUL-terminated, so if you try to manipulate it as a C string you are in for some surprises.
This is exactly what happens in this case. Your a and b buffers are exactly as long as the string you are copying into them; strncpy isn't terminating them with a NUL, so when the third strncpy or the later printf calls try to read from them, the result is anybody's guess (read: it's undefined behavior, so anything can happen), as there's no NUL stopping them from reading on into unrelated memory.
As for the particular output you are getting, it depends on how exactly a, b and c are laid out in memory (indeed, on my machine I get different results), on how exactly strncpy is written (as it's not meant to be invoked on overlapping strings) and on how exactly the optimizer decided to mangle your code (remember: reading outside the boundaries is undefined behavior, so the optimizer is allowed to assume it never happens when rearranging your code).
A possible explanation of the actual behavior you are seeing is that c, a and b are laid out consecutively in memory, in this order, and the rest of the stack happens to be NUL-filled (here I'm using ° as a placeholder for a NUL):
°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°...
c                               a               b               ?
So what happens should be something like:
1. 0123456789abcdef is copied into a, without any NUL termination, as it reached the maximum allowed number of characters (16).
°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°0123456789abcdef°°°°°°°°°°°°°°°°°°°°...
c                               a               b               ?
2. 0123456789abcdef is copied into b, without any NUL termination (same as before).
°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°0123456789abcdef0123456789abcdef°°°°...
c                               a               b               ?
3. a is copied into c; as a is not NUL-terminated, strncpy goes on happily reading straight into b's space, copying the full 32 characters it is allowed to copy. As it reached 32 characters, no NUL is written.
0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef°°°°...
c                               a               b               ?
4. a is printed; as it's not NUL-terminated, printf goes on reading the memory that follows, namely b, thus printing 32 characters.
5. b is printed; it hasn't been explicitly NUL-terminated, but the memory after it happens to contain a NUL, so printf stops after the 16 characters that have been copied.
6. c is printed; as it's not NUL-terminated, printf goes on reading over the full length of c, a and b (at whose end lies a NUL, which stops the print), printing 64 characters.
Remember: this is just a possible explanation of the output you show. It's not necessarily right, nor is it of course contractual (on my machine I obtain different output depending on the compilation flags, and even on different runs, depending on what happens to be on the stack at startup).
Upvotes: 4
Reputation: 8475
Reading beyond the end of the string is undefined behavior (UB). With UB there are no guarantees that the code will behave in one way or the other. The behavior can differ on different systems, compilers, linkers, compilation/linking flags, depending on (seemingly) unrelated code, and the version of all of the above.
On many systems, variables sit consecutively on the stack, in reverse order. Replace your printf calls with:
printf("a (%p) = %s\n", (void *)a, a);
printf("b (%p) = %s\n", (void *)b, b);
printf("c (%p) = %s\n", (void *)c, c);
It prints the addresses of the arrays:
a (0x7fff559adad0) = 0123456789abcdef<F0>ښU<FF>
b (0x7fff559adac0) = 0123456789abcdef0123456789abcdef<F0>ښU<FF>
c (0x7fff559adaa0) = 0123456789abcdef<F0>ښU<FF>
As evident from the addresses, the printout of b starts at address 0x7fff559adac0 but continues well into the memory of a (which starts 16 bytes after the start of b).
Also note that the strings have junk at the end. The junk is there because the '\0' terminator is missing from the string, and printf goes on to read whatever follows (UB in its own right).
This happens because:
strncpy(a, "0123456789abcdef", sizeof(a));
sets a[] to have all its bytes equal to "0123456789abcdef" without a null terminator. Without a '\0', printf does not know where to stop, which results in UB.
strncpy(b, "0123456789abcdef", sizeof(b));
also sets b[] to have all its bytes equal to "0123456789abcdef" without a null terminator. Here also, any printf causes UB. But this time instead of random junk, it simply reads the next string.
To add insult to injury, the line
strncpy(c, a, sizeof(c));
reads 32 bytes from a 16-byte array. This is also UB. On your (and my) system it reads a and a lot of junk after it. In theory, this can crash your program with an access violation or a segmentation fault.
Some viruses and worms use such overflows to read or write data they are not supposed to.
Upvotes: 1