Reputation: 7134
The following programs are abbreviated just to keep this question short (no checks for null, etc.).
program1.c
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *aString = calloc(10, sizeof(char));
    printf("Enter string: ");
    scanf("%s", aString);
    printf("You typed in %s\n", aString);
    return 0;
}
program2.c
#include <stdio.h>

int main(void)
{
    char aString[10];
    printf("Enter string: ");
    scanf("%s", aString);
    printf("You typed in %s\n", aString);
    return 0;
}
program1.c will let me enter characters seemingly forever. I've entered 2000+ characters and the program will execute without error, despite the fact that this is "undefined behavior".
program2.c will let me enter more than 10 characters, but if I get close to 30 or 40 characters, it will give me a segmentation fault.
Now my limited understanding from class and other tutorials tells me that both of these programs are doing the same thing under the hood --- setting aside a piece of memory intended to be an array of chars of length 10. But it seems that program2.c's implementation provides some degree of safety. Or is the segmentation fault error completely random when you exceed the granted memory space, and I just happen to be getting it with program2.c just because that's the mood my computer is in right now?
What is the difference between program1.c and program2.c, and which is the "safer" method of entering a string? I realize there are other methods which may be even better, but I'm curious about the comparison between just these two.
Upvotes: 1
Views: 936
Reputation: 140748
Assuming a typical modern operating system, your program 1 does not crash because calloc had to request an entire page (4096 bytes of RAM, usually) from the OS to satisfy the request for 10 bytes. If you feed that program sufficiently many characters, it will crash. However, writing even one byte more than the overtly requested size (10 bytes) is forbidden, and has an excellent chance of corrupting the internal data structures used to keep track of "heap" allocations. It is probable that if you added another malloc or free call to this program, after the scanf, it would crash inside that malloc or free. By way of illustration, consider this program:
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *p = malloc(23);
    memcpy(p, "abcdefghijklmnopqrstuvwx", 25); /* writes 2 bytes past the end */
    char *q = malloc(1); /* this call detects the corruption and aborts */
    (void)q;
    return 0;
}
$ MALLOC_CHECK_=1 ./a.out
*** Error in `./a.out': malloc: top chunk is corrupt: 0x0000000001bc4020 ***
(On this system, copying only 24 bytes does not crash. Do not rely on this information.)
Program 2, meanwhile, is probably crashing not because the scanf call wrote all the way to unmapped memory (which, for similar reasons, would require far more bytes of input) but because data on the stack is very densely packed and it clobbered something critical, e.g. the address to which main should return.
In a program that does anything even a little more complicated than your examples, both "techniques" are equally dangerous -- both heap and stack overflows can and have led to catastrophic security holes.
You explicitly asked for a comparison between your two unsafe techniques, but for the benefit of future readers I am going to describe two much better techniques for reading strings from standard input. If your C library includes it, the best option is getline, which (in a simple program like this) would be used like so:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *line = 0;
    size_t n = 0;
    ssize_t r;

    fputs("Enter a string: ", stdout);
    fflush(stdout);
    r = getline(&line, &n, stdin);
    if (r == -1) {
        perror("getline");
        return 1;
    }
    if (r > 0 && line[r-1] == '\n')
        line[r-1] = '\0';
    printf("You entered %s\n", line);
    free(line);
    return 0;
}
If you don't have getline, and you need to read an arbitrarily long string from the user, your best option is to implement getline yourself (gnulib has an implementation you can borrow, if your code can be released under the GPL). But an acceptable alternative in many cases is to place an upper limit on input length, at which point you can use fgets:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE_LEN 81   /* 79 characters + newline + terminating NUL */

int main(void)
{
    char *line = malloc(MAX_LINE_LEN);
    size_t n;

    fputs("Enter a string: ", stdout);
    fflush(stdout);
    if (!fgets(line, MAX_LINE_LEN, stdin)) {
        perror("fgets");
        return 1;
    }
    n = strlen(line);
    if (n > 0 && line[n-1] == '\n')
        line[n-1] = '\0';           /* strip the newline */
    else {
        fprintf(stderr, "string too long - %d characters max\n",
                MAX_LINE_LEN - 2);
        return 1;
    }
    printf("You entered %s\n", line);
    free(line);
    return 0;
}
Notes:

- sizeof(char) == 1 by definition; therefore, sizeof(char) should never appear in well-written code. If you want to use calloc to allocate a prezeroed array of characters, write calloc(1, nchars).
- Never use %s without a field width in scanf, fscanf, or sscanf.
- Do not confuse fgets with gets. fgets is safe if used correctly; it is impossible to use gets safely.

Upvotes: 2
Reputation: 137467
The two are not the same under the hood.
program1 is calling calloc to allocate memory from the heap.
program2 has been compiled to reserve additional space on the stack when the function is called.
Both programs are exploitable because you are not checking any bounds when you call scanf(). The user is free to write as many bytes as they wish to either buffer. The solution here is scanf("%9s", aString), which tells scanf to write at most 9 characters plus the terminating NUL (10 bytes in total).
Upvotes: 2
Reputation: 35580
Although neither program is safe, the likely reason you are seeing the behavior is that program 2 allocates the array on the stack, and as soon as you go out of bounds you are overwriting other useful things, like the stack frame of the scanf call.
Program 1, on the other hand, allocates heap memory, and since you are not doing anything else in this toy program, the memory you are writing into is unused.
In any real program, both are equally unsafe.
Note: This out-of-bounds behavior is undefined by the C standard, and the compiler could theoretically be doing anything. But in most common real-world compilers, the above is most likely what actually happens.
Upvotes: 2