guptashark

Reputation: 119

How does C treat Buffer overflows?

I understand that in C, arrays can be given a length at declaration. I want to know whether those length declarations are simply there for other programmers to see and understand, or whether the compiler can be made to protect the code by refusing to read more characters than the buffer can hold. When I read in a string, it simply keeps going and starts to overwrite data stored in variables that are declared after the buffer I want to read into. Are there safe ways to read in data?

char arr[5];
char buff[5] = "cat";
printf("The buffer holds: %s\n", buff);
printf("Input a word to be held in \"arr\": ");

scanf("%s", arr);

printf("The array holds:  %s\n", arr);
printf("The buffer holds: %s\n", buff);
printf("%c\n", arr[9]);

If the string read into arr is long enough, "cat" is overwritten, and none of the compile flags seem to do anything (I compile with -Wextra -Wall -Werror -std=c99). The only thing that complains is valgrind. How do I write safe array code in C?

Upvotes: 0

Views: 832

Answers (4)

Harry

Reputation: 11638

C does not protect you from going past the end of an array. There are ways to detect it, though. See this post: Setting up a bounds-protected array

Try this code:

#include <stdio.h>
#include <stdlib.h>

#define ARRAY_SIZE 100

int main(void) {
  size_t i = 0;
  char   arr1[ARRAY_SIZE];            /* array on the stack */
  char * arr2 = malloc(ARRAY_SIZE);   /* array on the heap */

  /* Deliberately write past the end of both arrays (indices 100..199). */
  for(i = 0; i < 200; i++) {
    arr1[i] = '1';
    arr2[i] = '2';
  }

  /* Deliberately read past the end of both arrays as well. */
  for(i = 0; i < 200; i++) {
    printf("%zu arr1[i]=%c  \n", i, arr1[i]);
    printf("%zu arr2[i]=%c  \n", i, arr2[i]);
  }
  return 0;
}

Compile it with the following options (this only works with gcc, i.e. clang does not give errors):

gcc -O3 -Wall -std=c11 -pedantic array_overflow_at_03.c

Then try it without optimization:

gcc -Wall -std=c11 -pedantic array_overflow_at_03.c

Each method has its merits; your application's needs determine which one to use.

Upvotes: 0

Paul Ogilvie

Reputation: 25266

An array size in C only tells the compiler how much memory to reserve for the array. C will not insert code to check whether you go beyond the array boundary. The size '5' in int a[5]; is not stored anywhere in the compiled program. It is only in the source code. Other programmers who can see the source code can see it; no one else can.
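To illustrate the point about the size living only in the source code, here is a minimal sketch. The names sum_ints and ARRAY_LEN are made up for this example. Inside the scope where the array is declared, sizeof can compute its length at compile time, but a function that receives the array only sees a pointer, so the caller has to pass the length along.

```c
#include <stddef.h>

/* Hypothetical macro: number of elements in a true array (not a pointer).
   sizeof is evaluated at compile time, so nothing is stored at run time. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: the callee receives only a pointer, so the
   element count must be passed explicitly by the caller. */
long sum_ints(const int *values, size_t count) {
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];   /* stays in bounds only if count is correct */
    }
    return total;
}
```

With int a[5] = {1, 2, 3, 4, 5}; a call such as sum_ints(a, ARRAY_LEN(a)) passes the length 5 explicitly. If the caller passes a count that is too large, nothing in the compiled program can notice.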

As C does not check what you do or hold your hand (see Lyle Rolleman's answer), C won't "detect" a buffer overrun. The behavior when this happens is undefined (so-called "Undefined Behavior", or UB). What often happens is that the stack is overwritten, and on the stack is the return address to the caller. Once this is overwritten, when the current function wants to return, it jumps to "nowhere" (or somewhere, as this behavior is used in "stack exploits" by hackers who carefully overwrite the stack so the jump is to "their-where").

Upvotes: 0

Keith Thompson

Reputation: 263177

In a sense, the C language itself neither protects you nor fails to protect you from going beyond the bounds of an array. More precisely, a C compiler is not required to perform bounds checking, but it's permitted to do so. (Few compilers take advantage of that permission. Very few do so by default.)

For example, if you write:

int arr[10];
arr[20] = 42;

the behavior is undefined. That doesn't mean that your program will crash. It doesn't mean that the error will or will not be detected. It is, to quote the ISO C Standard,

behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

A typical C compiler will probably generate code that takes the base address of arr, adds an offset of 20 * sizeof (int) to it, and then attempts to store 42 at the resulting location. Without explicit or implicit checks, this could clobber some other data structure, it could write to memory that's owned by your process but not used for anything else, or it could terminate your program. (Or #include <stdjoke.h> it could make demons fly out of your nose.)

But a conforming C compiler could add code to check that the index is in the range 0 to 9, and take some sensible action if it isn't. C doesn't forbid bounds checking; it just doesn't require it.

And in this particular case, it's possible (but not required) to detect at compile time that the array access is out of bounds, so a compiler could issue a compile-time warning. (This isn't possible if the index value isn't known until run time.)

Ultimately, the responsibility for avoiding out-of-bounds accesses falls on you, the programmer. Don't assume that the compiler will check it for you -- and don't assume that it won't.
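As a rough sketch of what taking that responsibility can look like, the check a compiler is allowed (but not required) to insert can be written by hand. The function checked_store below is a hypothetical helper for illustration, not part of the standard library, and it assumes the caller knows the real length of the array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not a standard function: stores value at
   arr[index] only if index is within [0, len). Returns true on success;
   returns false and writes nothing if the index is out of range. */
bool checked_store(int *arr, size_t len, size_t index, int value) {
    if (arr == NULL || index >= len) {
        return false;
    }
    arr[index] = value;
    return true;
}
```

With int arr[10]; the call checked_store(arr, 10, 20, 42) reports failure instead of invoking undefined behavior. The check is only as good as the length the caller passes in.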

Upvotes: 3

Lyle Rolleman

Reputation: 195

C follows the philosophy of "the programmer knows best" and "I ain't holding your hand".

This is part of why C is so fast: it doesn't have to do any checks.

For safe user input, you can use fgets, something along the lines of:

fgets(arr, sizeof(arr), stdin);

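arr will hold at most sizeof(arr) - 1 characters of input plus a terminating null character, so fgets never writes past the end of the array. If there is room, the newline is stored too. For further information, I recommend the man page for fgets: http://linux.die.net/man/3/fgets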

You may need to make multiple calls of this in order to get all the input from stdin.
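A minimal sketch of that idea follows. The function read_line and its truncated flag are made up for this example, and discarding the overflow is just one possible policy. It reads one line with fgets, strips the newline, and throws away whatever did not fit, so the next read starts on a fresh line.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into buf (capacity size).
   The stored string is always null-terminated and never longer than
   size - 1 characters. If the line was too long, the excess is read and
   discarded, and *truncated is set to true.
   Returns false on end-of-file or read error with nothing read. */
bool read_line(FILE *in, char *buf, size_t size, bool *truncated) {
    *truncated = false;
    if (size == 0 || fgets(buf, (int)size, in) == NULL) {
        return false;
    }
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';   /* whole line fit; drop the newline */
        return true;
    }
    /* No newline stored: either the line was too long or input ended. */
    int c;
    while ((c = getc(in)) != '\n' && c != EOF) {
        *truncated = true;     /* discard the rest of the overlong line */
    }
    return true;
}
```

For the question's char arr[5]; this would be called as read_line(stdin, arr, sizeof arr, &truncated), with truncated declared as a bool. If you want to stay with scanf, a field width such as scanf("%4s", arr) also limits the read to four characters plus the terminating null.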

Upvotes: 1
