Dmytro

Reputation: 5213

Is it possible to check a priori in C whether dereferencing arbitrary memory will crash the program?

I want to write an interactive interpreted C shell that allows you to address arbitrary memory and perform commands on these memory addresses.

E.g. (running the shell):

prompt> 10 bytes starting 0x400000 

This instruction would try to access address 0x400000 and show 10 bytes starting there, i.e. the range [0x400000, 0x400009].

and would produce output like:

{0x00, 0x01, 0x02, 0x03, 0x04, <bad>, <bad>, 0x07, 0x08, 0x09}

Where "bad" would indicate an attempt to address "illegal" memory.

I want to know if there is a standard way in C to check if the program is allowed to access the memory I am attempting to access, or if accessing this memory will cause the program to crash(before it actually crashes), and report information to the user that the program is not allowed to access that memory.

I ask this because most questions on this topic tend to be answered by "you can't definitely check if a pointer is valid", but I am sure that there must be some way to check if the pointer is at least "definitely invalid and will crash" or "possibly invalid, but won't crash", and unfortunately I can't find the answer to this question.

Thanks ahead of time.

Upvotes: 2

Views: 607

Answers (4)

Daniel Jour

Reputation: 16156

As already said, we're far off from "standard C" here.

Nonetheless, you can (somewhat) accomplish that by handling segmentation faults. Of course there's a library for that: GNU libsigsegv

Upvotes: 1

jforberg

Reputation: 6762

I don't think there is any way to do this using just standard C.

However, you can use evil platform-specific tricks to get an idea of how your memory mappings look. On Linux, the file /proc/(pid)/maps lists the memory mappings of process pid, including their read/write/execute permissions. This is how it looks for a simple cat process on my machine:

00400000-0040c000 r-xp 00000000 00:13 1237228                            /usr/bin/cat
0060b000-0060c000 r--p 0000b000 00:13 1237228                            /usr/bin/cat
0060c000-0060d000 rw-p 0000c000 00:13 1237228                            /usr/bin/cat
01864000-01885000 rw-p 00000000 00:00 0                                  [heap]
7fe7a5e0b000-7fe7a6121000 r--p 00000000 00:13 1487092                    /usr/lib/locale/locale-archive
7fe7a6123000-7fe7a62ba000 r-xp 00000000 00:13 1486770                    /usr/lib/libc-2.23.so
7fe7a62ba000-7fe7a64ba000 ---p 00197000 00:13 1486770                    /usr/lib/libc-2.23.so
7fe7a64ba000-7fe7a64be000 r--p 00197000 00:13 1486770                    /usr/lib/libc-2.23.so
7fe7a64be000-7fe7a64c0000 rw-p 0019b000 00:13 1486770                    /usr/lib/libc-2.23.so
7fe7a64c0000-7fe7a64c4000 rw-p 00000000 00:00 0 
7fe7a64cb000-7fe7a64ee000 r-xp 00000000 00:13 1486769                    /usr/lib/ld-2.23.so
7fe7a66cc000-7fe7a66ee000 rw-p 00000000 00:00 0 
7fe7a66ee000-7fe7a66ef000 r--p 00023000 00:13 1486769                    /usr/lib/ld-2.23.so
7fe7a66ef000-7fe7a66f0000 rw-p 00024000 00:13 1486769                    /usr/lib/ld-2.23.so
7fe7a66f0000-7fe7a66f1000 rw-p 00000000 00:00 0 
7fe7a66f5000-7fe7a66f8000 rw-p 00000000 00:00 0 
7ffe398e8000-7ffe39909000 rw-p 00000000 00:00 0                          [stack]
7ffe3999b000-7ffe3999e000 r--p 00000000 00:00 0                          [vvar]
7ffe3999e000-7ffe399a0000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

So from this you can see that the program image itself is mapped near the beginning of virtual memory, the heap is slightly higher up, the stack is mapped to 7ffe398e8000-7ffe39909000 and the C library and dynamic linker are also loaded into memory.

Note that each file is mapped several times. For instance, /usr/bin/cat has a read-only, an executable and a read-write segment. This is to prevent processes from writing to const memory and from executing data.

From the mapping table you could get a fair idea of how your memory is laid out and what operations would be possible on these parts of memory.
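To make that concrete, here is a rough, Linux-only sketch of a lookup against the mapping table. The function name addr_is_readable is made up for illustration, and the parsing relies only on the "start-end perms" prefix visible in the listing above. Bear in mind that the table is a snapshot: a mapping can change between the check and the actual access.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if addr lies inside a mapping that has
 * read permission, 0 if it does not, and -1 if /proc/self/maps could not
 * be opened. Linux-specific. */
int addr_is_readable(uintptr_t addr)
{
    FILE *f = fopen("/proc/self/maps", "r");
    char line[512];
    int found = 0;

    if (!f)
        return -1;

    while (fgets(line, sizeof line, f)) {
        unsigned long start, end;
        char perms[5];

        /* Every entry begins with "start-end perms",
         * e.g. "00400000-0040c000 r-xp". Anything else is skipped. */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) != 3)
            continue;

        /* The end address is exclusive. */
        if (addr >= start && addr < end) {
            found = (perms[0] == 'r');
            break;
        }
    }

    fclose(f);
    return found;
}
```

A shell like the one in the question could call this once per byte (or, better, once per page) and print <bad> whenever it returns 0. Very long path names could overflow the 512-byte line buffer and be read in two pieces, so treat this as a starting point rather than a robust parser.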

Is this a good idea? NO.

Most likely not unless you are writing a debugger or similar development tool.


As an aside, the "shell" you are thinking about writing sounds very much like a debugger. Debuggers such as gdb can do the things you talk about, including evaluating C expressions and examining memory.


As a second aside, and because I find this very interesting, here is a small exercise:

As you can see there is some kernel memory mapped at ffffffffff600000. If this theory is correct, we should be able to read that memory even though in general we can't access the kernel's memory. Let's try:

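#include <stdio.h>

int main(void)
{
  /* Start of the [vsyscall] mapping from the listing above. */
  unsigned long *p = (unsigned long *)0xffffffffff600000UL;

  /* Runs until the read walks off the end of the mapping and faults. */
  for (;;)
    printf("0x%lx, ", *p++);
}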

We get

0xf00000060c0c748, 0xccccccccccccc305, 0xcccccccccccccccc, ... Segmentation fault

If you wonder why this memory is readable to a user space process, it is to accelerate certain syscalls such as gettimeofday and allow them to work without having to switch to kernel mode as other syscalls have to. See e.g. this question.

Upvotes: 3

a3f

Reputation: 8657

Standard way? No. What you experience as a crash is a result of undefined behavior. The standard doesn't delve into such details.

The Windows API provides an IsBadReadPtr function that seems to be what you are after. The documentation is quite clear that you shouldn't use it.

The thing you overlooked is that there are some invalid accesses you just can't recover from. If you touch a guard page and catch the error without giving the guard-page access handler a chance to run, you have missed your chance: the next time you access the same address, you get an access violation and a core dump, although in normal execution the access would have been fine. See Raymond Chen's "IsBadXxxPtr should really be called CrashProgramRandomly".

On Unix, you can have the kernel do the dirty work for you by passing the pointer to write(2). If the call fails with errno set to EFAULT, dereferencing the pointer yourself would have crashed your process.
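A minimal sketch of the write(2) trick, assuming a POSIX system. The name probe_byte is made up, and writing into a throwaway pipe is just one possible choice of sink:

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the kernel could read one byte at
 * addr on our behalf, 0 if it reported EFAULT, and -1 on any other error. */
int probe_byte(const void *addr)
{
    int fds[2];
    int result;

    if (pipe(fds) != 0)
        return -1;

    /* write() copies from addr inside the kernel. A bad address makes the
     * call fail with errno == EFAULT instead of raising SIGSEGV. */
    if (write(fds[1], addr, 1) == 1)
        result = 1;
    else if (errno == EFAULT)
        result = 0;
    else
        result = -1;

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

Creating a pipe per probe is wasteful. A real tool would probably keep one sink open and probe whole pages at a time.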

Note that while it looks like you are checking a priori, you're really checking the aftermath. Checking in advance isn't reliable (the mapping might change between the check and the actual access).

If you want to get notified after the failure, write a signal handler for SIGSEGV on Unix. On Windows, deal with the EXCEPTION_ACCESS_VIOLATION SEH exception.
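For the Unix side, one common pattern is to pair the signal handler with sigsetjmp/siglongjmp so the faulting read can be abandoned. This is a sketch under the assumption of a single-threaded POSIX program. The names try_read_byte and on_fault are made up, and jumping out of a SIGSEGV handler is only reasonable for a deliberate probe like this one:

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

static sigjmp_buf probe_env;

static void on_fault(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);   /* unwind back into try_read_byte() */
}

/* Hypothetical helper: tries to read one byte at addr. Returns 1 and
 * stores the byte in *out on success, 0 if the access raised SIGSEGV
 * or SIGBUS. Not thread-safe: it uses a global jump buffer and swaps
 * process-wide signal handlers. */
int try_read_byte(const volatile unsigned char *addr, unsigned char *out)
{
    struct sigaction sa, old_segv, old_bus;
    volatile int ok = 0;

    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    /* Passing 1 saves the signal mask, so siglongjmp restores it and
     * SIGSEGV is not left blocked after the handler is abandoned. */
    if (sigsetjmp(probe_env, 1) == 0) {
        *out = *addr;
        ok = 1;
    }

    /* Put the previous handlers back so real bugs still crash normally. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return ok;
}
```

As with the other approaches, this reports what happened on this particular access, not whether the next one will succeed.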


Addendum: What you want to do sounds a bit like what mmbbq did. It injected a Lua interpreter into an external application and allowed calling and dereferencing addresses. If you fudged it up, only the newly started thread was affected and the program itself continued working (for a while at least). The website isn't online anymore, but maybe you will be able to find a mirror.

Upvotes: 2

R Sahu

Reputation: 206707

I want to know if there is a standard way in C to check if the program is allowed to access the memory I am attempting to access, or if accessing this memory will cause the program to crash(before it actually crashes), and report information to the user that the program is not allowed to access that memory.

No, there isn't.

The standard only says that you can't dereference a null pointer. Beyond that, the range of valid pointer values depends on the platform. What you are hoping to pull off cannot be done in platform-independent code.

From footnote 87 in the C99 Standard:

Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime.

Upvotes: 2
