Siqi Lin

Reputation: 1257

Portable way to check if a char* pointer is a null-terminated string

I have a C function that takes a char* pointer. One of the function's preconditions is that the pointer argument points to a null-terminated string:

void foo(char *str) {
    size_t length = strlen(str);  // strlen returns size_t, not int
    // ...
}

If str isn't a pointer to a null-terminated string, then strlen crashes. Is there a portable way to ensure that a char* pointer really does point to a null-terminated string?

I was thinking about using VirtualQuery to find the lowest address after str that's not readable. If there is no null terminator between the beginning of str and that address, then str doesn't point to a null-terminated string.

Upvotes: 8

Views: 26426

Answers (5)

Scott Milano

Reputation: 51

The best you can do is put an upper bound on the size of the string with the strn* functions (such as strnlen). So if you are writing a library call and don't trust the caller, document that strings passed to it cannot exceed a specific, reasonable size, and check:

#define MAXNAME 32

if (strnlen(sketchyName,MAXNAME)==MAXNAME) return ERROR;
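Since strnlen comes from POSIX rather than ISO C, the same bounded check can be sketched with a plain loop. The name checked_name_length and the -1 error value are hypothetical, and the sketch assumes the caller passes either a terminated string or a buffer with at least MAXNAME readable bytes:

```c
#include <stddef.h>

#define MAXNAME 32

/* Hypothetical helper: returns the length of name if a '\0' is found
   within the first MAXNAME bytes, or -1 if name is NULL or no
   terminator appears in that range. Never reads past name[MAXNAME - 1]. */
int checked_name_length(const char *name)
{
    if (name == NULL)
        return -1;
    for (size_t i = 0; i < MAXNAME; i++) {
        if (name[i] == '\0')
            return (int)i;
    }
    return -1;
}
```

This only bounds the scan; it still cannot tell whether a buffer shorter than MAXNAME bytes was passed without a terminator.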

Upvotes: 2

Adrian McCarthy

Reputation: 48012

The other answers are correct, but here's another way of thinking about it.

If the pointer points to a buffer of n chars, none of which are '\0', then as soon as you try to examine the (n + 1)th character, you're in the realm of undefined behavior. So, to scan for a '\0', it's not enough to know some upper bound on where the buffer ends; you have to know exactly where it ends.

C doesn't give you a way to know that, other than to require that the caller provide it to you. VirtualQuery (assuming it were portable) is not enough because there may be other objects immediately after the buffer in memory. While it might appear to work on many implementations, the fact that you're relying on undefined behavior means that it's necessarily non-portable.
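If the caller does provide the buffer size, the check becomes well defined. A minimal sketch, assuming a hypothetical helper is_terminated_within and a buf_size that the caller guarantees is the true size of the buffer:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if the first buf_size bytes of buf contain
   a '\0'. Only bytes inside the caller-declared bounds are examined,
   so no undefined behavior occurs as long as buf_size is accurate. */
bool is_terminated_within(const char *buf, size_t buf_size)
{
    if (buf == NULL || buf_size == 0)
        return false;
    return memchr(buf, '\0', buf_size) != NULL;
}
```

For example, is_terminated_within("abcd", 4) is false because the terminator of the literal sits at index 4, while is_terminated_within("abcd", 5) is true.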

Upvotes: 2

Keith Thompson

Reputation: 263497

No, there is no portable way to do that. A null-terminated string can be arbitrarily long (up to SIZE_MAX bytes) -- and so can a char array that isn't null-terminated. A function that takes a char* argument has no way of knowing how big a chunk of valid memory it points to, if any. A check would have to traverse memory until it finds a null character, which means that if there is no null character in the array, it will go past the end of it, causing undefined behavior.

That's why the standard C library functions that take string pointers as arguments have undefined behavior if the argument doesn't point to a string. (Checking for a NULL pointer would be easy enough, but that would catch only one error case at the cost of slower execution for valid arguments.)

EDIT: Responding to your question's title:

"Portable way to check if a char* pointer is a null-terminated string"

A pointer cannot be a string. It may or may not be a pointer to a string.

Upvotes: 12

abligh

Reputation: 25129

As others have pointed out, there is no portable way to do this. The reason is that it isn't useful.

Normal semantics are to check for a NULL pointer only, and to assume that any non-NULL pointer passed in is valid. After all, there's likely to be a null character ('\0') somewhere after your pointer. The only other possibility is that you run into unmapped memory. It's more likely, however, that even with a bogus pointer you will find a null character. That means a bogus 2000-character string will still get past the check.

Upvotes: 0

danielschemmel

Reputation: 11126

To prove null termination of a string, you don't just have to prove that a null char exists, you have to prove that it exists at exactly the right spot (no later, but also no earlier). To do that you need to know the intended content or at least length of the string, at which point it is very simple to do the verification...
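When the intended length is known, that verification might look like the following sketch. The name terminated_exactly_at is hypothetical, and the code assumes buf has at least expected_len + 1 readable bytes:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true only if the first '\0' in buf sits exactly
   at index expected_len (no earlier, no later).
   Assumes buf has at least expected_len + 1 readable bytes. */
bool terminated_exactly_at(const char *buf, size_t expected_len)
{
    if (buf == NULL)
        return false;
    /* A '\0' inside the first expected_len bytes is too early. */
    if (memchr(buf, '\0', expected_len) != NULL)
        return false;
    /* The terminator must be at exactly this position. */
    return buf[expected_len] == '\0';
}
```

For example, terminated_exactly_at("abc", 3) is true, while terminated_exactly_at("abc", 2) is false because index 2 holds 'c'.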

Consider, for example, a device without virtual memory: there you can iterate over the whole address space without triggering any kind of interrupt.

If your stack is at a higher address than the heap and your compiler puts a copy of '\0' on the stack (instead of only keeping it in a register or using it as an immediate value), you are suddenly guaranteed that any string on the heap will be weakly zero-terminated in the sense that you will always be able to consider the '\0' that your verification code put on the stack as the zero-terminator.

Upvotes: 2
