DevSolar

Reputation: 70333

C11 Annex K: "objects that overlap"

There is a phrase that keeps popping up in Annex K of the C standard (bounds-checking interfaces):

... copying shall not take place between objects that overlap.

Consider, for example, strcpy_s( char * restrict s1, rsize_t s1max, char const * restrict s2 ), in which s1max specifies the maximum capacity of s1 to enable the bounds checking.

What exactly would be "the object" s1 at this point, which must not overlap with "the object" s2?

Would that be...

or

If it is the former, I wonder about the lack of consistency, as I do not know the size of the buffer that is s2, and would have to apply a different definition of "the object".

If it is the latter, I wonder whether it breaks "the promise" that is given, as the source string and the eventual (post-copy) destination string could conceivably overlap if the source string is longer than the string originally held in the destination.

What is the intention / the intended definition of "object" here?

Upvotes: 4

Views: 1719

Answers (2)

Lundin

Reputation: 214365

This is found everywhere in the standard, not just in the optional bounds-checking interface, but also in mandatory library functions such as strcpy. The bounds-checking interface functions merely inherited the very same text.

The formal definition of an object is:

3.15
object
region of data storage in the execution environment, the contents of which can represent values

Based on this, a string has to be the whole array including the null terminator. A function such as strcpy would break if the null terminator were somehow overwritten during the copy, so the terminator has to be regarded as part of the (array) object.

There seems to be no definition of the term "overlap", but the intention is fairly clear: to prevent situations such as this:

  char str[] = "foobar";
  strcpy(str+3,str);

where one possible implementation of strcpy would be while(*dst++ = *src++){}. This breaks because the copy overwrites the source's null terminator before the loop reads it, so the loop never hits a terminator and we end up writing out of bounds.
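The overlap in the example above can be detected without performing the copy. The following is only a sketch: ranges_overlap and strcpy_args_overlap are hypothetical helper names, not standard functions, and the check assumes a flat address space in which converting pointers to uintptr_t gives comparable addresses, which ISO C does not guarantee.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true if the byte ranges [a, a+alen) and
   [b, b+blen) share at least one byte. Relational comparison of
   pointers into different objects is undefined in ISO C, so the
   addresses are compared as integers (assumes a flat address space). */
static bool ranges_overlap(const void *a, size_t alen,
                           const void *b, size_t blen)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;

    if (alen == 0 || blen == 0)
        return false;
    /* Distance between the starts must be smaller than the length of
       the range that begins first. */
    return (pa < pb) ? (pb - pa < alen) : (pa - pb < blen);
}

/* Hypothetical helper: would strcpy(dst, src) copy between overlapping
   regions? Both the bytes read and the bytes written span
   strlen(src) + 1 characters, terminator included. */
static bool strcpy_args_overlap(const char *dst, const char *src)
{
    size_t n = strlen(src) + 1;
    return ranges_overlap(dst, n, src, n);
}
```

With char str[] = "foobar", strcpy_args_overlap(str + 3, str) reports an overlap, so a caller could refuse the copy or fall back to memmove.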

Notably, you already promise the compiler that the parameters don't overlap when you pass them to a function expecting restrict pointers. The text in the standard regarding overlaps being undefined just makes it clearer still.

In the strcpy example, no lvalue access through dst is allowed to modify what src points at; otherwise we violate the definition of restrict (C17 6.7.3) and thereby invoke undefined behavior.

This is, as far as I know, always the programmer's responsibility. No compiler I know of gives diagnostic messages for restrict violations on the caller-side.

Upvotes: 0

I believe the intent is such that the s1max characters starting from s1 must not overlap any of the characters in s2 including the null terminator. K.3.7.1.3p5 says that:

  All elements following the terminating null character (if any) written by strcpy_s in the array of s1max characters pointed to by s1 take unspecified values when strcpy_s returns. [418]

with the footnote 418 saying that

  This allows an implementation to copy characters from s2 to s1 while simultaneously checking if any of those characters are null. Such an approach might write a character to every element of s1 before discovering that the first element should be set to the null character.

However, Microsoft's documentation says that "if source and dest overlap, the behavior is undefined", which hints that in fact anything could happen in that case. This seems to negate the usefulness of the bounds-checking interface.
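To make the interpretation above concrete (all s1max characters starting at s1 versus the characters of s2 including its terminator), here is a minimal sketch. sketch_strcpy_s is a hypothetical stand-in, not the Annex K function: it uses size_t and int instead of rsize_t and errno_t, omits the RSIZE_MAX check, returns nonzero instead of calling a constraint handler, drops the restrict qualifiers so that it can be called with overlapping arguments without undefined behavior, and assumes a flat address space for the address comparison.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch of a strcpy_s-like function. Returns 0 on
   success and nonzero if one of the checks fails. */
static int sketch_strcpy_s(char *s1, size_t s1max, const char *s2)
{
    if (s1 == NULL || s1max == 0)
        return 1;
    if (s2 == NULL) {
        s1[0] = '\0';
        return 1;
    }

    /* Length of s2, never looking at more than s1max characters. */
    size_t len = 0;
    while (len < s1max && s2[len] != '\0')
        len++;
    if (len == s1max) {            /* s2 plus terminator does not fit */
        s1[0] = '\0';
        return 1;
    }

    /* Overlap check: the whole s1max-character destination array
       against the len + 1 characters of the source string.
       Assumes uintptr_t values are comparable addresses. */
    uintptr_t d = (uintptr_t)s1;
    uintptr_t s = (uintptr_t)s2;
    int overlap = (d < s) ? (s - d < s1max) : (d - s < len + 1);
    if (overlap) {
        s1[0] = '\0';
        return 1;
    }

    memcpy(s1, s2, len + 1);       /* regions are known to be disjoint */
    return 0;
}
```

Under this reading, a call such as sketch_strcpy_s(buf, sizeof buf, buf + 8) is rejected even when the copied string would end before buf + 8, because the destination object is taken to be the full s1max characters rather than only the characters actually written.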

Upvotes: 3
