Reputation:
I'm looking to implement a function that determines whether a given pointer points into a given buffer. The specification:
template <typename T>
bool points_into_buffer (T *p, T *buf, std::size_t len);
If there is some n
, 0 <= n && n < len
, for which p == buf + n
, returns true
.
Otherwise, if there is some n
, 0 <= n && n < len * sizeof(T)
, for which reinterpret_cast<char *>(p) == reinterpret_cast<char *>(buf) + n
, the behaviour is undefined.
Otherwise, returns false
.
The obvious implementation would look something like
template <typename T>
bool points_into_buffer (T *p, T *buf, std::size_t len) {
return p >= buf && p < buf + len;
}
but that has undefined behaviour in standard C++: relational comparisons of pointers are only defined for pointers into the same array.
An alternative would be to use the standard library's comparer objects:
template <typename T>
bool points_into_buffer (T *p, T *buf, std::size_t len) {
return std::greater_equal<T *>()(p, buf) && std::less<T *>()(p, buf + len);
}
which is guaranteed to return true
when I want it to return true
, and avoids undefined behaviour, but allows for false positives: given int a; int b;
, it allows a result of true
for points_into_buffer(&a, &b, 1)
.
It can be implemented as a loop:
template <typename T>
bool points_into_buffer (T *p, T *buf, std::size_t len) {
for (std::size_t i = 0; i != len; i++)
if (p == buf + i)
return true;
return false;
}
However, compilers have trouble optimising away that loop.
Is there a valid way of writing this, where with current compilers and optimisations enabled, the result is determined in constant time?
Upvotes: 15
Views: 319
Reputation:
As far as I can tell, this is a portable implementation of the function I'm after for all possible implementations:
#ifdef UINTPTR_MAX
bool points_into_buffer(std::uintptr_t p, std::uintptr_t buf, std::size_t len)
{
const auto diff = p + 0u - buf;
if (diff < len)
// #1
if (reinterpret_cast<char *>(p) == reinterpret_cast<char *>(buf) + diff)
return true;
for (std::size_t n = 0; n != len; n++)
if (reinterpret_cast<char *>(p) == reinterpret_cast<char *>(buf) + n)
// #2
if (reinterpret_cast<char *>(p) - reinterpret_cast<char *>(buf) != diff)
return true;
return false;
}
template <typename T>
bool points_into_buffer(T *p, T *buf, std::size_t len)
{
return points_into_buffer(reinterpret_cast<std::uintptr_t>(p),
reinterpret_cast<std::uintptr_t>(buf),
len * sizeof(T));
}
#else
template <typename T>
bool points_into_buffer(T *p, T *buf, std::size_t len)
{
for (std::size_t n = 0; n != len; n++)
if (p == buf + n)
return true;
return false;
}
#endif
In general, diff
is not guaranteed to have a meaningful value. But that's okay: the function returns true
if and only if it finds some n
such that reinterpret_cast<char *>(p) == reinterpret_cast<char *>(buf) + n
. It only uses diff
as a hint to find the value of n
faster.
It relies on the compiler optimising conditions that are not necessarily known at compile time in general, but are known at compile time for a particular platform. The conditions for the if
statements marked as #1
and #2
are determined by GCC at compile time to always be true
and false
respectively, because of how diff
is defined, allowing GCC to see that no useful action is performed inside the loop, and allowing the entire loop to be dropped.
The generated code for points_into_buffer<char>
and points_into_buffer<int>
looks like:
bool points_into_buffer(char*, char*, unsigned int): movl 4(%esp), %edx movl $1, %eax movl 12(%esp), %ecx subl 8(%esp), %edx cmpl %edx, %ecx ja L11 xorl %eax, %eax L11: rep ret bool points_into_buffer(int*, int*, unsigned int): movl 4(%esp), %edx movl 12(%esp), %eax subl 8(%esp), %edx leal 0(,%eax,4), %ecx movl $1, %eax cmpl %edx, %ecx ja L19 xorl %eax, %eax L19: rep ret
On systems where std::uintptr_t
is not available, or where addresses are more complicated than simple integers, the loop is used instead.
Upvotes: 8
Reputation: 1862
If you cast the pointers to large enough unsigned integers and add number of bytes instead of number of objects, the undefined behavior goes away.
template <typename T>
bool points_into_buffer (T *p, T *buf, std::size_t len) {
uintptr_t ip = (uintptr_t)p;
uintptr_t ibuf = (uintptr_t)buf;
return ip >= ibuf && ip < (ibuf + sizeof(T) * len);
}
This code doesn't detect if p is not correctly aligned, but you could easily add a test with a %.
Upvotes: 1