Reputation: 477
The following simple program makes AddressSanitizer report "invalid-pointer-pair" when built with the latest master build of Clang (it doesn't happen with the latest official release, 15):
#include <filesystem>
#include <iostream>
constexpr std::string_view str = "/usr";
const std::filesystem::path path{str};
int main() {
std::cout << "Path: " << path << std::endl;
}
What happens:
$ clang++ -fsanitize=address,pointer-compare,pointer-subtract -g -O0 -std=c++20 repro-asan.cpp -stdlib=libc++ -o repro-asan
$ export ASAN_OPTIONS="detect_invalid_pointer_pairs=2"
$ ./repro-asan
=================================================================
==3183863==ERROR: AddressSanitizer: invalid-pointer-pair: 0x563a8315eb41 0x563a82764600
error: failed to decompress '.debug_aranges', zlib is not available
error: failed to decompress '.debug_info', zlib is not available
error: failed to decompress '.debug_abbrev', zlib is not available
error: failed to decompress '.debug_line', zlib is not available
error: failed to decompress '.debug_str', zlib is not available
error: failed to decompress '.debug_line_str', zlib is not available
error: failed to decompress '.debug_loclists', zlib is not available
error: failed to decompress '.debug_rnglists', zlib is not available
#0 0x563a8274d818 in bool std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::__addr_in_range[abi:v170000]<char const&>(char const&) const /home/me/dev/llvm/build/bin/../include/c++/v1/string:1979:23
#1 0x563a8274d418 in std::__1::enable_if<__is_cpp17_forward_iterator<char const*>::value, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&>::type std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::append[abi:v170000]<char const*>(char const*, char const*) /home/me/dev/llvm/build/bin/../include/c++/v1/string:2805:14
#2 0x563a8274d264 in std::__1::enable_if<__is_cpp17_forward_iterator<char const*>::value, void>::type std::__1::__fs::filesystem::_PathCVT<char>::__append_range[abi:v170000]<char const*>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&, char const*, char const*) /home/me/dev/llvm/build/bin/../include/c++/v1/__filesystem/path.h:316:12
#3 0x563a8274cfcd in void std::__1::__fs::filesystem::_PathCVT<char>::__append_source[abi:v170000]<std::__1::basic_string_view<char, std::__1::char_traits<char>>>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&, std::__1::basic_string_view<char, std::__1::char_traits<char>> const&) /home/me/dev/llvm/build/bin/../include/c++/v1/__filesystem/path.h:331:5
#4 0x563a8274cb2e in std::__1::__fs::filesystem::path::path[abi:v170000]<std::__1::basic_string_view<char, std::__1::char_traits<char>>, void>(std::__1::basic_string_view<char, std::__1::char_traits<char>> const&, std::__1::__fs::filesystem::path::format) /home/me/dev/llvm/build/bin/../include/c++/v1/__filesystem/path.h:483:5
#5 0x563a82676438 in __cxx_global_var_init /home/me/dev/tmp/repro-asan.cpp:6:29
#6 0x563a82676474 in _GLOBAL__sub_I_repro_asan.cpp /home/me/dev/tmp/repro-asan.cpp
#7 0x7fca55896eba in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29eba) (BuildId: 69389d485a9793dbe873f0ea2c93e02efaa9aa3d)
#8 0x563a826764a4 in _start (/home/me/dev/tmp/repro-asan+0x1f4a4)
0x563a8315eb41 is located 1 bytes inside of global variable 'path' defined in '/home/me/dev/tmp/repro-asan.cpp:6' (0x563a8315eb40) of size 24
0x563a82764600 is located 32 bytes before global variable '.str.2' defined in '/home/me/dev/llvm/build/bin/../include/c++/v1/string:1984' (0x563a82764620) of size 13
'.str.2' is ascii string 'basic_string'
0x563a82764600 is located 0 bytes inside of global variable '.str.1' defined in '/home/me/dev/tmp/repro-asan.cpp:4' (0x563a82764600) of size 5
'.str.1' is ascii string '/usr'
SUMMARY: AddressSanitizer: invalid-pointer-pair /home/me/dev/llvm/build/bin/../include/c++/v1/string:1979:23 in bool std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::__addr_in_range[abi:v170000]<char const&>(char const&) const
==3183863==ABORTING
Looking at the source code of the library, there's indeed a comparison between two pointers: https://github.com/llvm/llvm-project/blob/7f12efa88e17548d98f3e7425687f4afe0df34ed/libcxx/include/string#L1979
My understanding is that these pointers belong to different objects, meaning their comparison is undefined in C++. So I'm wondering whether I'm doing something wrong (misusing the standard library) or whether this is a problem with the library itself.
System: Ubuntu 22.04 LTS
Upvotes: 4
Views: 240
Reputation: 76829
The problematic function is testing whether an element lies in a range with the following (from https://github.com/llvm/llvm-project/blob/main/libcxx/include/string#L1979):
return data() <= __p && __p <= data() + size();
where __p is the pointer under consideration and [data(), data() + size()] is the range (both bounds are inclusive, since the check uses <= on each side).
"My understanding is that these pointers belong to different objects, meaning their comparison is undefined in C++."
It is not undefined, just unspecified. The comparison doesn't have to be consistent, but the operation must result in true or false, and the rest of the execution must be consistent with whichever result it produced.
The standard only says that the comparison is unspecified. On the common architectures that libc++ targets, however, memory has a linear layout and the comparison of addresses forms a total order, so the check is actually fine. The exception is constant evaluation, where such a comparison is forbidden, which is why the linked function has a specific branch for that case. In principle, compilers could also choose the result of these operations arbitrarily, e.g. for optimization purposes, but no compiler I am aware of does that. And libc++, being part of the implementation, may rely on such guarantees even if they are not made explicitly.
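To make the "linear layout" assumption concrete, here is a minimal sketch of a range check that compares the integer values of the addresses instead of the pointers themselves. The helper name in_range_by_address is made up for illustration and is not part of libc++. It assumes a flat address space in which the pointer-to-integer conversion preserves address order; that conversion is implementation-defined, so this is not portable in the strict sense.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not libc++ code): checks whether p lies in
// [first, first + n] by comparing the integer values of the addresses.
// Assumption: a flat address space where converting a pointer to
// std::uintptr_t preserves address order (implementation-defined).
inline bool in_range_by_address(const char* first, std::size_t n, const char* p) {
    assert(first != nullptr && p != nullptr);  // precondition of this sketch
    const auto lo = reinterpret_cast<std::uintptr_t>(first);
    const auto hi = reinterpret_cast<std::uintptr_t>(first + n);
    const auto x  = reinterpret_cast<std::uintptr_t>(p);
    return lo <= x && x <= hi;  // inclusive upper bound, like the libc++ check
}
```

Because the comparison is done on integers, it involves no relational operator on unrelated pointers at all.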
ASAN's invalid-pointer-pair check is not a check for undefined behavior here. It only flags unspecified behavior that is very likely unintentional, and at least non-portable. In this case, however, the use seems intentional to me, relying on the assumption that < forms a strict total order over pointer addresses on the given compiler/architecture.
Libc++ could have used std::less instead of the relational operators directly. std::less is specified to actually guarantee a total order of all pointer addresses. However, the standard library has to make the same assumptions to implement std::less anyway, and on a usual platform it will simply be implemented with the < operator itself. So I don't think this would be a big improvement, and I don't know whether ASAN would recognize that the diagnostic must be suppressed in this case either.
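For illustration, a range check written with the standard function objects instead of the built-in operators could look roughly like the sketch below. The helper name in_range_total_order is hypothetical and this is not what libc++ does. Whether ASAN stays silent for it is not something I have verified.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical helper, not libc++'s actual implementation.
// Reports whether p points into [first, first + n] using std::less_equal,
// whose pointer specializations are required to be consistent with an
// implementation-defined strict total order over pointers.
inline bool in_range_total_order(const char* first, std::size_t n, const char* p) {
    assert(first != nullptr && p != nullptr);  // precondition of this sketch
    const std::less_equal<const char*> le{};
    return le(first, p) && le(p, first + n);
}
```

Unlike the raw <= operator, the result here is specified by the standard even when p points into a different object than first.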
In the end, I guess there is simply a suppression missing here somewhere.
Upvotes: 2