ThomasF
ThomasF

Reputation: 1

How does the C++ algorithm library check the range of the output and not create segfaults when it is smaller than the range of the input?

I was just curious how some of the C++ algorithms check the range of the result/output container when you only provide the range of the input? For example, for the following code

#include <iostream>
#include <algorithm>
#include <vector>

int main()
{
  std::vector<int> a = {4, 2, 1, 7};
  std::vector<int> b = {1, 2};
  
  std::copy(a.begin(), a.end(), b.begin());
  for(auto val : b)
    std::cout << val << std::endl;
}

with the output:

4
2

I don't understand how the algorithm knows that the capacity of the output container b is 2. I would have expected it assumes just the same range as the input container and therefore generates some kind of segmentation fault.

Upvotes: 0

Views: 166

Answers (1)

prehistoricpenguin
prehistoricpenguin

Reputation: 6326

The copy algorithm doesn't check the iterator range, your program has heap-buffer-overflow issues.

The implementation for copy is straightforward, just a simple loop, you can view libcxx's implementation here.

It's a general memory issue, thanks to the great memory sanitizer tool in the compiler, we can quickly locate the problem. Though your program doesn't crash with SEGSEV now, it does have memory buffer overflow,if this is a large program, it is likely to cause other code to crash(maybe in a different thread, different library), and it is difficult to troubleshoot the cause since the crash code is just a victim.

Build with -fsanitize=address, then we get a clear report after running the program:

==227==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000038 at pc 0x7f548b0f003d bp 0x7ffd43cf2ea0 sp 0x7ffd43cf2648                                                                            WRITE of size 16 at 0x602000000038 thread T0                                                                                                                                                                           #0 0x7f548b0f003c in __interceptor_memmove (/lib/x86_64-linux-gnu/libasan.so.5+0xa103c)                                                                                                                            #1 0x55afc64f43b5 in int* std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<int>(int const*, int const*, int*) (/tmp/stackoverflow/a.out+0x43b5)                                            #2 0x55afc64f3eeb in int* std::__copy_move_a<false, int*, int*>(int*, int*, int*) (/tmp/stackoverflow/a.out+0x3eeb)                                                                                                #3 0x55afc64f390c in __gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > > std::__copy_move_a2<false, __gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, __gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > > >(__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, __gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, __gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >) (/tmp/stackoverflow/a.out+0x390c)                                                                                          #4 0x55afc64f322e in __gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > > std::copy<__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, __gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > > >(__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, __gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, __gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >) (/tmp/stackoverflow/a.out+0x322e)                                                                                                           #5 0x55afc64f2834 in main (/tmp/stackoverflow/a.out+0x2834)                                                                                                                                                        #6 0x7f548ac880b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)                                                                                                                                   #7 0x55afc64f238d in _start (/tmp/stackoverflow/a.out+0x238d)

0x602000000038 is located 0 bytes to the right of 8-byte region [0x602000000030,0x602000000038)
allocated by thread T0 here:
    #0 0x7f548b15e947 in operator new(unsigned long) (/lib/x86_64-linux-gnu/libasan.so.5+0x10f947)
    #1 0x55afc64f469c in __gnu_cxx::new_allocator<int>::allocate(unsigned long, void const*) (/tmp/stackoverflow/a.out+0x469c)
    #2 0x55afc64f431d in std::allocator_traits<std::allocator<int> >::allocate(std::allocator<int>&, unsigned long) (/tmp/stackoverflow/a.out+0x431d)
    #3 0x55afc64f3d35 in std::_Vector_base<int, std::allocator<int> >::_M_allocate(unsigned long) (/tmp/stackoverflow/a.out+0x3d35)
    #4 0x55afc64f3582 in void std::vector<int, std::allocator<int> >::_M_range_initialize<int const*>(int const*, int const*, std::forward_iterator_tag) (/tmp/stackoverflow/a.out+0x3582)
    #5 0x55afc64f2d98 in std::vector<int, std::allocator<int> >::vector(std::initializer_list<int>, std::allocator<int> const&) (/tmp/stackoverflow/a.out+0x2d98)
    #6 0x55afc64f27bf in main (/tmp/stackoverflow/a.out+0x27bf)
    #7 0x7f548ac880b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

SUMMARY: AddressSanitizer: heap-buffer-overflow (/lib/x86_64-linux-gnu/libasan.so.5+0xa103c) in __interceptor_memmove
Shadow bytes around the buggy address:
  0x0c047fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c047fff8000: fa fa 00 00 fa fa 00[fa]fa fa fa fa fa fa fa fa
  0x0c047fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):

Memory sanitizer is a general memory tool. When focusing on the STL iterator debugging, the STL libraries do have some helpful utils:

For Visual Studio 2019, add #define _ITERATOR_DEBUG_LEVEL 1 into your code, or add it in the project configuration. We get exception when run your code, the call stack:

>   iterator_check.exe!std::_Vector_const_iterator<std::_Vector_val<std::_Simple_types<int>>>::_Verify_offset(const __int64 _Off) Line 113  C++
    iterator_check.exe!std::_Get_unwrapped_n<std::_Vector_iterator<std::_Vector_val<std::_Simple_types<int>>> &,__int64>(std::_Vector_iterator<std::_Vector_val<std::_Simple_types<int>>> & _It, const __int64 _Off) Line 1318  C++
    iterator_check.exe!std::copy<std::_Vector_iterator<std::_Vector_val<std::_Simple_types<int>>>,std::_Vector_iterator<std::_Vector_val<std::_Simple_types<int>>>>(std::_Vector_iterator<std::_Vector_val<std::_Simple_types<int>>> _First, std::_Vector_iterator<std::_Vector_val<std::_Simple_types<int>>> _Last, std::_Vector_iterator<std::_Vector_val<std::_Simple_types<int>>> _Dest) Line 3813  C++
    iterator_check.exe!main() Line 13   C++
    [External Code] 

We can see that an extra check function _Verify_offset is called, then the error is caught.

For libstdc++ from gcc, there is also similar debugging mode.: compiler flag -D_GLIBCXX_DEBUG

Build your code with compiler flag -D_GLIBCXX_DEBUG under GCC 9 and run it, we get an error report:

/usr/include/c++/9/bits/stl_algobase.h:471:
In function:
    _OI std::copy(_II, _II, _OI) [with _II =
    __gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<int*,
    std::__cxx1998::vector<int, std::allocator<int> > >,
    std::__debug::vector<int>, std::random_access_iterator_tag>; _OI =
    __gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator<int*,
    std::__cxx1998::vector<int, std::allocator<int> > >,
    std::__debug::vector<int>, std::random_access_iterator_tag>]

Error: attempt to subscript a dereferenceable (start-of-sequence) iterator 4
step from its current position, which falls outside its dereferenceable
range.

Objects involved in the operation:
    iterator "__result" @ 0x0x7fff292a8dd0 {
      type = __gnu_cxx::__normal_iterator<int*, std::__cxx1998::vector<int, std::allocator<int> > > (mutable iterator);
      state = dereferenceable (start-of-sequence);
      references sequence with type 'std::__debug::vector<int, std::allocator<int> >' @ 0x0x7fff292a8e70
    }
Aborted

Which is helpful too!

For libcxx from LLVM, there are also something similar. But the document is somewhat not complete, you can give it a try.

Be careful that all the tools have performance degressions, so it's only useful in the test build and not suitable for a production build.

Upvotes: 0

Related Questions