user10288406

How to understand the source code of 'nullptr' in LLVM?

Recently, I wanted to understand how nullptr works. In http://www.stroustrup.com/N1488-nullptr.pdf , I found this code:

const // this is a const object...
class {
public:
  template<class T> // convertible to any type
    operator T*() const // of null non-member
    { return 0; } // pointer...
  template<class C, class T> // or any type of null
    operator T C::*() const // member pointer...
    { return 0; }
private:
  void operator&() const; // whose address can’t be taken
} nullptr = {};

Then I searched for the keyword nullptr in Woboq and found that the source code in LLVM is different from the above. I have copied it below.

struct _LIBCPP_TEMPLATE_VIS nullptr_t
{
    void* __lx;
    struct __nat {int __for_bool_;};
    _LIBCPP_ALWAYS_INLINE _LIBCPP_CONSTEXPR nullptr_t() : __lx(0) {}
    _LIBCPP_ALWAYS_INLINE _LIBCPP_CONSTEXPR nullptr_t(int __nat::*) : __lx(0) {}
    _LIBCPP_ALWAYS_INLINE _LIBCPP_CONSTEXPR operator int __nat::*() const {return 0;}
    template <class _Tp>
        _LIBCPP_ALWAYS_INLINE _LIBCPP_CONSTEXPR
        operator _Tp* () const {return 0;}
    template <class _Tp, class _Up>
        _LIBCPP_ALWAYS_INLINE
        operator _Tp _Up::* () const {return 0;}
    friend _LIBCPP_ALWAYS_INLINE _LIBCPP_CONSTEXPR bool operator==(nullptr_t, nullptr_t) {return true;}
    friend _LIBCPP_ALWAYS_INLINE _LIBCPP_CONSTEXPR bool operator!=(nullptr_t, nullptr_t) {return false;}
};
inline _LIBCPP_ALWAYS_INLINE _LIBCPP_CONSTEXPR nullptr_t __get_nullptr_t() {return nullptr_t(0);}
#define nullptr _VSTD::__get_nullptr_t()

The main difference is that it defines struct __nat and two functions:

_LIBCPP_ALWAYS_INLINE _LIBCPP_CONSTEXPR nullptr_t(int __nat::*) : __lx(0) {}
_LIBCPP_ALWAYS_INLINE _LIBCPP_CONSTEXPR operator int __nat::*() const {return 0;}

I thought about it for a long time, and I still don't understand why LLVM implemented it like this. Can someone give me any advice?

Upvotes: 11

Views: 701

Answers (1)

Quuxplusone

Reputation: 27334

This is an obsolete idiom dating back to Howard Hinnant's very first (public) commit of libc++. libc++ originally implemented support for class nullptr_t in C++98 (where nullptr isn't a built-in keyword and decltype(nullptr) isn't a built-in scalar type). That is, in C++98 we wanted this to work:

nullptr_t a = 0;  // should be OK

but not this:

nullptr_t b = 1;  // should error

We accomplish that by giving nullptr_t a converting constructor from a pointer type — and the more obscure, the better. E.g. this would work fine:

class nullptr_t {
public:
    nullptr_t(int*) {}  // accepts the literal 0, but also any int*
};

but this would be better, because the user is unlikely to create an int __nat::* (pointer-to-member) value "by accident":

class nullptr_t {
    struct __nat {};
public:
    nullptr_t(int __nat::*) {}  // accepts the literal 0
};
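As a rough sketch of the effect described above (the names fake_nullptr_t, nat, lx and zero_literal_converts are made up for illustration and this is not libc++'s actual code), the following shows the literal 0 converting while an arbitrary int does not. It assumes a C++11 compiler, but only so that static_assert and <type_traits> can check the claim:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in, not libc++'s actual class.
class fake_nullptr_t {
    struct nat {};                    // private, so users cannot name int nat::*
public:
    void* lx;                         // mirrors the __lx member shown in the question
    fake_nullptr_t(int nat::*) : lx(0) {}
};

// A non-literal int is not a null pointer constant, so it cannot convert.
static_assert(!std::is_convertible<int, fake_nullptr_t>::value,
              "arbitrary ints must not convert to fake_nullptr_t");

inline bool zero_literal_converts() {
    fake_nullptr_t a = 0;             // OK: 0 is a null pointer constant
    // fake_nullptr_t b = 1;          // would not compile
    assert(a.lx == 0);
    return a.lx == 0;
}
```

The constructor is only reachable through the null pointer constant because nobody outside the class can spell the type int nat::*.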

And, on the flip side, we would also like our fake class nullptr_t to be convertible to bool, but only explicitly! That is, we want to give it an explicit operator bool.

In C++98, however, the best we could do was to simulate an explicit operator bool. See, explicit in C++98 could be applied only to constructors; it wasn't until C++11 that you could apply it to a conversion function. This is important because bool is implicitly convertible to int, so a plain operator bool would let a nullptr_t convert silently to int as well (Godbolt):

std::nullptr_t p = 0;
bool b = p; // shouldn't work
int i = p;  // REALLY shouldn't work
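To make the problem concrete, here is a small sketch of what goes wrong with a plain, non-explicit operator bool. The type leaky_null and the function leak_to_int are hypothetical names, and the static_asserts assume C++11 purely for checking:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type with a plain (non-explicit) conversion to bool.
struct leaky_null {
    operator bool() const { return false; }
};

// Both copy-initializations compile, including the unwanted one to int.
static_assert(std::is_convertible<leaky_null, bool>::value, "bool b = p; compiles");
static_assert(std::is_convertible<leaky_null, int>::value, "int i = p; compiles too");

inline int leak_to_int() {
    leaky_null p;
    int i = p;        // leaky_null -> bool -> int, silently
    assert(i == 0);
    return i;
}
```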

To achieve the effect of explicit operator bool, we make class nullptr_t implicitly convertible to a pointer type — the more obscure, the better. Like this:

struct __nat {int __for_bool_;};
operator int __nat::*() const {return ???;}

Fill in the ??? with &__nat::__for_bool_ if you want the result to be truthy, or with a simple null pointer constant 0 if you want it to be falsey. For nullptr_t of course we just want it to be falsey.

But for unique_ptr and shared_ptr — which also used this idiom in libc++'s C++98 mode — we actually did need to sometimes return truthy; thus the __for_bool_ member was needed, for them. See this other SO question.
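A minimal sketch of the idiom with both truthy and falsey results might look like this. The class handle and its members are hypothetical and only loosely modelled on the pattern described here, not on libc++'s real unique_ptr; the static_assert assumes C++11:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical pointer wrapper using the pointer-to-member trick.
class handle {
    struct nat { int for_bool_; };
    int* p_;
public:
    explicit handle(int* p) : p_(p) {}
    // Non-null member pointer => truthy, null member pointer => falsey.
    operator int nat::*() const { return p_ ? &nat::for_bool_ : 0; }
};

// A pointer to member has no conversion to int, so this leak is closed.
static_assert(!std::is_convertible<handle, int>::value, "int i = h; is rejected");

inline bool truthiness_matches() {
    int x = 42;
    handle full(&x);
    handle empty(0);
    bool full_ok = full ? true : false;    // tested in a boolean context
    bool empty_ok = empty ? false : true;
    assert(full_ok && empty_ok);
    return full_ok && empty_ok;
}
```

Note that this idiom still permits bool b = h;, because a pointer to member converts implicitly to bool. What it blocks is the onward conversion to int.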

So the __for_bool_ member here is almost certainly unneeded — it was just copy-pasted in from the two other places that this version of __nat was used in old-school libc++.

(I can think of one possible reason it might have been needed: Maybe some old-school compiler might have given an annoying warning if you tried to form the type int T::* for a type T with no int data members. But I doubt it.)


As @geza commented above, this obsolete idiom actually had a name: the Safe Bool Idiom. It was obsoleted by C++11's introduction of explicit conversion functions.
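For comparison, here is a sketch of the C++11 replacement, using a hypothetical wrapper called handle11. With an explicit conversion function, no helper struct or pointer-to-member is needed:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical wrapper using a C++11 explicit conversion function.
class handle11 {
    int* p_;
public:
    explicit handle11(int* p) : p_(p) {}
    explicit operator bool() const { return p_ != nullptr; }
};

static_assert(!std::is_convertible<handle11, bool>::value, "bool b = h; is rejected");
static_assert(!std::is_convertible<handle11, int>::value, "int i = h; is rejected");
static_assert(std::is_constructible<bool, handle11>::value, "bool b(h); is allowed");

inline bool explicit_bool_works() {
    int x = 0;
    handle11 full(&x);
    handle11 empty(nullptr);
    // Conditions and operator! convert contextually, so these still compile.
    bool ok = (full ? true : false) && !empty;
    assert(ok);
    return ok;
}
```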

Upvotes: 1
