Aykhan Hagverdili

Reputation: 29985

Why is this pointer null

In Visual Studio, it seems like pointers to member variables are represented as 32-bit signed integers behind the scenes (even in 64-bit mode), and the null pointer-to-member is -1 in that representation. So if I have a class like:

#include <iostream>
#include <climits>

struct Foo
{
    char arr1[INT_MAX];
    char arr2[INT_MAX];
    char ch1;
    char ch2;
};


int main()
{
    auto p = &Foo::ch2;
    std::cout << (p?"Not null":"null") << '\n';
}

It compiles, and prints "null". So, am I causing some kind of undefined behavior, or was the compiler supposed to reject this code and this is a bug in the compiler?

Edit:

It appears that I can keep the "2 INT_MAX arrays plus 2 chars" pattern, and only in that case the compiler allows me to add as many members as I wish, and the second character is always considered to be null (sketched below). See demo. If I change the pattern slightly (like 1 or 3 chars instead of 2 at some point), it complains that the class is too large.
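A sketch of one such repetition of the pattern (the member names arr3/arr4/ch3/ch4 are mine for illustration; per the behavior above, the second char of each block lands at an offset congruent to -1 mod 2^32):

#include <climits>
#include <iostream>

struct Foo
{
    char arr1[INT_MAX];
    char arr2[INT_MAX];
    char ch1;
    char ch2;
    char arr3[INT_MAX];
    char arr4[INT_MAX];
    char ch3;
    char ch4;
};

int main()
{
    auto p = &Foo::ch4;
    std::cout << (p ? "Not null" : "null") << '\n'; // also prints "null"
}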

Upvotes: 33

Views: 2042

Answers (4)

Ben Voigt

Reputation: 283733

It's clearly a collision between an optimization on pointer-to-member representation (use only 4 bytes of storage when no virtual bases are present) and the pigeonhole principle.

For a type X containing N subobjects of type char, there are N+1 possible valid pointers-to-member of type char X::*: one for each subobject, and one for the null pointer-to-member.

This works when there are at least N+1 distinct values in the pointer-to-member representation, which for a 4-byte representation implies that N+1 <= 2^32 and therefore the maximum object size is 2^32 - 1.

Unfortunately the compiler in question made the maximum object-type size (before it rejects the program) equal to 2^32, which is one too large and creates a pigeonhole problem: at least one pair of pointer-to-member values must be indistinguishable. It's not necessary that the null pointer-to-member be one half of this pair, but as you've observed, in this implementation it is.
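As a minimal sketch of the collision (purely illustrative; the encoding and names here are assumptions, not MSVC's actual internals), suppose a 4-byte representation that stores null as -1 and member offsets as 0 .. 2^32 - 2:

#include <cstdint>
#include <iostream>

// Hypothetical 4-byte pointer-to-member representation: -1 encodes null,
// member offsets occupy 0 .. 2^32 - 2. (Illustrative only.)
struct MemberPtr32
{
    std::int32_t repr;

    static MemberPtr32 from_offset(std::uint64_t offset)
    {
        // Truncation to 32 bits: offset 2^32 - 1 gets the bit pattern of -1,
        // so it becomes indistinguishable from null (the pigeonhole collision).
        return { static_cast<std::int32_t>(offset) };
    }
    bool is_null() const { return repr == -1; }
};

int main()
{
    // Offset of Foo::ch2 in the question: 2 * (2^31 - 1) + 1 = 2^32 - 1.
    std::uint64_t offset_of_ch2 = 2 * 2147483647ULL + 1;
    MemberPtr32 p = MemberPtr32::from_offset(offset_of_ch2);
    std::cout << (p.is_null() ? "null" : "Not null") << '\n'; // prints "null"
}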

Upvotes: 3

Swift - Friday Pie

Reputation: 14663

The expression &Foo::ch2 is of type char Foo::*, which is a pointer to member of class Foo. By the rules, a pointer to member converted to bool evaluates to false ONLY if it is a null member pointer, i.e. one that was initialized or assigned from nullptr.

The fault here appears to be an implementation flaw. On gcc targeting x86-64, for example, an assigned pointer to member should evaluate as non-null unless it was assigned nullptr, and yet with the following code:

#include <climits>   // LLONG_MAX, ULLONG_MAX
#include <cstddef>   // offsetof
#include <iostream>

struct foo
{
    char arr1[LLONG_MAX];
    char arr2[LLONG_MAX];
    char ch1;
    char ch2;
};

int main()
{
    char  foo::* p1 = &foo::ch1;
    char  foo::* p2 = &foo::ch2;
    std::cout << (p1?"Not null ":"null ") << '\n';
    std::cout << (p2?"Not null ":"null ") << '\n';

    // Note: LLONG_MAX + LLONG_MAX is signed overflow (UB); gcc wraps to -2 here.
    std::cout << LLONG_MAX + LLONG_MAX << '\n';
    std::cout << ULLONG_MAX << '\n';
    std::cout << offsetof(foo, ch1) << '\n';
}

Output:

Not null 
null 
-2
18446744073709551615
18446744073709551614

Likely it's related to the fact that the class size exceeds the platform's limits, leading to the member offsets wrapping around (past the internal value used for nullptr). The compiler doesn't detect it because it itself falls victim to signed integer overflow; it's the programmer's fault for causing UB inside the compiler by using signed values as array sizes: LLONG_MAX + LLONG_MAX = -2 would be the "size" of the two arrays combined.

Essentially, the size of the first two members is computed as negative, and the offset of ch1 is -2, represented as the unsigned value 18446744073709551614. Since -2 is not the null value, the pointer is not null. Another compiler might clamp the value to 0, producing a null pointer, or actually detect the problem, as clang does.
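The wraparound can be reproduced without the signed-overflow UB by doing the arithmetic in unsigned 64-bit (a minimal sketch, not the compiler's actual code):

#include <climits>
#include <cstdint>
#include <iostream>

int main()
{
    // Combined size of the two arrays, computed modulo 2^64:
    std::uint64_t combined = static_cast<std::uint64_t>(LLONG_MAX)
                           + static_cast<std::uint64_t>(LLONG_MAX);
    std::cout << combined << '\n';                         // 18446744073709551614
    std::cout << static_cast<long long>(combined) << '\n'; // -2 (same bit pattern)
}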

If the offset of ch1 is -2, is the offset of ch2 then -1? Let's add this:

// View the unsigned offsets as signed values:
std::cout << static_cast<long long>(offsetof(foo, ch1)) << '\n';
std::cout << static_cast<long long>(offsetof(foo, ch2)) << '\n';

Additional output:

-2
-1

The offset of the first member is obviously 0, so if pointers to member are represented as offsets, another value is needed to represent nullptr. It's logical to assume that this particular compiler considers only -1 to be the null value, which may or may not be the case for other implementations.
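A quick check with a normally-sized class (no assumptions beyond standard behavior) confirms that a member at offset 0 must still compare different from null:

#include <iostream>

struct S { char first; };

int main()
{
    char S::* p_first = &S::first; // offset 0 in a typical layout
    char S::* p_null  = nullptr;
    std::cout << (p_first ? "Not null" : "null") << '\n'; // Not null
    std::cout << (p_null  ? "Not null" : "null") << '\n'; // null
    std::cout << (p_first == p_null) << '\n';             // 0
}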

Upvotes: 2

computerquip

Reputation: 143

The size limit of an object is implementation-defined, per Annex B of the standard [1]. Your struct is of an absurd size.

If the struct is:

struct Foo
{
    char arr1[INT_MAX];
    //char arr2[INT_MAX];
    char ch1;
    char ch2;
};

... the size of your struct in a relatively recent version of 64-bit MSVC appears to be around 2147483649 bytes. If you then add in arr2, suddenly sizeof will tell you that Foo is of size 1.

The C++ standard (Annex B) states that the compiler must document its limitations, which MSVC does [2]: it states that it follows the recommended limits. Annex B, Section 2.17 provides a recommended minimum limit of 262144 bytes for the size of an object. While it's clear that MSVC can handle more than that, it only documents that it meets that minimum recommendation, so I'd assume you should take care when your object size is more than that.
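If you do need a large object, one way to catch a silently wrapped sizeof at compile time is a sketch like this (using the single-array variant from above):

#include <climits>

struct Foo
{
    char arr1[INT_MAX];
    char ch1;
    char ch2;
};

// The whole must be at least as large as the sum of its parts. With both
// INT_MAX arrays present, MSVC's reported sizeof(Foo) wraps and this fires.
static_assert(sizeof(Foo) >= sizeof(char[INT_MAX]) + 2,
              "sizeof(Foo) wrapped around");

int main() {}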

[1] http://eel.is/c++draft/implimits

[2] https://learn.microsoft.com/en-us/cpp/cpp/compiler-limits?view=vs-2019

Upvotes: 8

Barrnet Chou

Reputation: 1933

When I test the code, VS reports: 'Foo': the class is too large.

When I add char arr3[INT_MAX], Visual Studio reports error C2089: 'Foo': 'struct' too large. Microsoft Docs explains it as: "The specified structure or union exceeds the 4GB limit."
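For reference, this is the variant described above that triggers the diagnostic (illustrative; it is expected to fail to compile on MSVC):

#include <climits>

struct Foo
{
    char arr1[INT_MAX];
    char arr2[INT_MAX];
    char arr3[INT_MAX]; // pushes the total past the 4GB limit: MSVC emits C2089
    char ch1;
    char ch2;
};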

Upvotes: 0
