ChaoSXDemon

Reputation: 910

Overloading the [] operator with no array internally

I have seen the following code:

class SomeClass
{
public:
    int mA;
    int mB;
    int mC;
    int mD;

    int operator[] (const int index) const
    {
        return ((int*)(this))[index];
    }
};

How does this work? I know the this keyword is a pointer to the current object, but without properly knowing how many variables there are, how can we safely access [index] of the "this" pointer?

Upvotes: 2

Views: 95

Answers (2)

bames53

Reputation: 88155

This 'works' only by accident; it is undefined behavior. The memory location the compiler chooses for mB happens to be the same as &mA + 1, the address of mC is the same as &mA + 2 and &mB + 1, and so on.
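To see whether a given compiler really lays the members out back to back, the offsets can be inspected directly. The following is a minimal sketch; the helper name membersAreContiguous is made up for illustration, and the struct simply mirrors the class from the question. A true result only describes the layout on that compiler; it does not make the indexing through "this" well-defined.

```cpp
#include <cassert>
#include <cstddef>   // offsetof

// Mirrors the class from the question (struct, so every member is public).
struct SomeClass
{
    int mA;
    int mB;
    int mC;
    int mD;
};

// Hypothetical helper: true when the four members sit back to back with
// no padding, which is the layout the original operator[] relies on.
bool membersAreContiguous()
{
    return offsetof(SomeClass, mA) == 0 * sizeof(int)
        && offsetof(SomeClass, mB) == 1 * sizeof(int)
        && offsetof(SomeClass, mC) == 2 * sizeof(int)
        && offsetof(SomeClass, mD) == 3 * sizeof(int)
        && sizeof(SomeClass) == 4 * sizeof(int);
}
```

On mainstream compilers this is expected to return true, but that is an observation about those compilers rather than a guarantee from the standard.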

However, it is true that the class is 'standard layout' (all data members have the same access specifier and no inheritance is being used [n3337 § 9 p7]), which traditionally does produce this organization of the member variables. It's not guaranteed, though. A 'standard layout' class is still allowed to have some amount of padding between the members [n3337 § 9.2 p14], but I doubt there's any platform which actually does that for four ints.
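Whether a class is standard layout can be checked at compile time with std::is_standard_layout from <type_traits>. A small sketch follows; the two type names AllPublic and MixedAccess are invented for this example and are not part of the question.

```cpp
#include <cassert>
#include <type_traits>

// All data members share one access specifier and there is no inheritance.
struct AllPublic
{
    int mA;
    int mB;
};

// Data members with different access specifiers: not standard layout.
class MixedAccess
{
public:
    int mA;
private:
    int mB;
};

static_assert(std::is_standard_layout<AllPublic>::value,
              "AllPublic should be standard layout");
static_assert(!std::is_standard_layout<MixedAccess>::value,
              "MixedAccess should not be standard layout");
```

The trait reports only the standard-layout property; it says nothing about whether padding exists between members.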

The spec does say that it's legal to get the address one past an object, so calculating &mA + 1 is legal. [n3337 § 5.7 p5] And the spec does also say that a pointer with the correct type and value does point at an object regardless of how the pointer is calculated.

If an object of type T is located at an address A, a pointer of type cv T* whose value is the address A is said to point to that object, regardless of how the value was obtained. [Note: For instance, the address one past the end of an array (5.7) would be considered to point to an unrelated object of the array’s element type that might be located at that address.] [n3337 § 3.9.2 p3]

After reviewing §5.9 and §5.10 I think the following may be technically legal, though it could possibly have unspecified behavior:

if (&mA + 1 == &mB && &mB + 1 == &mC && &mC + 1 == &mD) {
    return ((int*)(this))[index];
}

Even the cast is legal, since the spec says a pointer to a standard-layout class object can be converted to a pointer to its first member. [n3337 § 9.2 p20]
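The part that is guaranteed, reaching the first member through a pointer to the object, can be sketched like this. The function name firstMember is hypothetical, and the struct again mirrors the class from the question. Only index 0 is covered by this guarantee; stepping past it is where the doubts discussed above begin.

```cpp
#include <cassert>

// Mirrors the class from the question.
struct SomeClass
{
    int mA;
    int mB;
    int mC;
    int mD;
};

// Hypothetical helper: reads the first member via a pointer converted from
// the object's address, which is allowed for a standard-layout class.
int firstMember(const SomeClass& s)
{
    const int* p = reinterpret_cast<const int*>(&s);
    return *p;   // same object as s.mA
}
```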

Upvotes: 0

VoidStar

Reputation: 5421

This type of thing is not recommended, because it is not guaranteed by the C++ standard. However, most compilers do explicitly define their memory layout behaviors (often on a per-architecture basis) and provide #pragmas for manipulating the packing behavior (such as #pragma pack in MSVC). If you understand/leverage these features, you can make it work on most given compilers/architectures. However, it will not be portable! For each new compiler, you'd need to re-test and adjust, a costly maintenance task. Generally, we prefer greater ease of portability.
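As an illustration of how packing pragmas change layout, here is a small sketch using #pragma pack(push, 1) and #pragma pack(pop), a compiler extension accepted by MSVC, GCC and Clang. The two struct names are invented for the example, and the exact size of the unpacked struct depends on the platform's alignment rules.

```cpp
#include <cassert>

// Default layout: the compiler may insert padding after 'tag' so that
// 'value' is suitably aligned.
struct DefaultLayout
{
    char tag;
    int  value;
};

// Packed layout: members are placed on 1-byte boundaries, so no padding.
// This pragma is a compiler extension, not part of the C++ standard.
#pragma pack(push, 1)
struct PackedLayout
{
    char tag;
    int  value;
};
#pragma pack(pop)
```

With packing set to 1, sizeof(PackedLayout) is expected to equal sizeof(char) + sizeof(int), while sizeof(DefaultLayout) is typically larger.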

If you really want to do this, you can add a static_assert to verify the compiler's behavior.

int operator[] (const int index) const
{
    static_assert(sizeof(SomeClass) == 4 * sizeof(mA), "Padding not supported");
    return ((int*)(this))[index];
}

Because the standard does not allow these members to be reordered, we can deduce that if sizeof(SomeClass) equals 4 * sizeof(int) (16 bytes with a 4-byte int), there is no padding and the members sit back to back, so this code will behave as expected on that compiler. With the assert, the build fails if somebody compiles with a compiler or setting that pads the class, instead of the code silently breaking.

However, we can be standards-compliant and achieve names for array slots. You might consider a pattern such as:

class SomeClass
{
    enum Index {
        indexA,
        indexB,
        indexC,
        indexD,

        indexCount  // number of slots; must stay last
    };

    int mData[indexCount];

public:
    int operator[] (const int index) const
    {
        return mData[index];
    }

    int& A() { return mData[indexA]; }
    int& B() { return mData[indexB]; }
    int& C() { return mData[indexC]; }
    int& D() { return mData[indexD]; }
};

This provides similar functionality, but is guaranteed by the C++ standard.
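If the named members mA through mD need to stay as plain data members, another option that avoids the cast is a table of pointers to members. This is a different technique from the array pattern above, sketched here under the assumption that an out-of-range index should throw; the table name kMembers and the bounds check are additions for illustration.

```cpp
#include <cassert>
#include <stdexcept>

class SomeClass
{
public:
    int mA;
    int mB;
    int mC;
    int mD;

    int operator[](int index) const
    {
        // Each entry names a member rather than assuming an address,
        // so the result does not depend on padding or layout.
        static int SomeClass::* const kMembers[] = {
            &SomeClass::mA, &SomeClass::mB, &SomeClass::mC, &SomeClass::mD
        };
        if (index < 0 || index >= 4)
            throw std::out_of_range("SomeClass index out of range");
        return this->*kMembers[index];
    }
};
```

For example, with SomeClass s{1, 2, 3, 4}, s[0] yields mA and s[3] yields mD, and s[4] throws std::out_of_range.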

Upvotes: 4
