Parobay
Parobay

Reputation: 2619

C++ casting to base and "overwriting" vptr issue

I was just reading a new C++ challenge: http://blogs.msdn.com/b/vcblog/archive/2014/02/04/challenge-vulnerable-code.aspx

The supplied code if full of issues, some obvious to anybody with good programming habits, some visible only to C++ natives :-)

It is described in the comments, that one particular line (37) is particularly dangerous:

ImageFactory::DebugPrintDimensions((ImageFactory::CustomImage*)image);

the function then calls a virtual method of CustomImage (defined first time in CustomImage).

This allegedly causes the first member of CustomImage to be treated as the vptr of the instance (it's an unique_ptr actually) and make the binary pointed by it treated as executable (perhaps malicious) code..

While I can understand this, I wonder why does this really work.

CustomImage is a virtual class, so (probably) its first 4 bytes (just assume X86) are THE vptr, and the unique_ptr member is next.. And since the cast doesn't seem to be shifting anything...

... how would it be possible to execute data held by unique_ptr?

Upvotes: 2

Views: 480

Answers (2)

Mike Seymour
Mike Seymour

Reputation: 254581

Presumably, the memory layout is such that the vptr comes before the base sub-object, like this:

class CustomImage {
    void * __vptr;
    Image  __base;  // empty
    unique_ptr<whatever> evil;
};

This means that a valid conversion from Image* to CustomImage* requires subtracting a few bytes from the pointer. However, the evil cast you've posted comes before the class definitions, so it doesn't know how to correctly adjust the pointer. Instead, it acts like reinterpret_cast, simply pretending that the pointer points to CustomImage without adjusting its value.

Now, since the base class is empty, the pointer inside unique_ptr will be misinterpreted as the vptr. This points to another pointer, which will be misinterpreted as the vtable's pointer to the first virtual member function. This in turn points to data loaded from the file, which will be executed as code when that virtual function is called. As the icing on the cake, the memory protection flags are loaded from the file, and not adjusted to prevent execution.

Some lessons here are:

  • Avoid C-style casts, especially on pointer or reference types. They fall back to reinterpret_cast, leading to a minefield of undefined behaviour, if the conversion isn't valid. (To compound the evil, their syntax is also ungreppable, and easy to miss if you don't read the code too carefully.)
  • Avoid non-polymorphic base classes. As well as being conceptually dubious, and making deletion more awkward and error-prone, it can do surprising things to the memory layout as we see here. If the base class were polymorphic, we could use dynamic_cast (or avoid the cast altogether by providing suitable virtual functions), with no possibility of an invalid conversion.
  • Avoid unnecessary levels of indirection - there's no particular need for m_imageData to be a pointer.
  • Never put user data in executable memory.

Upvotes: 1

Tristan Brindle
Tristan Brindle

Reputation: 16824

My take (and I'm more than happy to be corrected):

Here, CustomImage is a polymorphic class (with a vptr as the first "member" under the Windows ABI), but Imageis not. The order of the definitions means that the ImageFactory functions know that CustomImage is an Image, but main() does not.

So when the factory does:

Image* ImageFactory::LoadFromDisk(const char* imageName)
{
    return new (std::nothrow) CustomImage(imageName);
}

the CustomImage* pointer is converted to an Image* and everything is fine. Because Image is not polymorphic, the pointer is adjusted to point to the (POD) Image instance inside the CustomImage -- but crutially, this is after the vptr, because that always comes first in a polymorphic class in the MS ABI (I assume).

However, when we get to

ImageFactory::DebugPrintDimensions((ImageFactory::CustomImage*)image);

the compiler sees a C-style cast from one class it knows nothing about to another. All it does is take the address it has and pretend it's a CustomImage* instead. This address in fact points to the Image within the custom image; but because Image is empty, and presumably the empty base class optimisation is in effect, it ends up pointing to the first member within CustomImage, i.e. the unique_ptr.

Now, ImageFactory::DebugPrintDimensions() assumes it's been handed a pointer to a fully-complete CustomImage, so that the address is equal to the address of the vptr. But it hasn't -- it's been handed the address of the unique_ptr, because at the point at which is was called, the compiler didn't know any better. So now it dereferences what it thinks is the vptr (really data we're in control of), looks for the offset of the virtual function and blindly exectutes that -- and now we're in trouble.

There are a couple of things that could have helped mitigate this. Firstly, since we're manipulating a derived class via a base class pointer, Image should have a virtual destructor. This would have made Image polymorphic, and in all probability we wouldn't have had a problem (and we wouldn't leak memory, either).

Secondly, because we're casting from base-to-derived, dynamic_cast should have been used rather than the C style cast, which would have involved a run-time check and correct pointer adjustment.

Lastly, if the compiler had all the information to hand when compiling main(), it might have been able to warn us (or performed the cast correctly, adjusting for the polymorphic nature of CustomImage). So moving the class definitions above main() is recommended too.

Upvotes: 1

Related Questions