Reimplement 3rd party non-virtual function

I'm wondering whether there is a way to reimplement a non-virtual function foo in a 3rd party class Base.

The motivation is that I just need to append a callback to foo. However, the function is called from many places in Base, and since it is not virtual, I would have to either change Base in place or dramatically rewrite the implementation in the class that derives from Base. I would like to avoid both scenarios.

I do not need polymorphism at all: the derived class will have just one instance and its type will be known at compile time (so e.g. CRTP instead of virtual dispatch would suffice too).

I tried a class that inherits from an auxiliary class declaring foo as virtual, but with no luck. Here is an example, where bar stands for any place in Base's implementation from which foo could be called:

#include <iostream>
using namespace std;

/// Ideally, do not modify the `Base` at all
struct Base {
    void foo()
    {
        cout << "Base::foo" << endl;
    }

    void bar()
    {
        cout << "bar" << endl;
        foo();  //< foo is not virtual !
    }
};

struct Virtual {
    virtual void foo() = 0;
};

struct Virtual_base : Virtual, Base {
    void foo() override = 0;  //< it still does not affect Base !
};

struct Virtual_derived : Virtual_base {
    void foo() override
    {
        cout << "Virtual_derived::foo" << endl;
        Base::foo();
    }
};

Well, Virtual_derived::foo does override Virtual::foo, but Base::bar, unsurprisingly, still calls Base::foo. I also tried the CRTP approach, but with no luck either, as it runs into the very same issue: Base stays encapsulated.

I'm a little afraid that the answer is that it is simply impossible. Is it?

Upvotes: 7

Views: 196

Answers (2)

Anonymous1847

Reputation: 2598

On Windows you can use hotpatching: https://jpassing.com/2011/05/03/windows-hotpatching-a-walkthrough/ .

Compile with /hotpatch and link with /FUNCTIONPADMIN. The compiler option makes every function start with a two-byte no-op instruction, and the linker option leaves padding in front of each function (6 bytes, 5 on 32-bit), allowing you to patch in a redirection. What you want to do is overwrite the two-byte no-op at the beginning with a short jump back into the padding, which in turn jumps to your callback wrapper; the wrapper calls your callback and then jumps back into the function proper. To implement it, add this to a C++ source file:

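#include <windows.h>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Make the byte range writable (and keep it executable) so it can be patched.
bool pages_allow_all_access(void* range_begin, std::size_t range_size) {
    DWORD old_settings;
    return VirtualProtect(
        range_begin,
        range_size,
        PAGE_EXECUTE_READWRITE,
        &old_settings
    ) != 0;
}

bool patch(void* patch_func, void* callback_wrapper) {
    auto* patch_func_bytes = static_cast<unsigned char*>(patch_func);
    auto* callback_wrapper_bytes = static_cast<unsigned char*>(callback_wrapper);

    // 6 padding bytes before the function plus the 2-byte no-op at its start
    if (!pages_allow_all_access(patch_func_bytes - 6, 8))
        return false;

    // nop (probably won't be executed)
    patch_func_bytes[-6] = 0x90;
    // jmp rel32 to callback_wrapper; the displacement is relative to the end
    // of this instruction, which is patch_func itself. The wrapper must lie
    // within +/-2 GiB of the function for the displacement to fit.
    patch_func_bytes[-5] = 0xE9;
    const auto rel32
        = static_cast<std::int32_t>(callback_wrapper_bytes - patch_func_bytes);
    std::memcpy(patch_func_bytes - 4, &rel32, sizeof rel32);
    // jmp short to patch_func - 5 (the jmp rel32 above): rel8 = -7 = 0xF9,
    // counted from the end of this two-byte instruction. Written last so the
    // entry point only redirects once the long jump is in place.
    patch_func_bytes[0] = 0xEB;
    patch_func_bytes[1] = 0xF9;

    FlushInstructionCache(GetCurrentProcess(), patch_func_bytes - 6, 8);
    return true;
}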
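
To make the displacement arithmetic easier to check, here is a minimal, platform-independent sketch that writes the same eight bytes into an ordinary buffer instead of live code. The helper encode_hotpatch and the addresses passed to it are hypothetical and exist only for this illustration; nothing is actually patched.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of the answer's code): encodes the hotpatch
// bytes into a plain buffer so the offsets can be inspected.
// buf[0..5] models the 6 padding bytes before the function,
// buf[6..7] models the two-byte no-op at the function entry.
std::array<std::uint8_t, 8> encode_hotpatch(std::int64_t func_addr,
                                            std::int64_t wrapper_addr)
{
    std::array<std::uint8_t, 8> buf{};
    buf[0] = 0x90;  // nop at func - 6 (never reached)
    buf[1] = 0xE9;  // jmp rel32 at func - 5
    // rel32 is relative to the end of the 5-byte jmp, which is func itself.
    const auto rel32 = static_cast<std::int32_t>(wrapper_addr - func_addr);
    std::memcpy(&buf[2], &rel32, sizeof rel32);
    buf[6] = 0xEB;  // jmp rel8 at func
    // rel8 is relative to the end of the 2-byte jmp (func + 2); the target is
    // func - 5, so the displacement is -7, i.e. 0xF9 as an unsigned byte.
    buf[7] = static_cast<std::uint8_t>(-7);
    return buf;
}
```

For example, encode_hotpatch(0x401000, 0x402000) yields a short jump byte pair EB F9 at the entry and a long jump whose displacement is 0x1000.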

The callback wrapper might need to be defined in an assembly file:

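; 32-bit x86. pushad/pushfd do not exist in 64-bit mode; there, save the
; volatile registers one by one, use pushfq, and use QWORD PTR operands.
callback_wrapper:
    ; save registers and flags
    pushad
    pushfd
    call DWORD PTR [callback]
    popfd
    popad
    jmp DWORD PTR [after_trampoline]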

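The symbols callback and after_trampoline should be defined in a C++ file at global scope; giving them C linkage keeps their names unmangled for the assembly file. Here callback_func is your callback and patch_func stands for the function being hooked. Note that for a non-static member function such as Base::foo, a pointer to member cannot be converted to a plain address in standard C++, so obtaining its address takes compiler-specific steps.

extern "C" {
    void* callback = reinterpret_cast<void*>(&callback_func);
    // resume right after the two-byte jump at the function's entry
    void* after_trampoline = reinterpret_cast<char*>(&patch_func) + 2;
}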

Then call patch at the top of main or some other suitable initialization time and you're set.

Also, you may have to allow write permissions on the memory pages you're modifying (the ones that patch_func is in) using a VirtualProtect call: https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-virtualprotect . EDIT: I've added this code to the example above.

I may add ways of doing this on Linux or other Unixy systems later.

When the function has no convenient run of NOPs, hooking becomes more difficult, particularly on x86: instructions vary widely in length, so it is hard to determine programmatically where an instruction ends and where execution can resume. @ajm suggests this library for Linux & OSX: https://github.com/kubo/funchook . Personally, when hotpatching is not available, I usually use a debugger to find a sequence of instructions in the patch target, at least 9 bytes long, that I can replace. In the program I then overwrite those instructions with a jump to an absolute 64-bit address, using a technique similar to the one above, and execute the replaced instructions near the end of the callback wrapper. Avoid replacing call or jmp instructions, as these are often relative to the instruction pointer, which has a different value in the callback wrapper than in the original function.

Upvotes: 3

To sum up, it depends on the requirements of one's project.

In standard C++, it is impossible. Period.
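
A minimal sketch of why, assuming a simplified Base like the one in the question (the Derived class, the trace buffer, and the helper call_bar_on_derived are made up for this illustration): a call to a non-virtual member function is bound at compile time, so a foo declared in a derived class only hides Base::foo and is never reached from Base::bar.

```cpp
#include <cassert>
#include <sstream>
#include <string>

std::ostringstream hide_trace;  // collects output so the calls can be inspected

struct Base {
    void foo() { hide_trace << "Base::foo\n"; }
    void bar() { foo(); }  // bound to Base::foo at compile time
};

struct Derived : Base {
    // Hides Base::foo; it does not override it, because foo is not virtual.
    void foo()
    {
        hide_trace << "Derived::foo\n";
        Base::foo();
    }
};

// Calls bar() on a Derived object and returns what was traced.
std::string call_bar_on_derived()
{
    hide_trace.str("");
    Derived d;
    d.bar();
    return hide_trace.str();
}
```

Calling bar through a Derived object records only Base::foo; Derived::foo runs only when it is called directly on a Derived.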

My question lacked detailed requirements, so the techniques proposed by @Anonymous1847 and @ajm (patching/hooking) do answer it properly. However, I am rather looking for a standard, portable, and stable solution. I do not know these techniques in detail, but hotpatching has a portability problem, and hooks, since they depend on disassembly, strike me as fragile rather than generally reliable, mainly with respect to platform- and compiler-specific behavior such as optimizations. I also wonder how these techniques would cope with the need to override foo again further down the hierarchy, in classes derived from Base's derived classes. I would also say that they significantly reduce the readability of the code. Nevertheless, they are possible solutions and can meet one's needs.

As for the option of "prepending" a class that Base would inherit from: I realized that even if it were somehow possible, there would be a serious problem with the destructor ~Base possibly being non-virtual (as in the example), which cannot be fixed from anywhere but within Base itself. (Off topic: while this would likely be a problem in C++, I suppose it would not be in languages where all functions are virtual. I still think that this "inverse of inheritance" could be useful in some cases, but I do not know of any language or basic concept that supports it. Does anyone know why?)

The solution I found most beneficial is the one @Ulrich suggested, even though it violates the constraint in my original question. Based on the arguments above, its pros IMO outweigh the cons: it suffices to make Base::foo virtual. The runtime overhead is unlikely to matter, as the call is likely to be devirtualized by the optimizer. However, since this is a 3rd party library, one has to keep one's own copy of it (e.g. a fork on GitLab/GitHub), apply some minor modifications (like adding virtual), and maintain the copy oneself. If one still wants to track the original library, it should not be difficult to reapply this little "patch" whenever the library is updated (I'm not too experienced with this). Of course, this only works as long as the original interface of Base does not change entirely.
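
A minimal sketch of what the forked class and its single derived class could look like. The trace buffer, the name Derived, the "callback" line, and the added virtual destructor are assumptions for illustration; marking Derived as final may help the optimizer devirtualize calls, but that is not guaranteed.

```cpp
#include <cassert>
#include <sstream>
#include <string>

std::ostringstream fork_trace;  // collects output so the call order can be inspected

// Stand-in for the forked 3rd party class; the change is the `virtual` keyword.
struct Base {
    virtual ~Base() = default;  // assumption: made virtual in the fork as well
    virtual void foo() { fork_trace << "Base::foo\n"; }

    void bar()
    {
        fork_trace << "bar\n";
        foo();  // now dispatched virtually
    }
};

struct Derived final : Base {
    void foo() override
    {
        Base::foo();
        fork_trace << "callback\n";  // hypothetical appended callback
    }
};

// Calls bar() on a Derived object and returns what was traced.
std::string run_bar()
{
    fork_trace.str("");
    Derived d;
    d.bar();
    return fork_trace.str();
}
```

With this change, every call to foo from inside Base reaches Derived::foo, which runs the original body first and the appended callback afterwards.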

Feel free to correct my judgements.

Upvotes: 0
