Ghetto Closures in Win32 with C++ III: Templates and Traits and Interfaces, oh my!

February 10th, 2009

Hey kids! It’s that time again.

Last time, we went over generating a tiny ASM thunk that could wrap a C++ method pointer (instance plus function) up in a non-method __stdcall function pointer, suitable for use as, say, a win32 WndProc.

Next, I’m going to talk about wrapping it all up in a convenient API. To do this, we’re going to need some template-fu.

Since I am using a modern compiler, I am going to see how close I can get to boost::function’s interface:

boost::function<void ()> fn = &someFunction;

I consider this interface to be pretty rad.

boost::function works with only a single template argument, so we could go that route too and pass the whole pointer-to-method type. Or we could accept that we’re doing something a bit different and add a second parameter for the class. That is, either:

Thunk<LRESULT (Window::*)(HWND hWnd, UINT msg, WPARAM wParam, LPARAM lParam)> wndProcThunk;

or

Thunk<Window, LRESULT (HWND hWnd, UINT msg, WPARAM wParam, LPARAM lParam)> wndProcThunk;

I picked the first one, mostly because it was the first thing that popped into my head. Here is my test harness:

#include <cstdio>
using std::printf;

struct I {
    virtual void print() = 0;
    virtual void printSum(double y) = 0;
};

struct C : I {
    C(int x)
        : x(x)
    { }

    void print() {
        printf("My x is %i\n", x);
    }

    void printSum(double y) {
        printf("%i + %f = %f\n", x, y, x + y);
    }

    int x;
};

int main() {
    C instance(4);

    Thunk<void (I::*)()> thunk(&instance, &I::print);
    printf("\n");
    thunk.get()();

    Thunk<void (I::*)(double)> thunk2(&instance, &I::printSum);
    printf("\n");
    thunk2.get()(3.14);
}

First, I extracted the part of the code that actually constructs and cleans up the generated code. Everything else I’ll outline will just be scaffolding so that the interface is prettier.

template <typename D, typename S>
D really_reinterpret_cast(S s) {
    char __static_assert_that_types_have_same_size[sizeof(S) == sizeof(D)];

    union {
        S s;
        D d;
    } u;

    u.s = s;
    return u.d;
}

template <typename C, typename M>
void* createThunk(C* instance, M method) {
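    // unsigned char avoids truncating byte values above 0x7F
    unsigned char code[] = {
        0xB9, 0, 0, 0, 0,   // mov ecx, imm32  (patched with instance)
        0xB8, 0, 0, 0, 0,   // mov eax, imm32  (patched with method address)
        0xFF, 0xE0          // jmp eax
    };

    // YEEHAW
    *((C**)(code + 1)) = instance;
    *((void**)(code + 6)) = really_reinterpret_cast<void*>(method);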

    void* thunk = VirtualAlloc(0, sizeof(code), MEM_COMMIT, PAGE_EXECUTE_READWRITE);
    memcpy(thunk, &code, sizeof(code));
    FlushInstructionCache(GetCurrentProcess(), thunk, sizeof(code));

    return thunk;
}

void releaseThunk(void* thunk) {
    VirtualFree(thunk, 0, MEM_RELEASE);
}

To be honest, this interface isn’t all that bad: all you have to do is remember to manage the lifetime of the generated code and cast the void* you get to the right type. That’s kind of boring, though, so let’s instead see if we can make something kickass and typesafe.

I like objects, so let’s start with one of those.

template <typename M>
struct Thunk {
    typedef M Method;
    typedef typename methodptr_traits<M>::class_type Class;
    typedef typename methodptr_traits<M>::function_type Function;

    Thunk(Class* instance, Method method)
        : ptr(createThunk(instance, method))
    { }

    ~Thunk() {
        releaseThunk(ptr);
    }

    Function get() const {
        return reinterpret_cast<Function>(ptr);
    }

private:
    Thunk(const Thunk&);

    void* ptr;
};

This is simple enough to be boring, except for this methodptr_traits thing.

methodptr_traits is an instance of something called a traits class. Basically, it is a fancy template type that defines various other types. You can think of it as a way to code ad-hoc, compile-time type introspection.

If you’ve never used templates this way, the implementation is pretty intimidating:

template <typename F>
struct methodptr_traits;

template <typename ReturnType, typename T>
struct methodptr_traits<ReturnType (T::*)()> {
    typedef T class_type;
    typedef ReturnType (__stdcall *function_type)();
};
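To see what that specialization actually hands back, here is a rough, portable sketch. It drops __stdcall (an MSVC keyword) so it builds on other compilers, and the names same_type, Widget and WidgetIdTraits are made up for illustration, not part of the original code:

```cpp
// Portable sketch: __stdcall is omitted so this compiles outside MSVC.
template <typename F>
struct methodptr_traits;   // primary template: declared, never defined

// Matches any pointer to a zero-argument member function and
// pulls its pieces apart.
template <typename ReturnType, typename T>
struct methodptr_traits<ReturnType (T::*)()> {
    typedef T class_type;
    typedef ReturnType (*function_type)();
};

// Hypothetical helper: value is true only when both types are identical.
template <typename A, typename B>
struct same_type { static const bool value = false; };
template <typename A>
struct same_type<A, A> { static const bool value = true; };

// Hypothetical class, used only so there is a method pointer to inspect.
struct Widget {
    int id();
};

// class_type is Widget, function_type is int (*)().
typedef methodptr_traits<int (Widget::*)()> WidgetIdTraits;
```

The compiler pattern-matches int (Widget::*)() against ReturnType (T::*)(), deduces ReturnType = int and T = Widget, and the typedefs fall out from there.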

This is one of the more convoluted things one can do with templates, and I’ll be the first to admit that I think it’s a bit scary. Let’s rewind a bit and look at this in simpler terms. Say we want a boolean constant that’s true if a particular template type is an integer type. We can use template specialization to accomplish this pretty easily:

template <typename T> struct is_int { enum {value = false}; };
template <> struct is_int<int> { enum {value = true}; };
template <> struct is_int<short> { enum {value = true}; };
template <> struct is_int<char> { enum {value = true}; };
// and so on

I might leverage this code with something like the following:

if (is_int<T>::value) { /* stuff */ }

and be off to the races.
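As a small, hedged illustration of that pattern, here is a sketch where describe and demo_describe are hypothetical names I made up. Note that with a plain if, both branches still have to compile for every T:

```cpp
#include <cassert>
#include <string>

template <typename T> struct is_int { enum { value = false }; };
template <> struct is_int<int>   { enum { value = true }; };
template <> struct is_int<short> { enum { value = true }; };
template <> struct is_int<char>  { enum { value = true }; };

// Hypothetical helper: reports which branch was taken for T.
template <typename T>
std::string describe(const T&) {
    if (is_int<T>::value) {
        return "integer";
    }
    return "something else";
}

// Usage: only the types that were specialized above count as integers.
void demo_describe() {
    assert(describe(42) == "integer");
    assert(describe('c') == "integer");
    assert(describe(3.14) == "something else");
    assert(describe(42L) == "something else");  // long was never specialized
}
```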

Cool, right? Now what if we instead wanted to know whether something is a std::vector, whatever the element type? The same principle applies, but now we have a template specialization that is itself a template:

template <typename T> struct is_vector { enum {value=false}; };
template <typename E> struct is_vector<std::vector<E> > { enum {value=true}; };

How to use this should be obvious:

if (is_vector<T>::value) { /* do something that only works on std::vector */ }
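One catch: a plain if still compiles both branches, so code that only works on std::vector will not build when T is, say, int. A common workaround is to pick an overload at compile time instead. This is a sketch of that idea, with bool_tag, element_count_impl, element_count and demo_element_count all being hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename T> struct is_vector { enum { value = false }; };
template <typename E> struct is_vector<std::vector<E> > { enum { value = true }; };

// Hypothetical tag type: bool_tag<true> and bool_tag<false> are distinct
// types, so they can select between overloads.
template <bool B> struct bool_tag {};

// Only instantiated for vectors, so calling size() is safe here.
template <typename T>
std::size_t element_count_impl(const T& v, bool_tag<true>) { return v.size(); }

// Everything else counts as a single element.
template <typename T>
std::size_t element_count_impl(const T&, bool_tag<false>) { return 1; }

template <typename T>
std::size_t element_count(const T& value) {
    return element_count_impl(value, bool_tag<(is_vector<T>::value != 0)>());
}

void demo_element_count() {
    std::vector<int> v(3);
    assert(element_count(v) == 3);
    assert(element_count(7) == 1);
}
```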

methodptr_traits is just a tiny jump further. Most of the terror that this sort of thing inspires is really the fault of C++’s ridiculous function pointer syntax.

It is kind of a drag that C++ templates cannot express functions without specifying exactly how many arguments the function has. Because of this, a new specialization must be written for each argument count you want to support. boost tends to support up to 10 arguments by default, which is good enough for almost everyone.
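For what it’s worth, compilers that implement C++11 variadic templates (Visual Studio 2008 is not one of them) lift this restriction. The following is only a sketch under that assumption, with __stdcall dropped for portability and Greeter being a hypothetical class:

```cpp
#include <type_traits>

template <typename F>
struct methodptr_traits;

// One specialization covers every argument count (C++11 parameter pack).
template <typename ReturnType, typename T, typename... Args>
struct methodptr_traits<ReturnType (T::*)(Args...)> {
    typedef T class_type;
    typedef ReturnType (*function_type)(Args...);
};

// Hypothetical class, used only to have some method pointers to inspect.
struct Greeter {
    void hello();
    double scale(int, double);
};

// Zero arguments and two arguments both match the same specialization.
typedef methodptr_traits<void (Greeter::*)()>::function_type HelloFn;
typedef methodptr_traits<decltype(&Greeter::scale)>::function_type ScaleFn;
```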

For this example, I only need 0 and 1 argument, so here’s the specialization for a one-argument function:

template <typename ReturnType, typename T, typename Arg1>
struct methodptr_traits<ReturnType (T::*)(Arg1)> {
    typedef T class_type;
    typedef ReturnType (__stdcall *function_type)(Arg1);
};

And that’s it! With a single templatized class, we can dynamically generate an assembly thunk that works as a perfectly usable __stdcall function. We can pass this function pointer on to Win32, GLU, or whatever other C library we might need a callback function for.

Source code


Ghetto Closures in Win32 with C++ II: __thiscall

February 9th, 2009

So, after some poking around, I have discovered that some guy has figured this out already. Rad! I’m going to go over it quickly anyway so that I can build on it for tomorrow.

First, the setup/demo part changes a bit:

struct I {
    virtual void print() = 0;
};

struct C : I {
    C(int x)
        : x(x)
    { }

    void print() {
        printf("My x is %i\n", x);
    }

    int x;
};

typedef void (__stdcall *Function0)();

It turns out that __thiscall and __stdcall are not all that different: arguments still go on the stack, and the callee still cleans them up. All you need to do is stuff your ‘this’ pointer in ECX, and you’re set. Our satanic little blob of assembly changes to:

__asm {
    mov ecx, my_this        // B9 xx xx xx xx
    mov eax, real_func      // B8 yy yy yy yy
    jmp eax                 // FF E0
}

As before, we just have to create a block of memory, plug the code and pointers into it, and execute it:

// YEEHAW
*((I**)(code + 1)) = instance;
*((void**)(code + 6)) = really_reinterpret_cast<void*>(&I::print);

void* buffer = VirtualAlloc(0, sizeof(code), MEM_COMMIT, PAGE_EXECUTE_READWRITE);
memcpy(buffer, &code, sizeof(code));
FlushInstructionCache(GetCurrentProcess(), buffer, sizeof(code));

Function0 f0 = reinterpret_cast<Function0>(buffer);

really_reinterpret_cast is a hack because we are doing Very Bad Things. The C++ standard says that you cannot convert pointer-to-methods to any other kind of pointer (not even void*!). I was unable to dig up the exact reasoning, but I think it is because the C++ implementation is allowed to make pointer-to-members have any format it wants. They don’t even have to be the same size as a normal pointer.

But that’s boring. :D I doubt Microsoft will change how this works any time soon now that existing code depends on it, and there is potential here to do something that is very useful.

It turns out that, for a simple class like this one, Microsoft’s implementation makes a pointer-to-member-function either point directly at the code (like a plain function pointer) or, if the method is virtual, point at a small thunk that will go to the right place. In other words, we can just jump to it and we will be jumping to the right place.
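On the earlier point that pointers-to-members need not be the size of a normal pointer: you can check what your own compiler does. This is just a probe, with Single, BaseA, BaseB and Multi being hypothetical classes. As far as I know, MSVC typically keeps the single-inheritance case pointer-sized and grows it for multiple inheritance, while GCC and Clang use a larger representation throughout, so treat the output as compiler-specific:

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical classes: one with no bases, one with two bases.
struct Single { void f(); };
struct BaseA { int a; };
struct BaseB { int b; };
struct Multi : BaseA, BaseB { void f(); };

const std::size_t data_ptr_size   = sizeof(void*);
const std::size_t single_mfp_size = sizeof(void (Single::*)());
const std::size_t multi_mfp_size  = sizeof(void (Multi::*)());

// Prints the three sizes; the numbers depend on compiler and platform.
void print_pointer_sizes() {
    std::printf("void*: %u, Single::*: %u, Multi::*: %u\n",
                static_cast<unsigned>(data_ptr_size),
                static_cast<unsigned>(single_mfp_size),
                static_cast<unsigned>(multi_mfp_size));
}
```

If the method pointer turns out bigger than void*, the size check in the cast below refuses to compile, which is exactly what you want.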

Here’s my evil, standard-subverting cast:

template <typename D, typename S>
D really_reinterpret_cast(S s) {
    char __static_assert_that_types_have_same_size[sizeof(S) == sizeof(D)];

    union {
        S s;
        D d;
    } u;

    u.s = s;
    return u.d;
}

That weird looking char array is an ad-hoc compile-time assertion. A C array of length 0 is illegal, so if sizeof(S) != sizeof(D), then the compile will fail.
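If your compiler speaks C++11 (mine does not), static_assert says the same thing with a readable error message. Here is a sketch of that variant under a made-up name, bit_copy_cast. It copies the bytes with memcpy instead of going through a union, and the demo uses a float and an integer rather than a method pointer so that it runs anywhere with IEEE 754 floats:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical C++11 variant of really_reinterpret_cast.
template <typename D, typename S>
D bit_copy_cast(S s) {
    static_assert(sizeof(S) == sizeof(D), "source and destination sizes differ");
    D d;
    std::memcpy(&d, &s, sizeof(d));  // copy the raw bytes across
    return d;
}

void demo_bit_copy_cast() {
    // 0x3F800000 is the IEEE 754 single-precision bit pattern for 1.0f.
    assert(bit_copy_cast<float>(static_cast<std::uint32_t>(0x3F800000u)) == 1.0f);
    assert(bit_copy_cast<std::uint32_t>(1.0f) == 0x3F800000u);
}
```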

And, as before, the sweet thrill of victory:

Function0 f0 = reinterpret_cast<Function0>(buffer);
f0();

Source code


Ghetto closures in win32 with C++

February 8th, 2009

One of the more annoying gaps in C++ is that there is no sort of “method reference” concept in the language. However, with just a bit of ingenuity and some knowledge of assembly and particular bits of win32, we can dynamically generate a bit of shim code that adds an extra argument into a function call.

The particular example I am going to go through only works for stdcall functions, which limits its usefulness somewhat, but it is still very useful when creating callbacks for particular Win32 functions. The first time I needed it, for instance, was to subclass a window for which I did not have the source code. (That window was already using GWL_USERDATA to store its own ‘this’ pointer, so I needed to find another place for mine, you see.)

For reference, I am using Visual Studio 2008 as my C++ compiler. This code might violently explode if you try to use it on GCC, but you never know! You might get lucky!

First, a bit about how stdcall works. I am leaving out some details, but the important bits boil down to just a few simple things:

  1. Function arguments are pushed onto the stack in reverse order,
  2. The callee (the function you call) is on the hook for popping those arguments off the stack when it is through, and
  3. The return value lives in EAX, or EDX:EAX if it is 64 bits, or ST0 if it’s a floating-point value, or someplace in memory if it is, say, an object. Wherever the return value goes, it is not on the stack.

Given all of these things, it is possible to craft a bit of assembly code on the fly that pops off the return address, pushes our “extra” argument onto the stack, making it the “new” first parameter that will go to our real function, pushes the return address back, then jumps to the real function.
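The shuffle described above can be modeled without any assembly at all. This toy sketch treats a std::vector as the stack, with back() as the top. Stack, apply_shim and demo_apply_shim are hypothetical names, and nothing here touches real machine state:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of the x86 stack: back() is the top of the stack.
typedef std::vector<std::uint32_t> Stack;

// Mimics the shim's three stack instructions.
void apply_shim(Stack& stack, std::uint32_t extra_argument) {
    std::uint32_t return_address = stack.back();  // pop eax
    stack.pop_back();
    stack.push_back(extra_argument);              // push <extra argument>
    stack.push_back(return_address);              // push eax
    // The real shim would now jmp to the target function.
}

void demo_apply_shim() {
    Stack stack;
    stack.push_back(0xAAAAu);       // an argument the caller already pushed
    stack.push_back(0x00401000u);   // return address pushed by the call
    apply_shim(stack, 31337u);
    // The extra value now sits right under the return address, which is
    // where a stdcall function expects its first parameter.
    assert(stack.size() == 3);
    assert(stack[0] == 0xAAAAu);
    assert(stack[1] == 31337u);
    assert(stack[2] == 0x00401000u);
}
```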

Thus armed, we are now ready to get ourselves into trouble. First, the setup!

#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <cstdio>
using std::printf;

typedef void (__stdcall *Function1)(int x);
typedef void (__stdcall *Function0)();

void __stdcall the_func(int x) {
    printf("Hello!  x = %i\n", x);
}

int main() {
}

Next, we need to get the bytes of the code we want to generate. I couldn’t think of a clever way to do this automatically, so I just added a do-nothing function to my code and used Visual Studio’s disassembler view to tell me how the bytes fell out:

void nothing() {
    __asm {
        pop eax                 // 58
        push 0x12345678         // 68 78 56 34 12
        push eax                // 50
        mov eax, 0x87654321     // B8 21 43 65 87
        jmp eax                 // FF E0
    }
}

You could also try doing it by hand with the Intel manual, but that is a lot more work.

That was totally tedious. But that’s ok! Now we just have to stuff it into a block of memory, and tell Windows that we are okay if EIP somehow finds its way there:

unsigned char code[] = {
    0x58,               // pop eax
    0x68, 0, 0, 0, 0,   // push 0
    0x50,               // push eax
    0xB8, 0, 0, 0, 0,   // mov eax, 0
    0xFF, 0xE0          // jmp eax
};

// YEEHAW
*((int*)(code + 2)) = 31337;
*((Function1*)(code + 8)) = the_func;

void* buffer = VirtualAlloc(0, sizeof(code), MEM_COMMIT, PAGE_EXECUTE_READWRITE);
memcpy(buffer, &code, sizeof(code));
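If the pointer-cast stores above make you nervous, the same bytes can be assembled one at a time. This sketch only builds the byte sequence and never executes it, so it runs on any platform. put_le32, build_shim and demo_build_shim are hypothetical names, and the addresses are treated as plain 32-bit numbers:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: writes a 32-bit value in little-endian byte order,
// which is how x86 encodes immediates.
void put_le32(unsigned char* out, std::uint32_t value) {
    out[0] = static_cast<unsigned char>(value & 0xFF);
    out[1] = static_cast<unsigned char>((value >> 8) & 0xFF);
    out[2] = static_cast<unsigned char>((value >> 16) & 0xFF);
    out[3] = static_cast<unsigned char>((value >> 24) & 0xFF);
}

// Builds the 14-byte shim. Nothing is executed here.
std::vector<unsigned char> build_shim(std::uint32_t extra_argument,
                                      std::uint32_t target_address) {
    static const unsigned char code_template[] = {
        0x58,               // pop eax
        0x68, 0, 0, 0, 0,   // push imm32     (patched below)
        0x50,               // push eax
        0xB8, 0, 0, 0, 0,   // mov eax, imm32 (patched below)
        0xFF, 0xE0          // jmp eax
    };
    std::vector<unsigned char> code(code_template,
                                    code_template + sizeof(code_template));
    put_le32(&code[2], extra_argument);   // immediate of the push
    put_le32(&code[8], target_address);   // immediate of the mov
    return code;
}

void demo_build_shim() {
    // Same placeholder values as the disassembly listing earlier.
    std::vector<unsigned char> code = build_shim(0x12345678u, 0x87654321u);
    assert(code.size() == 14);
    assert(code[2] == 0x78 && code[3] == 0x56 && code[4] == 0x34 && code[5] == 0x12);
    assert(code[8] == 0x21 && code[9] == 0x43 && code[10] == 0x65 && code[11] == 0x87);
}
```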

Also, we need to let the CPU know that we have just done Crazy Things with executable code:

FlushInstructionCache(GetCurrentProcess(), buffer, sizeof(code));

Now we whip out our big fat reinterpret_cast, and savor the thrill of the impossible:

Function0 f0 = reinterpret_cast<Function0>(buffer);
f0();

To let go of this memory once we are through, we use VirtualFree():

VirtualFree(buffer, 0, MEM_RELEASE);

Next time, I will either demonstrate the same technique for the __thiscall calling convention, or go into awful details about abusing C++ templates to build a sweet interface. I have yet to decide which, so it will be a surprise!!!

Also, I think I found a bug in Mass Effect
