Moving Link Through Binary Instrumentation
“I want to be inside your darkest everything.”
— Frida Kahlo, ‘The Diary of Frida Kahlo’
Focusing on a tool sometimes gets a bad rap. There’s always this faint aura of incompetence present. In the previous article I’ve already talked about believing in strong fundamentals, which means learning what’s actually going on instead of learning how to use a tool someone else has written. In the end, though, it’s all about the mindset. If we gain some knowledge and understanding while also learning how to use the new tool, I think it’s perfectly fine to spend some quality time with it.
And spending some quality time we will! This is quite the lengthy article that highlights my journey more than anything. I’ve tried to be as explicit as possible, but there might still be things I simply assume. The truth is, I have no idea who (if anyone) reads this article, so it might not strike the perfect balance. If there are any questions, please holla at me.
With that out of the way, let’s get to know today’s star: Frida, the world-class dynamic instrumentation toolkit.
Introduction to Frida
Frida is a “dynamic instrumentation toolkit for developers, reverse-engineers, and security researchers” created by Ole André V. Ravnås. What does that mean, though? It means Frida lets us inject snippets of JavaScript into a running process, where they can observe and change what the program does.
Frida comes with a ton of features and different APIs, which are all neatly documented over at the project’s website. Let’s quickly set the stage for what features we’re actually using, in case someone is already familiar with Frida.
We’re mainly dealing with the Interceptor and NativeFunction APIs and some other functionality to make those work. With regards to our mode of operation, we’re only using injected today.
Now let’s begin our journey!
Last time we used LD_PRELOAD to hook into shared library functions. In order to get to know Frida, let’s simply try to mimic this behavior.
Why not start with a variation of one of the first programs I’ve ever copied [1]:
A simple guess the number game, that’s it. We seed the random number generator with the current time in line 10 and create a new value on every loop iteration in line 13. Let’s compile the program and give it a try.
Finally! That endless loop made me really anxious, though. We could have interrupted the process by pressing CTRL + C, but what about that nagging feeling of utter defeat in the back of our heads? So let’s escape the loop by winning EVERY. SINGLE. TIME!
We have a few options. We could overwrite time() to always return a predefined value, which makes the random number generation predictable. But that would be a very roundabout way of cheating. Let’s just make rand() return whatever we please!
It’s time to meet the real star of this article: Frida’s Interceptor API. Reading the documentation, we immediately see two interesting methods: Interceptor.attach() and Interceptor.replace(). While the latter is more akin to our first experiments with LD_PRELOAD, we’re still going to start with the former. It’s a lot flashier, that’s why!
Interceptor.attach() enables us to hook into an arbitrary function of the target process, meaning we can read and modify the state in the context of said process. The documentation gives us the function’s signature:
Interceptor.attach(target, callbacks[, data])
So in order to make this work, we first need to provide the target, which is a NativePointer to the function we want to intercept. There are a couple of different ways to get hold of such a pointer. We want to intercept rand(), a standard library function, so we simply get its address from libc itself. Because we look the address up at runtime inside the already running process, there’s no need to worry about ASLR.
The next argument is callbacks, an object that can hold two functions, onEnter and onLeave. The names should be pretty self-explanatory. The former gets executed before the actual function runs (which lets us modify the arguments), while the latter gets executed just before the function returns (which lets us modify the return value). Let’s look at a simple example script:
In line 5 we defined onEnter(). Even though its body is empty, we still pay a performance penalty. That’s because Frida sets up a bunch of things in the background. So in this case, we would be better off just removing it. What we actually use is onLeave(), where we change the return value in line 12.
Using Interceptor.attach() this way is kind of redundant, because all we do is return 0 every time. It’s still a lot flashier and we will look at more elaborate examples later on.
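To make the onEnter / onLeave idea a bit more tangible, here is a minimal pure-Python model of it. This is not how Frida implements hooking (Frida rewrites machine code, and its onLeave changes the result via retval.replace() rather than by returning a value); the attach() helper and the rand() stand-in are made up for this illustration.

```python
import random


def attach(target, on_enter=None, on_leave=None):
    """Wrap `target` so the callbacks run around every call (illustration only)."""
    def hooked(*args):
        args = list(args)
        if on_enter is not None:
            on_enter(args)             # may inspect or modify the arguments in place
        retval = target(*args)         # the original function still runs
        if on_leave is not None:
            retval = on_leave(retval)  # may swap out the return value
        return retval
    return hooked


def rand():
    """Stand-in for libc's rand()."""
    return random.randrange(2**31)


# Hook rand() and force every result to 0. We skip on_enter entirely,
# since an empty callback would only add overhead.
rand = attach(rand, on_leave=lambda retval: 0)
print(rand())
```

The original function still executes on every call; only its result is thrown away, which is exactly why this use of a hook is a bit wasteful.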
So now that we have our script ready, we can politely ask Frida to inject it for us. They say a picture is worth a thousand words. So what about a thousand pictures?
We won! Let’s quickly recap what happened: On the bottom we start our guessing game and interact with it. Afterwards we start Frida on the top by providing the process name and a script to load via the -l flag. This loads the above script into our guess process. So now every time our program calls rand(), the script’s code gets executed. In this case we simply return 0, which means an input of 1 will let us win [2] EVERY. SINGLE. TIME!
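Why does a forced return value of 0 translate into a winning guess of 1? The game derives its secret number as (rand() % 10) + 1, so a quick Python sketch of that arithmetic settles it. The function name secret_number is just a label for this example.

```python
def secret_number(rand_value):
    """Map a raw rand() result onto the game's 1..10 range."""
    return (rand_value % 10) + 1


# With rand() pinned to 0, the secret number is always 1.
print(secret_number(0))

# Whatever rand() returns, the result stays within 1..10.
print(sorted({secret_number(value) for value in range(1000)}))
```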
But let’s take a step back. What actually happened here? We won’t cover all the gory details, because the creators already did. Instead we’ll cautiously peek behind the curtain.
We’ll use our trusty debugger, the GEF-enhanced GDB to gain more insights.
Now that we’re attached, let’s have a look at rand() before the instrumentation is applied:
rand() itself calls random()? Who would have thought?
Now let’s inspect what happens after we inject our script:
That jmp instruction certainly wasn’t there before! But what’s happening after we jump on this trampoline?
Alright! Lines 6 and 7 are the ones that were replaced by the trampoline. The jump in the last line leads back to the correct point in the original function (which happens to be the random() call). This leaves us with the two magical jumps at line 3 and line 5 respectively.
This concludes our very narrow introduction to Frida. Next we’re going to look at something more practical…
Video games seem like a natural fit for things one wants to manipulate. It’s already magical to press a button and things happen on screen, but what if we could have n-levels of indirection and abstraction to practically achieve the same result?
So while searching for a lightweight “real world” use case (not your typical SSL Pinning Bypass [5]), I thought about instrumenting a Game Boy emulator in order to enable a Twitch Plays-like interaction.
I picked SameBoy as the emulator, because it seems like a mature project and just works like a charm. Let’s go ahead and grab a copy of the source code so that we can grep around in it, shall we?
But before we fly through the codebase, let’s take a step back and loosely define a scope so that we don’t get lost. I think for now I’m content with remotely controlling the emulator. That means interactions with the internals of the “Game Boy” itself are out. Don’t worry, there is PyBoy which comes equipped to handle such needs [6].
How do we find those interesting places to hook into with Frida? Well, we simply reconnoiter [7] the codebase. There is an assumption we can safely make: An emulator, just like a video game itself, must have some sort of endless loop. In there input gets taken, states get updated and frames get rendered.
We could take a top-down approach by manually tracing the execution flow starting from main() until we find something that handles key presses. Or we could grep for things like “input”, “press”, “release” and “key” and go from there.
But do you know what’s even scarier than the endless loop of our guess the number game? Approaching a large [8] C codebase by diving straight into its heart! So don’t mind me slowly starting from the outset: main(). But where is it?
And here I am, thinking every program has one main() function. Because I was told so! Let’s see what we’re dealing with:
Weird flex but okay.
Apparently there are seven different mains. But the one file that sticks out is SDL/main.c. Like most, I know what SDL is. But I’ve never used it. We’ll have to learn as we go. We’re fine.
Alright, we have our entry point, so let us begin.
To be clear, the following exploration is no detective work. We’re simply cruising around in order to get inspired about possible hooks.
The start of main() sets up and configures a ton of things like the window, while eventually calling run(). Down there begins the actual emulation with the initialization of a struct representing the Game Boy’s state. If we ever wanted to mess with the internals of the Game Boy, it would definitely involve this object, as a pointer to it is passed around the whole codebase. But at the moment we’re only interested in the inputs, so let’s continue.
At the end of run() we finally have our endless loop:
Immediately there’s the handle_events() function that sticks out. This specific call site is locked behind some conditions, but using our editor of choice’s jump to references reveals a couple more. If we look inside, we get yet another loop:
Wow, that’s a pretty big switch statement. The real interesting thing here is SDL_PollEvent(), though. I think it’s time to read some documentation.
Being a good citizen, I first started using the man-pages. However, the ones for SDL (at least on my system) are from 2001! [9] Of course I didn’t realize this immediately. So let’s spare ourselves the pain and simply use the web and the codebase itself. Go ahead, grab it!
The SDL project has a nice wiki where we find all the information about SDL_PollEvent(). Its signature is int SDL_PollEvent(SDL_Event* event). If there’s an event in the queue, it will get stored in the event struct and the function returns 1. This explains the condition of the above while loop. It simply drains the queue until there are no events left. Further down the wiki page we can even see some example code that does this exact thing.
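The drain-the-queue pattern is easy to model outside of SDL. The following Python sketch mimics the contract described above with a plain deque: poll_event() fills the caller’s event and returns 1, or returns 0 when nothing is left. The names and the dict-based “event” are stand-ins of my own, not SDL’s API.

```python
from collections import deque

SDL_KEYDOWN = 0x300
SDL_KEYUP = 0x301

event_queue = deque()


def poll_event(event):
    """Mimic SDL_PollEvent(): fill `event` and return 1, or return 0 if the queue is empty."""
    if not event_queue:
        return 0
    event.clear()
    event.update(event_queue.popleft())
    return 1


def handle_events():
    """Drain the queue, just like the while loop around SDL_PollEvent()."""
    handled = []
    event = {}
    while poll_event(event):
        handled.append(event["type"])
    return handled


event_queue.extend([{"type": SDL_KEYDOWN}, {"type": SDL_KEYUP}])
print(handle_events())
```

Anything that lands in the queue before the next call to handle_events() gets processed as if it were real input, which is worth keeping in mind for later.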
Now that we have a basic understanding of what’s happening, let’s take a closer look at the switch statement. What functions get called if the event corresponds to a key press?
Well, there’s a really scary case that handles SDL_KEYDOWN events that eventually falls through into the case that handles SDL_KEYUP events. So if all else fails, we end up in that case, where we see this:
Alright, so we loop through every key, see if its scancode matches a configured one [10] and finally call GB_set_key_state() in line 421. As mentioned before, we could end up here with either an SDL_KEYDOWN, or an SDL_KEYUP event. That’s what the third argument is for: Is the key pressed or released?
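Stripped of everything else, the logic described here fits in a few lines. This Python sketch only models the idea; the button list, the configured scancodes and set_key_state() are hypothetical stand-ins and not SameBoy’s actual code or defaults.

```python
SDL_KEYDOWN = 0x300
SDL_KEYUP = 0x301

# Hypothetical configuration: one SDL scancode per emulated button.
BUTTONS = ["right", "left", "up", "down"]
configured_scancodes = [79, 80, 82, 81]


def set_key_state(state, button, pressed):
    """Stand-in for GB_set_key_state(): remember whether a button is held."""
    state[button] = pressed


def handle_key_event(state, event_type, scancode):
    # Loop through every button and compare against the configured scancode.
    for index, button in enumerate(BUTTONS):
        if scancode == configured_scancodes[index]:
            # The third argument answers: is the key pressed or released?
            set_key_state(state, button, event_type == SDL_KEYDOWN)


state = {}
handle_key_event(state, SDL_KEYDOWN, 80)
print(state)
```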
Looking at it now, it’s pretty basic stuff. But having never really taken the time to read more C than your typical one-page examples, it was certainly a little adventure.
Now that we have discovered GB_set_key_state(), there’s a new fork in the road: We could simply use Frida to get hold of a pointer to said function. The following example needs debug symbols, so we follow the project’s build instructions to get a non-stripped version. Afterwards we can verify that it worked:
Yay, a pointer! This pointer could be used with Frida’s NativeFunction API, which would allow us to call GB_set_key_state() directly.
While true, there’s one major downside: Threading!
The first time I watched Ole’s talks [11] about instrumenting Quake, I didn’t quite understand why he wouldn’t simply call “the shoot function”. Instead, he opted to hold back until just the right time. Why go the extra mile, though?
It turns out that Frida runs in its own thread in the target process. Well, of course it does. If we now start calling functions from this thread, we could disturb the work of other threads. Maybe we operate on some old state, or maybe the thing crashes. I honestly don’t have enough experience with threaded code to be certain.
But how do we make sure that we let the right thread do the work? Well, we pick a function that’s likely a good target for being called by the right thread and do our work in there. That scenario screams for our trusty Interceptor.attach(), doesn’t it?
It does! Ole did exactly that. So to recap:
- Frida runs in its own thread
- randomly calling things from it may cause some headaches
- we somehow need to remember what to do / call
- we use Interceptor.attach() to intercept a function that’s likely executed by an appropriate thread
- we do our work in there, which means the right thread does it
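The recap above boils down to a small hand-over pattern: one thread writes down what should happen, another thread picks it up at a safe point. Here is a self-contained Python sketch of that idea using the standard library only. Everything in it (poll_event, hooked_poll_event, press_left) is a made-up stand-in; it shows the pattern, not Frida’s or Ole’s actual code.

```python
import queue
import threading

pending = queue.Queue()  # work handed over by the "injector" thread
executed_on = []         # names of the threads that actually did the work


def poll_event():
    """Stand-in for a function the game thread calls on every loop iteration."""
    return 0


def hooked_poll_event():
    # Runs on whatever thread calls it: first drain the pending work,
    # then carry on with the original function.
    while True:
        try:
            action = pending.get_nowait()
        except queue.Empty:
            break
        action()
    return poll_event()


def press_left():
    executed_on.append(threading.current_thread().name)


def game_loop(iterations):
    for _ in range(iterations):
        hooked_poll_event()


# The injector (here: the main thread) only enqueues; it never calls press_left() itself.
pending.put(press_left)
game = threading.Thread(target=game_loop, args=(3,), name="game-thread")
game.start()
game.join()
print(executed_on)
```

The queue is the “remember what to do” part, and the hooked function is the spot where the right thread gets to do it.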
Uff, that sounds like a lot of work. But hold on! Didn’t I talk about a fork? Well, what’s the other tine?
As we scrolled through the SDL wiki, we might have noticed the related functions section. Now SDL_PushEvent() sounds like a function we can relate to!
Its signature is int SDL_PushEvent(SDL_Event* event). The documentation states the following:
That’s a bingo! We don’t hook into SameBoy’s code, but go directly to the source: The SDL event system. Because we can do whatever we want!
We now enter serious Frida territory. So what’s our plan? Firstly, we need a way to call native functions from our script. I briefly mentioned the NativeFunction API, so let’s see what we can do with it:
Voila! Well, I cannot be bothered to make another screen cast for this little demo, so you simply have to believe me. Or, you know, give it a try yourself.
In line 13, we create a new NativeFunction by providing an address (in the form of a NativePointer, obtainable because puts is a libc export), the return type and an array containing the arguments’ types [12]. I specifically picked puts(), because its signature is so simple. But if you have higher demands like passing structs or classes by value (instead of just a pointer) or dealing with variadic functions, Frida has still got you covered!
In line 14 we let Frida do the allocation of our string and use the pointer to it as an argument for the puts wrapper in the next line. And then? The string gets printed into the guess output. Amazing, right? No no, I mean it!
This is only the beginning, though. Frida helped us with allocating the string, but SDL_PushEvent() needs a pointer to an SDL_Event. Well, what is it exactly? The documentation states:
Alright, it’s a union of all possible event structs. This means a variable with the SDL_Event type can hold any event structure. Its size is therefore equal to the size of the biggest event structure. Take for example the SDL_KeyboardEvent, this time directly from the SDL codebase:
Because an SDL_Event could be any event, we need a way to differentiate between them. That’s what the type member is for. We saw it earlier in the handle_events() function, where it’s used as the basis of the humongous switch statement.
One more note before we continue: Different event types can share the same event structure. Let’s look at an example, taken from the wiki.
|Event Type|Event Structure|
|SDL_KEYDOWN, SDL_KEYUP|SDL_KeyboardEvent|
Cool! So we built a bit of background knowledge. What do we do with it? Brag? No, we put it into practice!
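Before we do, the “union plus type tag” idea deserves a tiny demonstration. Python’s ctypes module can model C unions, so the sketch below builds a deliberately simplified toy union. The member structs here are invented for the example and much smaller than SDL’s real ones; only the mechanics carry over.

```python
import ctypes


class QuitEvent(ctypes.Structure):
    _fields_ = [("type", ctypes.c_uint32), ("timestamp", ctypes.c_uint32)]


class KeyboardEvent(ctypes.Structure):
    _fields_ = [
        ("type", ctypes.c_uint32),
        ("timestamp", ctypes.c_uint32),
        ("scancode", ctypes.c_int32),
    ]


class Event(ctypes.Union):
    # Every member starts at offset 0, so they all overlap in memory.
    _fields_ = [
        ("type", ctypes.c_uint32),
        ("quit", QuitEvent),
        ("key", KeyboardEvent),
    ]


# The union is as large as its biggest member.
print(ctypes.sizeof(QuitEvent), ctypes.sizeof(KeyboardEvent), ctypes.sizeof(Event))

# Writing the type through one member makes it visible through the shared tag.
event = Event()
event.key.type = 0x300
event.key.scancode = 80
print(hex(event.type))
```

Because every member begins with the same type field, reading that field first is always safe, no matter which event is actually stored.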
Remember that we want to call SDL_PushEvent() from within our Frida script. As an argument, we need a pointer to an SDL_Event.
That’s all great, but where do we actually point to? We used Memory.allocUtf8String() before, but there’s also the more generic Memory.alloc(). It takes the size as an argument and returns a pointer to the heap. The underlying memory gets freed once all handles to it are gone, so we have to keep that in mind.
Alright, we have a plan:
- allocate some memory via Memory.alloc()
- craft an SDL_KeyboardEvent structure in that location
- call SDL_PushEvent() with the pointer to our crafted event
- cross fingers
I’m trying to remotely move Link in an emulator, but I feel like I’m planning a sandbox escape. You’ve got to start somewhere, right?
Let’s craft the struct. But what’s the actual size of it? The definition from above gives us the sizes of all members, except for SDL_Keysym. We can find its definition in the SDL headers as well:
Let’s continue with its members. SDL_Scancode has its own file, SDL_scancode.h. In there all available scancodes are part of an enum, with the maximal possible value of 512. Apparently enum constants are ints, but the compiler is free to pick a different underlying integer type for the enum itself. Let’s write a quick sanity check:
Size of scancode: 4
int it is!
SDL_Keycode also has its own file (SDL_keycode.h) with an enum defining all the possible key codes. We’ll skip over this and simply assume a size of 4 bytes, too.
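With the member sizes settled, we can let ctypes do the layout arithmetic for us. The field lists below are written down from memory of the SDL2 headers, so treat them as an assumption and compare them with the SDL_events.h and SDL_keyboard.h of the SDL version you are actually targeting.

```python
import ctypes


class SDL_Keysym(ctypes.Structure):
    # Assumed SDL2 layout; verify against your headers.
    _fields_ = [
        ("scancode", ctypes.c_int32),  # SDL_Scancode enum, 4 bytes as measured
        ("sym", ctypes.c_int32),       # SDL_Keycode, assumed to be 4 bytes
        ("mod", ctypes.c_uint16),
        ("unused", ctypes.c_uint32),
    ]


class SDL_KeyboardEvent(ctypes.Structure):
    # Assumed SDL2 layout; verify against your headers.
    _fields_ = [
        ("type", ctypes.c_uint32),
        ("timestamp", ctypes.c_uint32),
        ("windowID", ctypes.c_uint32),
        ("state", ctypes.c_uint8),
        ("repeat", ctypes.c_uint8),
        ("padding2", ctypes.c_uint8),
        ("padding3", ctypes.c_uint8),
        ("keysym", SDL_Keysym),
    ]


print("sizeof(SDL_Keysym):", ctypes.sizeof(SDL_Keysym))
print("sizeof(SDL_KeyboardEvent):", ctypes.sizeof(SDL_KeyboardEvent))
for name in ("type", "timestamp", "windowID", "state", "repeat", "keysym"):
    print(name, "at offset", getattr(SDL_KeyboardEvent, name).offset)
```

ctypes applies the platform’s native alignment rules, so the two padding bytes the compiler inserts after mod show up here without us having to count them by hand.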
Armed with all that knowledge, we could go ahead and try to craft an SDL_KEYDOWN event. Let’s back up for a moment, though.
Up to this point, we did the analysis statically. But wouldn’t it be interesting to get some runtime information? We’re using Frida, a world-class dynamic instrumentation toolkit, after all!
We’re going to use Interceptor.attach() to hook into SDL_PollEvent(), check whether it actually polled an event and, if it did, try to parse it:
There’s a lot to discuss here. As usual, we get the address of our target function in order to use it as an argument for Interceptor.attach(). It gets interesting in line 25, though. SDL_PollEvent() takes a pointer to an empty SDL_Event struct in order to populate it with values. We can get hold of said pointer with args[0], because it’s the first (and only) parameter.
But as I said, the structure is uninitialized! The function has to do its thing before we see any results. So the onLeave() callback is the right place to inspect the populated struct, right? Well, almost!
There’s a problem: We only have access to the return value. The function, however, doesn’t return the struct itself, but a status code indicating if an event was taken from the queue. But Frida has got our back, again!
As we can see in line 25, we are able to store arbitrary data via the this keyword [13]. Let’s go ahead and save our pointer, so that we can check it again after SDL_PollEvent() populated the fields. We only want to take action if an event was taken from the queue, so we check if the return value is not 0 in line 30.
Furthermore we’re only interested in keyboard events, so we check the type field that all event structs have in common. Where can we find that information? SDL_events.h contains the event type definitions:
So we’re looking for types 0x300 and 0x301. Now let’s actually parse the events. Our script’s parseEvent() function uses the base pointer of the struct, adds data-type-appropriate offsets and uses the NativePointer.read*() functions to get the values. Those functions will automatically deal with endianness [14]. It’s not pretty, but we’re still in our discovery phase.
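The “base pointer plus offset” approach translates almost one-to-one to Python’s struct module, which makes for a handy offline playground. The sketch below parses a raw buffer instead of live process memory. The offsets (type at 0, state at 12, repeat at 13, scancode at 16) assume the SDL2 keyboard event layout and should be double-checked against the headers; parse_event is my own name, not the function from the script.

```python
import struct

SDL_KEYDOWN = 0x300
SDL_KEYUP = 0x301


def parse_event(buf):
    """Return the interesting fields of a keyboard event, or None for other events."""
    # "=" means native byte order, similar to how the read*() helpers behave.
    (event_type,) = struct.unpack_from("=I", buf, 0)
    if event_type not in (SDL_KEYDOWN, SDL_KEYUP):
        return None
    state, repeat = struct.unpack_from("=BB", buf, 12)
    (scancode,) = struct.unpack_from("=i", buf, 16)
    return {"type": event_type, "state": state, "repeat": repeat, "scancode": scancode}


# A hand-made "left arrow pressed" event for testing the parser.
sample = struct.pack("=IIIBBBBiiHxxI", SDL_KEYDOWN, 0, 0, 1, 0, 0, 0, 80, 0, 0, 0)
print(parse_event(sample))
```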
Does this work?
Yes it does! We could verify by checking the scancodes in SDL_scancode.h. But you know me. I already did.
Get a move on
Now it’s time to get moving! To make our life easier, let’s have a quick look at the raw bytes of a left press. We can use NativePointer.readByteArray() to get the contents and simply print them with console.log(). With a length of 30, we get the following result:
We’re looking at a press in line 2 (00 03) and a release in line 6 (01 03) of the left arrow key (00 00 00 50). The left arrow is defined as SDL_SCANCODE_LEFT = 80, so we’re good [15].
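If juggling byte order in your head feels error-prone, Python can double-check the numbers. This little sketch decodes the four SDL_KEYUP bytes as they sit in memory on a little-endian machine and confirms that scancode 80 is 0x50.

```python
# The SDL_KEYUP type field as stored in memory on a little-endian machine.
keyup_bytes = bytes.fromhex("01 03 00 00")
print(hex(int.from_bytes(keyup_bytes, "little")))

# Scancode 80 is 0x50; as a 4-byte little-endian value the 0x50 byte comes first.
print(hex(80))
print((80).to_bytes(4, "little").hex(" "))
```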
Before we build some fancy abstractions, let’s quickly verify that we’re onto something with our approach. We’ll keep it down-to-earth by taking the two event structs from above and pushing them to SDL_PushEvent():
Can we get away with it?
No Bill, what are you doing? You’re in a Game Boy game from 1991, not in some convention breaking indie game! You can’t go left, there’s nothing there! You always go right, stupid!
Anyway, we got away with it. The attentive reader may have noticed that we also reused the timestamps. It simply didn’t matter. In fact, we could just zero out the bytes and we’d still be fine. Why, you might ask? Well, we get our answer in
Okay, okay, I’m going to stop with the code analysis. It’s just so much fun to get to the bottom of things. Maybe not rock bottom…
Now we’re almost done here. Let’s create a more fleshed out final implementation of our idea. We want to be able to remotely “press” any of the relevant buttons.
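As a warm-up for that, here is how the byte-level crafting of a press and release pair could look in isolation. It is a Python sketch that only builds the buffers; actually pushing them requires the in-process call to SDL_PushEvent(). The layout assumes SDL2’s keyboard event, the timestamp and key code stay zeroed (on the assumption that the emulator only looks at the scancode), and the button table is a hypothetical example rather than SameBoy’s configuration.

```python
import struct

SDL_KEYDOWN, SDL_KEYUP = 0x300, 0x301
SDL_PRESSED, SDL_RELEASED = 1, 0

# Hypothetical button table using SDL's arrow key scancodes.
SCANCODES = {"right": 79, "left": 80, "down": 81, "up": 82}


def craft_key_event(event_type, scancode):
    """Build the raw bytes of a keyboard event (assumed 32-byte SDL2 layout)."""
    state = SDL_PRESSED if event_type == SDL_KEYDOWN else SDL_RELEASED
    return struct.pack(
        "=IIIBBBBiiHxxI",
        event_type, 0, 0,  # type, timestamp (zeroed), windowID
        state, 0, 0, 0,    # state, repeat, padding2, padding3
        scancode, 0, 0,    # keysym.scancode, keysym.sym (zeroed), keysym.mod
        0,                 # keysym.unused
    )


def press(button):
    """Return the (key down, key up) buffers for one button press."""
    scancode = SCANCODES[button]
    return craft_key_event(SDL_KEYDOWN, scancode), craft_key_event(SDL_KEYUP, scancode)


down, up = press("left")
print(down.hex(" "))
print(up.hex(" "))
```

Holding a button for a while is then just a matter of waiting between pushing the first and the second buffer.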
We need a bit more ceremony to make it work. The following example uses TypeScript (as suggested by the Frida docs). This comes with a lot of benefits like code completion, inline docs and - who would have guessed - type checking! But don’t worry, we just define a couple of types to make things more readable. Nothing too fancy!
The code is commented, but there are a few things that I want to highlight. That I did highlight!
In line 62 we define the startServer method inside the rpc.exports object. Every method of said object gets exposed to the outside, meaning we can consume them from all the available Frida bindings. We’ll get there in a second. But first let’s take that second and appreciate the fact that we are able to start an HTTP server inside our target process! It just never gets old…
We’re not doing it just for the lulz, though. I’m really, really tired and don’t want to implement the whole “Twitch Plays” concept end-to-end. The HTTP server is just a convenient way to get commands inside our agent in order to test the core functionality.
Because we also allow the duration of the button press to be specified, things like holding up the shield in Link’s Awakening are possible.
Onwards with the whole ceremony: We’re going to consume the exported function from the Python bindings, so let’s have a quick look at the script:
What’s happening? Well, in line 5 and 6 we read the contents of our compiled agent. Didn’t we write the agent in TypeScript, though?
Yes, we did. But Frida doesn’t deal with TS directly, so we had to compile it. There’s this (official) example project, which makes the process frictionless [16].
In line 13 we call the exported function, which internally starts the HTTP server. Notice that we originally called the method startServer (in camelCase). Frida is all about that sweet pythonicness, so it converted the name for us.
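The renaming itself is nothing magical. A rough approximation of such a camelCase to snake_case conversion fits in one regular expression. Note that this is my own sketch to illustrate the mapping, not the code the Python bindings use, and edge cases such as consecutive capitals may be handled differently there.

```python
import re


def to_snake_case(name):
    """Insert an underscore before every inner capital letter, then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


print(to_snake_case("startServer"))
```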
Did all that hard work pay off?
We actually did it. We moved Link. I’m almost a little moved myself.
Now that we’re done with the core functionality, let’s think about what’s missing before we could get a “Twitch Plays” session started.
First of all, we want to receive our commands from the chat, so we have to integrate with the Twitch API. We can have a look at the documentation to get an idea how that might work. Looks doable, doesn’t it?
Knowing that there’s a JavaScript client for it, we could drop Python in favor of the Node.js bindings. That way everything is neatly contained in one (hell of an) ecosystem.
Finally, there needs to be some form of throttling and filtering. Because trolls keep trolling!
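To give an idea of how little is needed for a first version, here is a minimal sketch of such a gatekeeper in Python. It accepts only known button names and enforces a per-user cooldown. The class, the command set and the cooldown value are all assumptions for illustration; a real chat integration would need more thought (and more paranoia).

```python
ALLOWED_COMMANDS = {"up", "down", "left", "right", "a", "b", "start", "select"}


class CommandFilter:
    """Drop unknown commands and rate-limit every user individually."""

    def __init__(self, cooldown_seconds=1.0):
        self.cooldown = cooldown_seconds
        self.last_accepted = {}  # user name -> time of the last accepted command

    def accept(self, user, message, now):
        """Return the normalized command, or None if it gets rejected."""
        command = message.strip().lower()
        if command not in ALLOWED_COMMANDS:
            return None  # filtering: not a button we know
        last = self.last_accepted.get(user)
        if last is not None and now - last < self.cooldown:
            return None  # throttling: this user has to wait
        self.last_accepted[user] = now
        return command


chat = CommandFilter(cooldown_seconds=1.0)
print(chat.accept("link_fan", " LEFT ", now=0.0))
print(chat.accept("link_fan", "left", now=0.5))
print(chat.accept("troll", "sudo rm -rf /", now=0.6))
```

Passing the current time in as an argument keeps the class easy to test; in a live setup it would simply come from time.monotonic().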
Still looks doable, doesn’t it? I should put my money where my mouth is [17], but I won’t. Not this time, anyway, so deal with it!
Uff, what a ride! Let’s recap:
We used Frida to inject a script into our guessing game and made rand() return whatever we please.
Afterwards we did a tiny bit of code auditing in order to discover a good point for hooking into SameBoy so that we can remotely send input commands. We discovered that the SDL library is a great place for that.
A quick aside here:
The high-level approach taken by some “Twitch Plays” implementations is to send input via tools like xdotool, which covers a broader range of possible emulators and games, because they are not dependent on SDL being available. This approach comes with its own hurdles, though. It might not be portable across operating systems and things like window focus could come into play.
The low-level approach would be to just manipulate SameBoy directly. Somewhere the state of the buttons is tracked. I’m confident that we could reach it with Frida! While certainly a fun discovery process, this approach is the least generic and portable. And it doesn’t prove the point I’m about to make.
We sit comfortably in the middle: In theory, every SDL project gets covered. And there are a lot of them (e.g. the wonderful PICO-8)! I haven’t tested this yet, but with some tweaks, our approach could surely work with those.
Asides aside, after discovering the SDL_PushEvent() function, we looked into how we could call it from our script in order to push our events into the SDL event queue. Reading more of the source gave us the right format. We confirmed our theory by intercepting the legitimate events, again thanks to Frida.
In our final version we allocate a buffer on the heap, craft an event structure in there and call SDL_PushEvent() with a pointer to it.
I cannot overstate how cool that is!
Frida let us interact with code that’s not even used by SameBoy. It just goes to show how much stuff actually is inside our programs (runtime, libraries etc.).
Code reuse attacks like return-to-libc and more “recently” return-oriented programming are exploiting this very fact [18]. Those attacks are just extremely fascinating. I’m looking forward to covering them in some future articles.
And just like that, we’re done! I’m sure the information in this article could have been more condensed. But as I mentioned in the introduction, I’ve tried to make the process of discovery as visible as I possibly can without really being boring. Personally, I do enjoy articles that highlight the journey. If I only get an end result, the topic often seems too intimidating. Hopefully you won’t feel this way about Frida after reading this far. Thank you!
Resources and Acknowledgments
- Leon Jacob’s frida-boot workshop (Thanks for letting me hit the ground running!)
- Ole’s talks mentioned in footnote 5 (Demo time? Every presentation is just a single take. What now, demo gods?)
- Game hacking with Frida by X-C3LL (A really playful introduction.)
- A talk about
- A couple of older presentations from
Thanks to all people who contributed to Frida, first and foremost to Ole. I’m really grateful for being able to use something which must have taken a lot of time to make!
Thanks Al for teaching me how to program! I mean it.
Why 1? Because our code specifies the valid range like so: (rand() % 10) + 1
I had to detach GDB from the process in order for Frida to be able to inject its agent into it.
And beyond my current knowledge to be honest.
Maybe the one thing that Frida is most known for.
You live and learn :).
I actually don’t know if it’s a big project. Running cloc on the project’s Core directory suggests a cool 14866 lines of code. The complete repository comes in at 47272 lines. That’s big in my book.
“Tue 11 Sep 2001, 22:59” to be precise. I can’t think of a better way to relax after a stressful day than to write documentation ¯\_(ツ)_/¯.
SameBoy supports button remapping, so we have a layer of indirection here.
When in doubt, just check the respective man-page.
SDL_KEYUP takes 4 bytes, which looks like this on my x86-64 little-endian machine: 01 03 00 00
80 in decimal is 50 in hex.
That is, if you know your way around the JS ecosystem at least a tiny bit.
In the case of ROP we can construct functionality that’s not even present at all!