Baby's First Binary Modifications
“There, that hole. Take a closer look.”
— Patches, ‘Dark Souls’
There is something magical about working with a raw binary. It’s this little black box, created through an arcane process, where all high-level concepts collapse into a single block of bytes. The thought of poking around in binaries seemed intimidating to me at first, but many great tools and resources exist to aid the learning process. First and foremost I want to mention Practical Binary Analysis by Dennis Andriesse, which acted as a catalyst for my existing interest in these topics.
Hopefully this post marks the beginning of a series of articles that illustrate my journey into the deep hole that is binary analysis. I’ll move slowly and retread old ground, but I want this to be a proper reflection of my learning process. For now we’re only working with ELF files for x86-64.
This time we’re going to look at an example of a misbehaving binary. We’re going to learn about two different methods of modifying the binary’s behavior. While contrived, it should still highlight the important aspects of analysis and modification.
The Premise
I created a simple program that compares two strings:
Unfortunately I always mix up the return value of strcmp(). Those man pages [1], however, are pretty verbose, so I could not be bothered to read up on it. If the strings don’t match, it’s either a positive or a negative integer, that much I know. Let’s just hope for the best.
Damn it! The first string is certainly not greater. But what’s the funny looking number? Okay, okay, it’s time to finally read the man page:
Our return value is negative, so str1 should be less than str2 [2]. I knew I’d mess those branches up! There’s no way I’m gonna compile the program again. So what else can we do about it?
Method 1: Patching
The first thing we can do is to manually patch the instruction that is responsible for taking the branch. Let’s disassemble main and have a look at it:
Because we disabled optimizations, we pretty much get a one-to-one mapping from our C code to assembly. Our conditions still got re-arranged, resulting in the highlighted jle instruction instead of our greater-than condition.
The instruction is encoded as 7e 1b. The first byte is the opcode, the second one encodes the target of the jump. It’s a signed 8-bit offset relative to the address of the following instruction. With the two-byte jle sitting at 0x11d2, the target gets computed like this:
jump target = 0x11d2 + 2 + 0x1b = 0x11ef
Now we know what instruction to patch. But what does the patch look like? Simply inverting the logic should do the trick. Is there a “Jump Greater Than” instruction? Well, of course there is! [3] Luckily we only have to change a single nibble. Actually we only have to change a single bit!
0x7f = 0b01111111
0x7e = 0b01111110
But because we’re not rowhammering or something, our goal stays the same: Turn 7e into 7f.
There are many ways to patch a binary, including convenient ones like using Ghidra. But we’re all about minimalism here, so let’s just use any hex editor.
I’m going with Emacs’ hexl-mode. The process stays the same: Find the instruction at offset 0x11d2 and patch it.
Are you finished? Great, me too. Let’s see if it worked:
Looks promising, but let’s actually run both versions:
We did it!
While certainly a fun exercise, this method of patching the binary directly has many disadvantages.
In our case the process was simple, but what if the new instruction doesn’t fit into the space we are given? Making room would shift all the code that follows, and many instructions rely on relative offsets that we would break.
It’s also a very laborious and error-prone process. We have one call to strcmp() in our program, but what if we had one million calls? Or, well, thirty. Manually patching doesn’t scale in those cases.
We need a different, more dynamic approach.
Method 2: Shared Library Hooking
There are two ways to handle calls to external libraries: The linker either adds those functions to the binary itself (statically linked), or the calls get resolved by a dynamic linker at runtime (dynamically linked). Without additional flags, gcc will choose dynamic linking by default.
We can find out how a program is linked by running file on it:
So ls is dynamically linked, but what libraries does it need?
ldd tells us that ls has four runtime dependencies. In line 4 we can see libc and in line 5 ld-linux, the dynamic linker/loader itself.
The .so extension stands for “shared object”. As far as I know, the term is interchangeable with “shared library”.
So far, so good. The heading implies that there’s a way of hooking into these dynamic calls, so let’s start exploring.
The dynamic linker of most Unixes supports the LD_PRELOAD environment variable. It will load one or more specified libraries before any others, including libc itself. If those preloaded libraries provide a function with the same name as one of the “official” library functions, then this first function will be selected at runtime, allowing us to override any function we want, even things like printf() or sleep()!
Let’s create our own shared object that overrides strcmp().
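A minimal sketch of such an override, assuming the simplest possible replacement: a strcmp() that ignores its arguments and always reports equality. That behaviour is an assumption made for illustration. The only hard requirement is that the name and signature match the libc function being shadowed.

```c
/* A stand-in for libc's strcmp() with the same name and signature.
 * It never looks at its arguments and always claims the strings are equal.
 *
 * One possible way to build it (file names are hypothetical):
 *   gcc -shared -fpic -o libfakecmp.so fakecmp.c
 */
int strcmp(const char *s1, const char *s2)
{
    (void)s1; /* unused on purpose */
    (void)s2;
    return 0;
}
```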
Because we’re creating a shared object, we have to pass some additional flags to gcc:
We compile it with the -fpic flag to let gcc create position-independent code, which gives the library greater flexibility with regard to where it can be mapped into memory without overlapping (e.g. with another library).
As we can see in line 3, we successfully created a shared object. Let’s test it!
Yeah, right… Anyway, it worked! Even though we messed up the greater-than and less-than conditions, at least our program handles the “null return” correctly.
As you can see, we specify an absolute path to the library by prepending our current working directory. Because LD_PRELOAD is an environment variable, child processes inherit its value, but they may have a different working directory than their parent process. A relative path might no longer resolve there.
While this is fun, it’s not that useful. We just completely swapped out the original implementation, which may be a valid use case for some. But we actually need the original functionality. Just, you know, backwards!
Remember that LD_PRELOAD makes it so that the dynamic linker provides our program with our own implementation of strcmp(). But the original one is still available in libc, which we of course still link against. So how do we obtain its address?
There is a function that does exactly that: dlsym() [4]. Let’s see it in action.
Reading through the man page gives us a hint: We have to include the #define in line 2 before including the dlfcn header file. We do this so that we can use the RTLD_NEXT handle in line 11, which gives us the next occurrence of a strcmp symbol in the list of loaded libraries, if it exists. And because libc gets loaded after our own library, we’re good to go.
There’s another important thing we learned from the man page: We need to specify a gcc flag, -ldl.
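A minimal sketch of such a hook, assuming glibc, where RTLD_NEXT is only visible if _GNU_SOURCE is defined before <dlfcn.h> is included. The names strcmp_fn and flip_sign are made up for this example. Instead of a plain negation, the sketch swaps the sign and normalizes the result to -1, 0 or 1, which sidesteps overflow for extreme values; that is a variation chosen here, not necessarily what the original listing did.

```c
#define _GNU_SOURCE /* must precede <dlfcn.h> to expose RTLD_NEXT */
#include <dlfcn.h>
#include <stddef.h>

typedef int (*strcmp_fn)(const char *, const char *);

/* Swaps the sign of a comparison result: negative -> 1, positive -> -1. */
static int flip_sign(int res)
{
    return (res < 0) - (res > 0);
}

/* Our replacement, with the same signature as the libc function. */
int strcmp(const char *s1, const char *s2)
{
    static strcmp_fn real_strcmp; /* cached after the first lookup */

    if (real_strcmp == NULL) {
        /* RTLD_NEXT: the next definition of the symbol after this object. */
        real_strcmp = (strcmp_fn)dlsym(RTLD_NEXT, "strcmp");
        if (real_strcmp == NULL)
            return 0; /* sketch only: a real hook should report this */
    }
    return flip_sign(real_strcmp(s1, s2));
}
```

On recent glibc versions (2.34 and later) dlsym() lives in libc itself, so -ldl is harmless there but no longer strictly required; older systems still need it.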
Let’s compile and see if we were successful:
Now that’s more like it! Our fake strcmp() calls the real one, inverts its return value and returns it. This works because we’re absolutely sure that we always mixed up the return value in the same way throughout the whole code base! Talking about contrived examples…
The main benefit of the LD_PRELOAD approach over manually patching is that there’s one central place for the modification. That said, every change requires recompilation of the library. And we’re only able to modify library calls.
Conclusion
In this first article we had a look at two simple techniques for binary modification. We’ve gotten to know some basic tools that aided us in analyzing the problem at hand. While the examples were contrived, the same techniques can be used for more elaborate modifications.
The first method directly modified the binary, while the second method hooked into its library calls. There are countless exciting things we could (and maybe will) do with these two techniques alone, but for now I’m content with this overview. Again, this is mostly basic stuff. However, I’m a firm believer in strong fundamentals, which is exactly why we’re taking small steps in this series.