The Heap

April 4th, 2023

Table of Contents

The Malloc Maleficarum

"In late 2001, 'Vudo Malloc Tricks' and 'Once Upon A free()' defined the exploitation of overflowed dynamic memory chunks on Linux. In late 2004, a series of patches to GNU libc malloc implemented over a dozen mandatory integrity assertions, effectively rendering the existing techniques obsolete. It is for this reason, a small suggestion of impossiblity, that I present the Malloc Maleficarum." - Phantasmal Phantasmagoria

The Malloc Maleficarum is a masterpiece, a grimoire of techniques, a true cornerstone of heap exploitation. It was written by an equally fascinating sorcerer of exploitation, Phantasmal Phatasmagoria in 2005. The reason for its creation was that during the end of 2004, glibc malloc got hardened. Moreover, the techniques that were being used back then, like the malloc unlink exploit became deprecated. Phantasmal's work documents various techniques (categorized in "houses") that can be used to successfully exploit the heap using malloc, hence Malloc Maleficarum.

I apologize for this tangent but I think it would be pretty cool to cover the etymology of the words present in the Malloc Maleficarum. It certainly warrants that curious "tilt of the head." it's just so cool to me, the name, the technique names, the topics at hand, it's like a real-life Skyrim Oghma Infinium. Anyways, just expand the expandable below if you want to learn the meaning of the name!

Etymology of "Maleficarum"

So, after searching for the word "maleficarum", I was presented with a couple of hits to a book of . The book was titled "Malleus Maleficarum" which gets translated to "Hammer of Witches." The book was known for its treatise on witchcraft. It was at this point that I decided to research the latter word since it appeared that Malloc Maleficarum was a play on the words "Malleus Maleficarum". You know... Malleus... Malloc... yeah. Moving on. After consulting some etymology sites, I finally got what I was looking for. "" is actually the feminine plural of the word "Maleficus." Maleficus is derived from the Latin words, male (meaning "badly" or "wrongly"), and the suffix "-ficus" (which denotes "making" or "doing"). Therefore, in this context, it can be implied that the techniques present in the Malloc Maleficarum are intended to cause harm or damage to the program/system being targeted!

There are many houses/techniques of exploitation covered in the Malloc Maleficarum including (but not limited to, as we'll see later on):

Please note that there are many more houses of exploitation out there. This isn't an all-inclusive list but I'll add more and more above the more I do. Also, the necessary details of each technique above will be documented in the page that I create solely for that technique, to avoid this singular page getting a Guinness World Record for the world's largest blog post, of course.

Dynamic Memory Allocation

Whenever your program needs memory dynamically allocated/deallocated to it, it will generally go to the heap. The heap is a segment of memory that's managed by an OS or a programming language's runtime system. Speaking of programming languages, most of them (the modern ones at least) actually already dynamically allocate and free memory. Lower-level languages like C, C++, etc. still require you to do manual memory management. If you don't a lot of bugs, vulnerabilities, etc. may arise - which is great for us as hackers; but not so great for developers, potential users, or what have you. As you should know by now, the stack grows downwards. The heap, however, grows upwards. Furthermore, the heap isn't a linear data structure like the stack is; no no no, it's actually hierarchical. There are also multiple parts in play with a procedure like dynamic memory allocation. We have things called arenas, heaps, and chunks.

Arenas: Is the actual memory assigned to a thread and can contain multiple heaps. A main thread's arena is called the main arena. It also doesn't have multiple heaps.
Heaps: These are called contiguous regions of memory which can be subdivided/chopped up into multiple "chunks." A heap belongs to a single arena.
Chunks: These are parts of the heap that can actually be allocated, freed, reallocated, etc. Furthermore, a chunk exists in one (1) heap and belongs to one (1) arena.

I absolutely love this diagram, from an incredible source of knowledge on the topic of heap exploitation, Ch0pin:

API

Hey, remember how we've been delving into some Malware Development and how we've been using the Windows API (and even the native API) to do our exploits? Yeah, well, glibc offers its own API which lets us do three (3) main things when it comes to dynamic memory. Allocating memory with malloc, freeing it after use with free, and then reallocating memory (if needed) from the now free chunks using realloc.

Allocating Memory

With the use of the malloc function, we can request the number of bytes we want to allocate.

void *p = malloc(size);

From the man pages, we can see the following:

(...)

The malloc() function allocates size bytes and returns a pointer to the allocated memory. The memory is not initialized. If size is 0, then malloc() returns a unique 
pointer value that can later be successfully passed to free().

(...)

A thing we'll see is that no matter how much we actually request from malloc, we'll end up getting a bit more, which we'll discuss. First, let's create a super simple C-program that'll just use the malloc API in order to request 1 byte of memory from the heap.

#include <stdlib.h>

int main(void){
    void *p = malloc(1);
    return 0;
}

Obviously, you shouldn't just allocate memory and then exit the program before freeing it, but for our example, and for the purposes of learning; let's compile this thing to tinker with it. After compiling the program (with the flag for GDB: -g, so it's a better debugging experience), we can open it in a debugger and set a breakpoint right on the call to malloc:

Let's run the program and wait until we get the breakpoint:

If we inspect the virtual memory of this process with the vmmap command, we'll see that in this moment of time, in this moment of the program's execution, there isn't even a heap section (blue, as seen in the LEGEND) present within the target's virtual address space:

Furthermore, we can also verify this with the following commands:

pwndbg> heap
heap: Heap is not initialized yet.
pwndbg> vis_heap_chunks
vis_heap_chunks: Heap is not initialized yet.

However, if we move forward with one (1) instruction using the next (n) command, we can see that we get a heap segment and that we can now view the heap chunks.

The heap region is typically available for use from the moment a process starts up, but it is not initialized or mapped to physical memory until the first memory allocation request is made, which would be after our call to malloc.

We can see that the heap segment is now there and that its size is 0x21000 which is around ~135k bytes:

We could go further and actually display the heap chunks with vis_heap_chunks or vis

This output is a little borked right now, but basically, the thing that we need to understand is the start of the heap section; which is at the address 0x555555559000. You can ignore the blue section, it seems to be some allocation that automatically occurs and inspecting it reveals that there is quite a lot that gets allocated. Oh well, the show must go on. So, The next section, in pink, is the part where we've actually allocated our memory!

User Data Chunk

The region of our user data (which is the area of the memory that our actual data would go) starts at the address: 0x5555555592a0. We can prove this by looking at where we are in the code and printing the value of the variable that we assigned to the malloc function after it's been called and the heap's been initialized. The variable, which we know from the source code, is p:

(...)
0x555555559290	0x0000000000000000	0x0000000000000021	........!.......
0x5555555592a0	0x0000000000000000	0x0000000000000000	................
0x5555555592b0	0x0000000000000000	0x0000000000020d51	........Q.......	 <-- Top chunk
pwndbg> print p
$1 = (void *) 0x5555555592a0

We're given three (3) quad words at 8 bytes each, which is 24 bytes of user data. Hold on a second... we literally only requested a single byte from malloc. How come we're given so much more than that? Turns out, this is actually the from malloc. Even if we create a script like the following:

#include <stdlib.h>

int main(void){
    void *p = malloc(1);
    malloc(0);
    return 0;
}

We'll see that even if we request zero (0) bytes from malloc, it'll still give us a chunk size of 24 bytes for our user data:

I also compiled the same program for 32-bit architecture just to see the minimum size that malloc would allocate for us. After reaching the breakpoint of the malloc(0) command, we can see that we received three (3) double words which are four (4) bytes each, for a total of 12 bytes of user data:

Heap Chunk Header

There's this weird section here in the first part of the allocation chunk, before our user data: 0x0000000000000021. What does this mean? If we take this value 0x21 and turn it into a decimal:

pwndbg> print /d 0x21
$1 = 33

We can see that this is 33 bytes. To solve the mystery of all of these extra bytes, let's consider what we have so far:

0x0000000000000018 (24 bytes)

24 bytes of user data

We have 24 bytes of user data which means that 24 of those 33 bytes are accounted for. To figure out where the rest of those 9 bytes come from, keep on reading.

These extra bytes in the header are for bookkeeping/metadata. Just like how the stack has some data like base pointers, return addresses, etc. stored on itself, the heap stores its metadata in itself as well. The chunk header (0x0000000000000021) stores information like the size field - which stores the sum total of bytes that make up the chunk; including the size field itself. So, what about those mysterious extra nine (9) bytes?

0x0000000000000020 (32 bytes)

24 bytes of user data

8 bytes for the header itself

We had 24 bytes of user data, and the chunk header itself is another quadword (8 bytes) which brings us to 32 bytes or 0x20.

However, we're not seeing 0x20, we're seeing 0x21, what's up with that extra byte? Without getting too far into it, that least significant byte (the rightmost byte (1) inside of 0x21) is just denoting that the previous chunk of lower memory adjacent to it is in use. From the source code of malloc.c, we can see that this bit gets set:

(...)
/* size field is or'ed with PREV_INUSE when previous adjacent chunk in use */
#define PREV_INUSE 0x1
(...)

There can also be other bits that get set instead of 0x1 as well; which we can see if we look at an amazing diagram, like the one from Azeria Labs:

With that now explained, we can understand why we're given 33 (0x21) bytes in the heap chunk header!

0x0000000000000021 (33 bytes)

24 bytes of user data

8 bytes for the header itself

PREV_INUSE (0x1) bit set

Great, we've solved the mystery of our heap chunk header and the user data section. Now all that's left is to explain the last section.

The Top Chunk

This part of the heap memory is the easiest part to understand. The last address that we see from something like the vis command is just the top chunk of the heap memory. It's simply just the amount of bytes left that can be allocated. If you recall, when the heap first got initialized, it was initialized with a size of 0x21000 bytes, which if we turn it into a decimal:

Is around 135,000 bytes. Now, as we start allocating memory with malloc we'll see that this top chunk changes its value to reflect the amount of bytes that have been used and more importantly, the number of bytes that are left for allocation.

So far this has been extremely intuitive! Isn't the heap amazing? ❤️😄

Freeing Allocated Memory

With the use of the free function, we can free the previously allocated memory from malloc.

free(p);

So, let's use the same script from before, except after allocating 24 bytes on the heap, we also free them after using the free function:

#include <stdlib.h>

int main(void){
    void *p = malloc(24);
    free(p);
    return 0;
}

We do the same ol' and get to the point after the call to malloc such that the heap chunk is visible:

(...)
0x555555559290	0x0000000000000000	0x0000000000000021	........!.......
0x5555555592a0	0x0000000000000000	0x0000000000000000	................
0x5555555592b0	0x0000000000000000	0x0000000000020d51	........Q.......	 <-- Top chunk

Now, if we move forward one (1) instruction, we'll have called the free function on this allocated memory which should have unallocated it.

I know it looks a bit weird right now but tcache would be an entire blog post on its own so I'll cover it in more detail at a later time. The important thing to know is that the memory should've now been unallocated. And indeed, if we use the heap command, we can see that this section of our memory in the heap is labelled a "free chunk":

Heap Chunk Incrementation

A little bit of a tangent is in order for us to understand the incrementation of the heap chunks allotted to us. It turns out, that the heap gives out chunks in increments of 16 bytes. So, you know how we've been getting 24 bytes of user data this entire time for the minimum amount? Let's request 25 bytes from malloc and see what happens.

#include <stdlib.h>

int main(void){
    void *p = malloc(24); /* smallest user data chunk size allocatable */
    malloc(25);
    return 0;
}

We set a breakpoint on main and run the program. We can see that we're about to allocate the minimum chunk size we can from malloc which is 0x20 bytes (32 bytes) or 0x21 (33 bytes) if the PREV_INUSE bit is set (which it's going to be):

We move forward one (1) instruction and see what we've been seeing this entire time, an allocation of 0x21 bytes:

Now, if we move forward another instruction to try to allocate more than this minimum amount, just by a single byte, we'll see that the heap chunk header will increment by 16 bytes.

We can see that we've allocated 0x31 bytes for the heap chunk as opposed to the 0x21 bytes we've been seeing this entire time.

pwndbg> print /d 0x21
$1 = 33
pwndbg> print /d 0x31
$2 = 49

That's right! There's a difference of 16 bytes! So, the heap gives out memory, depending on how much you want it, in increments of 16 bytes. Let's try it with one last example just to really internalize this:

#include <stdlib.h>

int main(void){
    void *p = malloc(25); /* 0x31 bytes */
    malloc(50); /* should be 0x41 bytes (+16 from 0x31) */
    return 0;
}

Awesome, we've proven that the heap chunks increment by 16 bytes!

Reallocating Memory

From the man pages, we can see what realloc does:

Let's consider the following source code:

#include <stdlib.h>

int main(void){
    void *p = malloc(24);
    free(p);
    realloc(p, 25);
    return 0;
}

So, here we do exactly the same thing as we were doing up until freeing the allocated memory. After this part, we're going to pass the freed memory pointer, with a size of 25 bytes. If all goes accordingly, then we should have a new chunk size of 0x31 since we know that anything over the 24 bytes minimum will increment the heap chunk size by 16 bytes and the process will repeat for the next chunk size, and so on.

So, what we're expecting to see at the end of this program is a chunk header with the value of 0x31 as well as the five (5) quadwords (40 bytes) of user data that comes with an allocation size this big.

Et voila! It worked amazingly. We now understand the basic workings of the heap's functionality, its quirks, and how some of the APIs are involved in doing all of this dynamic memory allocation! Give yourselves a huge round of applause, this is some super interesting stuff and you should be ecstatic because you are now a HEAP MASTER.

References

https://dl.packetstormsecurity.net/papers/attack/MallocMaleficarum.txt

.:: Phrack Magazine ::.

Unlink Exploitheap-exploitation

Heap Exploitation Part 1: Understanding the Glibc Heap ImplementationAzeria-Labs

The toddler’s introduction to Heap exploitation (Part 1)Medium

Overview of GLIBC heap exploitation techniquesLow-level adventures

Linux Heap Exploitation - Part 1Udemy

Etymology References

PreviousHeap NextHouse of Force I

Last updated 1 year ago

Was this helpful?