The Heap
April 4th, 2023
The Malloc Maleficarum
"In late 2001, 'Vudo Malloc Tricks' and 'Once Upon A free()' defined the exploitation of overflowed dynamic memory chunks on Linux. In late 2004, a series of patches to GNU libc malloc implemented over a dozen mandatory integrity assertions, effectively rendering the existing techniques obsolete. It is for this reason, a small suggestion of impossiblity, that I present the Malloc Maleficarum." - Phantasmal Phantasmagoria
The Malloc Maleficarum is a masterpiece, a grimoire of techniques, a true cornerstone of heap exploitation. It was written by an equally fascinating sorcerer of exploitation, Phantasmal Phatasmagoria in 2005. The reason for its creation was that during the end of 2004, glibc malloc
got hardened. Moreover, the techniques that were being used back then, like the malloc unlink exploit
became deprecated. Phantasmal's work documents various techniques (categorized in "houses") that can be used to successfully exploit the heap using malloc
, hence Malloc Maleficarum
.
I apologize for this tangent but I think it would be pretty cool to cover the etymology of the words present in the Malloc Maleficarum. It certainly warrants that curious "tilt of the head." it's just so cool to me, the name, the technique names, the topics at hand, it's like a real-life Skyrim Oghma Infinium. Anyways, just expand the expandable below if you want to learn the meaning of the name!
There are many houses/techniques of exploitation covered in the Malloc Maleficarum including (but not limited to, as we'll see later on):
Please note that there are many more houses of exploitation out there. This isn't an all-inclusive list but I'll add more and more above the more I do. Also, the necessary details of each technique above will be documented in the page that I create solely for that technique, to avoid this singular page getting a Guinness World Record for the world's largest blog post, of course.
Dynamic Memory Allocation
Whenever your program needs memory dynamically allocated/deallocated to it, it will generally go to the heap. The heap is a segment of memory that's managed by an OS or a programming language's runtime system. Speaking of programming languages, most of them (the modern ones at least) actually already dynamically allocate and free memory. Lower-level languages like C, C++, etc. still require you to do manual memory management. If you don't a lot of bugs, vulnerabilities, etc. may arise - which is great for us as hackers; but not so great for developers, potential users, or what have you. As you should know by now, the stack grows downwards. The heap, however, grows upwards. Furthermore, the heap isn't a linear data structure like the stack is; no no no, it's actually hierarchical. There are also multiple parts in play with a procedure like dynamic memory allocation. We have things called arenas, heaps, and chunks.
Arenas
: Is the actual memory assigned to a thread and can contain multiple heaps. A main thread's arena is called the main arena. It also doesn't have multiple heaps.Heaps
: These are called contiguous regions of memory which can be subdivided/chopped up into multiple "chunks." A heap belongs to a single arena.Chunks
: These are parts of the heap that can actually be allocated, freed, reallocated, etc. Furthermore, a chunk exists in one (1
) heap and belongs to one (1
) arena.
I absolutely love this diagram, from an incredible source of knowledge on the topic of heap exploitation, Ch0pin:
API
Hey, remember how we've been delving into some Malware Development and how we've been using the Windows API (and even the native API) to do our exploits? Yeah, well, glibc
offers its own API which lets us do three (3
) main things when it comes to dynamic memory. Allocating memory with malloc
, freeing it after use with free
, and then reallocating memory (if needed) from the now free chunks using realloc
.
Allocating Memory
With the use of the malloc
function, we can request the number of bytes we want to allocate.
From the man
pages, we can see the following:
A thing we'll see is that no matter how much we actually request from malloc
, we'll end up getting a bit more, which we'll discuss. First, let's create a super simple C-program that'll just use the malloc
API in order to request 1 byte
of memory from the heap.
Obviously, you shouldn't just allocate memory and then exit the program before freeing it, but for our example, and for the purposes of learning; let's compile this thing to tinker with it. After compiling the program (with the flag for GDB: -g
, so it's a better debugging experience), we can open it in a debugger and set a breakpoint right on the call to malloc
:
Let's run the program and wait until we get the breakpoint:
If we inspect the virtual memory of this process with the vmmap
command, we'll see that in this moment of time, in this moment of the program's execution, there isn't even a heap section (blue, as seen in the LEGEND
) present within the target's virtual address space:
Furthermore, we can also verify this with the following commands:
However, if we move forward with one (1
) instruction using the next
(n
) command, we can see that we get a heap segment and that we can now view the heap chunks.
The heap region is typically available for use from the moment a process starts up, but it is not initialized or mapped to physical memory until the first memory allocation request is made, which would be after our call to malloc
.
We can see that the heap segment is now there and that its size is 0x21000
which is around ~135k
bytes:
We could go further and actually display the heap chunks with vis_heap_chunks
or vis
This output is a little borked right now, but basically, the thing that we need to understand is the start of the heap section; which is at the address 0x555555559000
. You can ignore the blue section, it seems to be some allocation that automatically occurs and inspecting it reveals that there is quite a lot that gets allocated. Oh well, the show must go on. So, The next section, in pink, is the part where we've actually allocated our memory!
User Data Chunk
The region of our user data (which is the area of the memory that our actual data would go) starts at the address: 0x5555555592a0
. We can prove this by looking at where we are in the code and printing the value of the variable that we assigned to the malloc
function after it's been called and the heap's been initialized. The variable, which we know from the source code, is p
:
We're given three (3
) quad words at 8 bytes
each, which is 24 bytes
of user data. Hold on a second... we literally only requested a single byte from malloc
. How come we're given so much more than that? Turns out, this is actually the from malloc
. Even if we create a script like the following:
We'll see that even if we request zero (0
) bytes from malloc
, it'll still give us a chunk size of 24 bytes
for our user data:
I also compiled the same program for 32-bit architecture just to see the minimum size that malloc
would allocate for us. After reaching the breakpoint of the malloc(0)
command, we can see that we received three (3
) double words which are four (4
) bytes each, for a total of 12 bytes
of user data:
Heap Chunk Header
There's this weird section here in the first part of the allocation chunk, before our user data: 0x0000000000000021
. What does this mean? If we take this value 0x21
and turn it into a decimal:
We can see that this is 33 bytes. To solve the mystery of all of these extra bytes, let's consider what we have so far:
24 bytes
of user data
We have 24 bytes
of user data which means that 24
of those 33
bytes are accounted for. To figure out where the rest of those 9 bytes
come from, keep on reading.
These extra bytes in the header are for bookkeeping/metadata. Just like how the stack has some data like base pointers, return addresses, etc. stored on itself, the heap stores its metadata in itself as well. The chunk header (0x0000000000000021
) stores information like the size field - which stores the sum total of bytes that make up the chunk; including the size field itself. So, what about those mysterious extra nine (9
) bytes?
24 bytes
of user data
8 bytes
for the header itself
We had 24 bytes
of user data, and the chunk header itself is another quadword (8 bytes
) which brings us to 32 bytes
or 0x20
.
However, we're not seeing 0x20
, we're seeing 0x21
, what's up with that extra byte? Without getting too far into it, that least significant byte (the rightmost byte (1
) inside of 0x21
) is just denoting that the previous chunk of lower memory adjacent to it is in use. From the source code of malloc.c
, we can see that this bit gets set:
There can also be other bits that get set instead of 0x1
as well; which we can see if we look at an amazing diagram, like the one from Azeria Labs:
With that now explained, we can understand why we're given 33 (0x21) bytes
in the heap chunk header!
24 bytes
of user data
8 bytes
for the header itself
PREV_INUSE (0x1)
bit set
Great, we've solved the mystery of our heap chunk header and the user data section. Now all that's left is to explain the last section.
The Top Chunk
This part of the heap memory is the easiest part to understand. The last address that we see from something like the vis
command is just the top chunk of the heap memory. It's simply just the amount of bytes left that can be allocated. If you recall, when the heap first got initialized, it was initialized with a size of 0x21000
bytes, which if we turn it into a decimal:
Is around 135,000
bytes. Now, as we start allocating memory with malloc
we'll see that this top chunk changes its value to reflect the amount of bytes that have been used and more importantly, the number of bytes that are left for allocation.
Freeing Allocated Memory
With the use of the free
function, we can free the previously allocated memory from malloc
.
So, let's use the same script from before, except after allocating 24 bytes
on the heap, we also free them after using the free
function:
We do the same ol' and get to the point after the call to malloc
such that the heap chunk is visible:
Now, if we move forward one (1
) instruction, we'll have called the free
function on this allocated memory which should have unallocated it.
I know it looks a bit weird right now but tcache
would be an entire blog post on its own so I'll cover it in more detail at a later time. The important thing to know is that the memory should've now been unallocated. And indeed, if we use the heap
command, we can see that this section of our memory in the heap is labelled a "free chunk":
Heap Chunk Incrementation
A little bit of a tangent is in order for us to understand the incrementation of the heap chunks allotted to us. It turns out, that the heap gives out chunks in increments of 16 bytes
. So, you know how we've been getting 24 bytes
of user data this entire time for the minimum amount? Let's request 25 bytes
from malloc
and see what happens.
We set a breakpoint on main
and run
the program. We can see that we're about to allocate the minimum chunk size we can from malloc
which is 0x20
bytes (32 bytes
) or 0x21 (33 bytes)
if the PREV_INUSE
bit is set (which it's going to be):
We move forward one (1
) instruction and see what we've been seeing this entire time, an allocation of 0x21
bytes:
Now, if we move forward another instruction to try to allocate more than this minimum amount, just by a single byte, we'll see that the heap chunk header will increment by 16 bytes
.
We can see that we've allocated 0x31
bytes for the heap chunk as opposed to the 0x21
bytes we've been seeing this entire time.
That's right! There's a difference of 16 bytes
! So, the heap gives out memory, depending on how much you want it, in increments of 16 bytes
. Let's try it with one last example just to really internalize this:
Awesome, we've proven that the heap chunks increment by 16 bytes
!
Reallocating Memory
From the man
pages, we can see what realloc
does:
Let's consider the following source code:
So, here we do exactly the same thing as we were doing up until freeing the allocated memory. After this part, we're going to pass the freed memory pointer, with a size of 25 bytes
. If all goes accordingly, then we should have a new chunk size of 0x31
since we know that anything over the 24
bytes minimum will increment the heap chunk size by 16 bytes
and the process will repeat for the next chunk size, and so on.
So, what we're expecting to see at the end of this program is a chunk header with the value of 0x31
as well as the five (5
) quadwords (40 bytes
) of user data that comes with an allocation size this big.
Et voila! It worked amazingly. We now understand the basic workings of the heap's functionality, its quirks, and how some of the APIs are involved in doing all of this dynamic memory allocation! Give yourselves a huge round of applause, this is some super interesting stuff and you should be ecstatic because you are now a HEAP MASTER
.
References
Last updated