Friday, February 23, 2024

Linux Syscall Numbers: A Journey through Binary Analysis


Reverse engineering binaries is a crucial activity impacting both the realms of security and safety. When examining a program without the context syscall numbers provide, interpreting the code behavior can be daunting.

Consider this snippet of code devoid of syscall numbers:

0x0000000000000000: 48 C7 C0 01 00 00 00 mov rax, 1 0x0000000000000007: 48 C7 C7 01 00 00 00 mov rdi, 1 0x000000000000000e: 48 C7 C6 2e 00 00 00 mov rsi, 0x2e 0x0000000000000015: 48 C7 C2 0D 00 00 00 mov rdx, 0xd 0x000000000000001c: 0F 05 syscall 0x000000000000001e: 48 C7 C0 3C 00 00 00 mov rax, 0x3c 0x0000000000000025: 48 C7 C7 00 00 00 00 mov rdi, 0 0x000000000000002c: 0F 05 syscall 0x000000000000002e: 48 65 6C insb byte ptr [rdi], dx 0x0000000000000031: 6C insb byte ptr [rdi], dx 0x0000000000000032: 6F outsd dx, dword ptr [rsi] 0x0000000000000033: 20 57 6F and byte ptr [rdi + 0x6f], dl 0x0000000000000036: 72 6C jb 0xa4 0x0000000000000038: 64 21 00 and dword ptr fs:[rax], eax

Without knowledge of syscall numbers, understanding the program’s functionality, particularly its interactions with the operating system, becomes challenging. However, simply providing the context of syscall numbers can dramatically simplify comprehension. What if I tell you that the first syscall is write and the second is exit? Instantainly it became trivial to understand!

0x0000000000000000: 48 C7 C0 01 00 00 00 mov rax, 1 0x0000000000000007: 48 C7 C7 01 00 00 00 mov rdi, 1 0x000000000000000e: 48 C7 C6 2e 00 00 00 mov rsi, 0x2e 0x0000000000000015: 48 C7 C2 0D 00 00 00 mov rdx, 0xd 0x000000000000001c: 0F 05 syscall ; write(1, 0x2e, 13) 0x000000000000001e: 48 C7 C0 3C 00 00 00 mov rax, 0x3c 0x0000000000000025: 48 C7 C7 00 00 00 00 mov rdi, 0 0x000000000000002c: 0F 05 syscall ; exit(0) 0x000000000000002e: 48 65 6C insb byte ptr [rdi], dx ; H e l 0x0000000000000031: 6C insb byte ptr [rdi], dx ; l 0x0000000000000032: 6F outsd dx, dword ptr [rsi] ; o 0x0000000000000033: 20 57 6F and byte ptr [rdi + 0x6f], dl ; \b W o 0x0000000000000036: 72 6C jb 0xa4 ; r l 0x0000000000000038: 64 21 00 and dword ptr fs:[rax], eax ; d ! \0

It become easier to spot that bytes after the exit can’t be code, and that if write references them in a write to 1, stdout, they must be text!

In the snippets I just used as example spot the syscall number is trivial as mov rax, 1 for the first syscall, and mov rax, 0x3c for the second. So the example can just provide an hint of the problem. In a complex code the mov rax in question can be far away from the actual usage and in some convoluted cases, it can be moved from a register to another.

Automating the identification of syscall numbers not only aids reverse engineering for security but also can also enhance code safety. By automating this process, the amount of code that requires manual verification can be restricted. This reduction in manual effort streamlines the verification process, contributing to improved code safety and reliability.

Differences Between Architectures RISC vs CISC:

In the realm of computer architecture, the dynamic between CISC and RISC is akin to a timeless dance. CISC represents the old school, while RISC stands as the new kid on the block — although, truth be told, RISC has been strutting its stuff for at least 30 years now, which is quite a few lifetimes in architectural terms.

RISC architectures boast uniform instruction sizes, treating every instruction with equal importance. This uniformity simplifies the compiler’s task, ensuring impartial treatment of instructions. For instance, in aarch64, a 16-bit literal can effortlessly fit within a single instruction’s operating code, blending seamlessly with others within the 64k limit — a typical boundary for syscall numbers.

Conversely, CISC architectures march to a different beat. Loading an immediate value can entail performance and memory footprint costs. Even with minimal optimizations, varied instructions are often used to achieve the same outcome of loading a register with a specific value.

C8 1a 80 D2MOV X8, #0xd6
01 00 00 D4SVC 0
RISC: aarch64
93 08 e0 05li a7, 94
0f 05ecall
48 c7 c0 0c 00 00 00mov rax,0x0C
73 00 00 00syscall
CISC: x86_64

A7 19 01 04lghi r1, 0x0104
0a 00svc 0
CISC: s390x

This architectural dichotomy sheds light on the nuances of identifying syscall numbers. RISC architectures, with their uniform instruction structure, facilitate easier identification, while CISC architectures, with their diverse and sometimes costly instructions, present a more formidable challenge.

Guess syscall numbers: How can it be done?

In the early days of computing, CPU designers anticipated the need for system call functionality directly embedded within instruction sets. For instance, the int instruction in 16-bit x86 architectures and traps in m68k systems exemplify this approach. However, as early as the 1980s, operating system developers recognized the limitations of such implementations, as the 256 functions allowed by these instructions proved insufficient. Legacy x86 systems, DOS for instance, differentiated between BIOS and operating system functions using the int number, with the OS typically consolidating its functions under a single int number, such as int 0x21. Today, the impracticality of embedding syscall numbers directly into instruction opcodes is evident, leading newer architectures like RISC-V and x86_64 to eschew this practice.

By hand: How do they do it?

Manual identification of syscall numbers is a cumbersome process, typically undertaken in two steps:
  • Locating a syscall instruction within the assembly code.
  • Tracing backward from the syscall instruction to identify registers, particularly the one containing the syscall number.
  • This meticulous process is essential for determining the specific syscall invoked by the program but highlights the need for automated tools and techniques in modern reverse engineering practices.

    Radare ESIL: How it does it?

    Radare 2 tackles the challenge of identifying syscall numbers by leveraging its ESIL framework. ESIL serves as a semantic representation of assembly instructions, enabling Radare 2 to evaluate individual instructions and emulate their behavior. Similar to LLVM-IR, ESIL provides a high-level abstraction of CPU opcode semantics, facilitating analysis and interpretation of assembly code.

    ESIL operates through a stack-based interpreter, akin to calculators, where values and operators are pushed and popped from a stack to perform operations. This post-fix notation allows for the expression of various CPU operations, including binary arithmetic, and memory operations. For example, Radare 2 can use ESIL to translate assembly instructions into understandable operations, shedding light on a program’s behavior, even for obscure architectures where traditional debugging tools may not be available. By harnessing ESIL’s capabilities, Radare 2 empowers analysts to efficiently identify syscall numbers and understand program execution flow.

    Ghidra: How it does it?

    Ghidra does not porovide an official way to solve the problem of guessing syscall numbers. Still, there’s a very well known third party script that uses Ghidra framework to provide the functionality. The Ghidra script employs symbolic propagation, a technique in reverse engineering that tracks symbolic values of program variables instead of executing the program. Specifically, the script analyzes x86 and x64 Linux binaries to resolve system calls. It identifies system call instructions, symbolically determines the values of syscall registers, and creates function definitions for each syscall. Symbolic propagation enables reasoning about register contents without actually executing the program.

    My Proposition: Static Analysis + Emulation.

    The initial concept is to emulate the code and examine the contents of registers when a syscall instruction is executed. However, a significant challenge arises: there's no assurance that the flow of code execution will necessarily reach the point of interest—the syscall.

    To address this challenge, I employ a technique I refer to as guided execution. Essentially, this method involves guiding the execution to ensure it passes over the specific instruction I need to examine.

    In the general case, this approach wouldn't provide guarantees about the accuracy of register contents. However, when it comes to syscalls, I am confident that the path I traverse to reach a syscall cannot determine the syscall number. This confidence stems from the fact that syscalls have diverse interfaces, implying that if multiple paths exist from the function start to a syscall, they will ultimately lead to the same syscall number. Subsequently, I outline the detailed procedure I employ to achieve this.

    The process commences with the user-provided function binary image, a representation of compiled machine code specific to a given function. This binary image undergoes disassembly using the libcapstone disassembler, converting the machine code into a list of assembly instructions.

    Following disassembly, an elaborate procedure ensues to generate the CFG for the function in question. The CFG serves as a graphical depiction of all feasible execution paths within the function. Each node within the graph corresponds to an instruction block, with edges denoting potential transitions between these blocks.

    An illustrative example of this intermediate step pertains to the glibc function __pthread_mutex_lock_full.

    Within this resulting CFG image, purple blocks denote instruction blocks housing syscalls, while cyan bubbles signify blocks containing a ret instruction.

    Subsequently, the task shifts to identifying a path from the function’s entry point to the instruction block housing the syscall.

    Once the path is discerned, a further operation ensues, scanning the code to locate any call instructions along the path. These call instructions are then substituted with nop instructions, ensuring guided execution remains focused solely on the path leading to the syscall instruction block and doesn’t veer off due to calls to other functions absent from the binary image.

    With call instructions replaced, guided execution is executed using libunicorn. During this process, each instruction block along the path is executed sequentially, with the context (CPU state) of each block utilized to initiate execution of the subsequent block.

    Upon reaching the syscall instruction, register values are inspected to ascertain the syscall number.

    Addressing concerns and outlining why it’s effective in this use case.

    The PoC

    To apply this concept in a practical context, I aimed to create a Proof of Concept (PoC), opting to develop it as a Radare2 plugin. Below, I present the implementation for both versions of Radare:

  • Radare2.
  • Rizin.
  • Patching the Code

    As I delve into the intricacies of Linux binaries, the necessity of patching becomes evident in my quest to uncover the elusive syscall numbers. Patching serves as a crucial tool to remove distractions and ensure a clear path to my destination.

    However, the task of patching is notably more complex in CISC architectures compared to RISC. In CISC, such as x86, branch instructions vary in length, adding layers of intricacy to the process. This complexity is compounded by the need to decipher the true meaning behind seemingly innocuous instructions like the nop instruction.

    In the realm of x86 architecture, the nop instruction is a facade for the xchg (e)ax, (e)ax instruction. This revelation sheds light on the true nature of nop, enabling me to craft multibyte instructions that effectively clear my path of obstructions.

    To achieve this, I employ the Gauss formula, a stroke of genius that allows me to create a function capable of generating multibyte nop instructions of any desired size. Armed with this solution, I navigate the intricate landscape of x86 architecture with newfound confidence, inching closer to my ultimate goal with each patch applied.

    const char multibyte_nop_x86[]= "\x90" "\x66\x90" "\x0f\x1f\x00" "\x0f\x1f\x40\x00" "\x66\x66\x66\x66\x90" "\x66\x0f\x1f\x44\x00\x00" "\x0f\x1f\x80\x00\x00\x00\x00" "\x0f\x1f\x84\x00\x00\x00\x00\x00" "\x66\x0f\x1f\x84\x00\x00\x00\x00\x00"; const char multibyte_ret_nop_x86[]= "\xc3" "\xc3\x90" "\xc3\x66\x90" "\xc3\x0f\x1f\x00" "\xc3\x0f\x1f\x40\x00" "\xc3\x66\x66\x66\x66\x90" "\xc3\x66\x0f\x1f\x44\x00\x00" "\xc3\x0f\x1f\x80\x00\x00\x00\x00" "\xc3\x0f\x1f\x84\x00\x00\x00\x00\x00"; #define MBNOP(n,t) (t==NOP_ONLY?multibyte_nop_x86 + (((n-1) * n)/2):multibyte_ret_nop_x86 + (((n-1) * n)/2))

    No Return Functions.

    The primary method employed by this plugin is emulation, where it does not have access to a context, meaning that function arguments are undetermined. Instead of emulating the entire function, it selectively executes the sequence of blocks of code leading from the function’s entry point to the point where a syscall is invoked, which is the guided execution I presented earlier.

    To achieve this, the plugin performs an analysis of the target function and creates a CFG. The CFG serves as a navigational tool to identify the path of blocks leading to the block that contains the syscall.

    In this context, “blocks,” or instruction blocks, are defined as sequences of instructions without any branching, typically terminating at a branch instruction. CALL instructions are typically treated as NOP instructions and are patched out of the binary.

    However, a unique challenge arises when dealing with the glibc library, specifically, when encountering the __libc_fatal function. Unlike other functions, this troublemaker does not conform to the assumed behavior of returning, rendering the standard treatment of call instructions inappropriate.

    I’ve decided a special treatment for this __libc_fatal function implementing ad hoc solution that transforms it into a RET instruction, allowing the plugin to function effectively in such cases.

    Jump Tables: Tackling complexities introduced by jump tables.

    The current plugin implementation effectively follows the CFG of a given function, providing accurate results in most scenarios. However, there’s an edge case that the current algorithm cannot handle correctly. This occurs when a switch case is implemented using a jump table. The current approach handles jmp instructions by considering no forward path, relying solely on the branch path. However, in cases involving jump tables, the branch path can be multiple and located far away from the code under examination.

    A temporary solution focusing specifically on the syscall number can be summarized as follows:

  • Collection of Syscall Numbers: During the examination of the function’s code, the plugin should collect information about syscall numbers and their corresponding positions within the code.
  • Handling Non-Reached CFG Blocks: If the CFG analysis encounters non-reached blocks containing a syscall, the plugin could execute them to determine if the syscall number can be deduced.
  • Epilogue:

    This endeavor stands as a PoC for a concept, while this blog post acts as a platform for others to voice their thoughts on this notion. I find it quite promising, although it requires refinement on multiple fronts. Summarizing key insights and takeaways. Encouragement for further exploration and research in the realm of binary analysis.