Thursday, June 20, 2024

Unidentified Kernel symbols: Syscall macro expansion

When navigating kernel symbols, it is not uncommon to encounter symbols that do not appear to be declared in the source code. This is often the case with symbols related to syscalls. We know that symbols are created during preprocessing (see my previous blog posts), but syscall declarations seem to be more complex. Let's look at an example:

#include "kernel.h"

SYSCALL_DEFINE4(test, unsigned long, first, unsigned long, second,
                unsigned long, third, unsigned long, fourth)
{
        printk("hello");
}

The nice function above, after being preprocessed, spawns a few other functions:

struct pt_regs;

static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
{
        return 0;
}

asmlinkage long __arm64_sys_test(const struct pt_regs *regs);
ALLOW_ERROR_INJECTION(__arm64_sys_test, ERRNO);
static long __se_sys_test(__MAP(4,__SC_LONG,unsigned long, first, unsigned long, second,
                                unsigned long, third, unsigned long, fourth));
static inline long __do_sys_test(__MAP(4,__SC_DECL,unsigned long, first, unsigned long, second,
                                       unsigned long, third, unsigned long, fourth));

asmlinkage long __arm64_sys_test(const struct pt_regs *regs)
{
        return __se_sys_test(__MAP(4,__SC_ARGS ,,regs->regs[0],,regs->regs[1],,regs->regs[2]
                                   ,,regs->regs[3],,regs->regs[4],,regs->regs[5]));
}

static long __se_sys_test(__MAP(4,__SC_LONG,unsigned long, first, unsigned long, second,
                                unsigned long, third, unsigned long, fourth))
{
        long ret = __do_sys_test(__MAP(4,__SC_CAST,unsigned long, first, unsigned long, second,
                                       unsigned long, third, unsigned long, fourth));
        __MAP(4,__SC_TEST,unsigned long, first, unsigned long, second,
              unsigned long, third, unsigned long, fourth);
        __PROTECT(4, ret,__MAP(4,__SC_ARGS,unsigned long, first, unsigned long, second,
                               unsigned long, third, unsigned long, fourth));
        return ret;
}

static inline long __do_sys_test(__MAP(4,__SC_DECL,unsigned long, first, unsigned long, second,
                                       unsigned long, third, unsigned long, fourth))
{
        printk("hello");
}

This example is for the aarch64 architecture, but other architectures go through the same kind of expansion. The entry point called when the syscall is invoked is __arm64_sys_test, which calls __se_sys_test, which in turn calls __do_sys_test. Note that the body written in the source ends up in this last function. __do_sys_test is declared static inline, but the inline specifier is only a hint: depending on its optimization decisions, the compiler may fold the body into its callers or emit it as a separate function. This is why, when looking at symbols (for example, in kallsyms), you may or may not see the __do_sys_* functions.
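To make the three-layer chain easier to follow, here is a minimal user-space sketch of the same pattern. It is not the kernel's macro: DEMO_SYSCALL_DEFINE2, struct fake_pt_regs and the demo_* function names are all hypothetical stand-ins invented for this illustration, and the real SYSCALL_DEFINEx does considerably more (tracing metadata, error injection, argument checks).

```c
/* Simplified stand-in for struct pt_regs: on arm64 the syscall
 * arguments arrive in the first general-purpose registers. */
struct fake_pt_regs {
    unsigned long regs[8];
};

/*
 * DEMO_SYSCALL_DEFINE2 is a hypothetical, heavily reduced imitation of
 * SYSCALL_DEFINE2. For one definition it emits three functions:
 *   demo_sys_<name>     entry point, unpacks arguments from the registers
 *                       (plays the role of __arm64_sys_<name>)
 *   demo_se_sys_<name>  takes every argument as long and casts it back
 *                       (plays the role of __se_sys_<name>)
 *   demo_do_sys_<name>  holds the body that follows the macro
 *                       (plays the role of __do_sys_<name>)
 */
#define DEMO_SYSCALL_DEFINE2(name, t1, a1, t2, a2)                        \
    long demo_sys_##name(const struct fake_pt_regs *regs);                \
    static long demo_se_sys_##name(long a1, long a2);                     \
    static inline long demo_do_sys_##name(t1 a1, t2 a2);                  \
    long demo_sys_##name(const struct fake_pt_regs *regs)                 \
    {                                                                     \
        return demo_se_sys_##name((long)regs->regs[0],                    \
                                  (long)regs->regs[1]);                   \
    }                                                                     \
    static long demo_se_sys_##name(long a1, long a2)                      \
    {                                                                     \
        return demo_do_sys_##name((t1)a1, (t2)a2);                        \
    }                                                                     \
    static inline long demo_do_sys_##name(t1 a1, t2 a2)

/* The braces below become the body of demo_do_sys_add. */
DEMO_SYSCALL_DEFINE2(add, unsigned int, first, unsigned int, second)
{
    return (long)first + (long)second;
}
```

Calling demo_sys_add() with regs[0] = 2 and regs[1] = 40 returns 42 after passing through all three layers. Whether demo_do_sys_add and demo_se_sys_add survive as separate symbols in the object file depends on the compiler and the optimization level, which is the same effect described above for __do_sys_*.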

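If, in a kernel splat backtrace, you happen to see:
__do_sys_set_mempolicy_home_node+0xdc/0x1e4
__arm64_sys_set_mempolicy_home_node+0x20/0x2c
and in another build of the same kernel you only see:
__arm64_sys_set_mempolicy_home_node+0x1d0/0x360
the error may well have occurred on the same line of source code. In the second build the compiler has inlined __do_sys_set_mempolicy_home_node into __arm64_sys_set_mempolicy_home_node, so its frame disappears and the offset is relative to the larger combined function.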
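When comparing splats from different builds, it can help to reduce each frame to the syscall it belongs to before comparing. The following is a small sketch of that idea, assuming frames in the symbol+0xoffset/0xsize form shown above. The struct frame type and the parse_frame and syscall_name helpers are hypothetical names made up for this example, and the prefix list only covers the aarch64 names that appear in this post.

```c
#include <stdio.h>
#include <string.h>

/* One backtrace frame of the form "symbol+0xOFFSET/0xSIZE". */
struct frame {
    char symbol[128];
    unsigned long offset;   /* offset of the faulting address in the symbol */
    unsigned long size;     /* total size of the symbol */
};

/* Parse a frame string; returns 0 on success, -1 on malformed input. */
int parse_frame(const char *text, struct frame *out)
{
    /* %lx accepts an optional "0x" prefix, as strtoul() does in base 16. */
    if (sscanf(text, "%127[^+]+%lx/%lx",
               out->symbol, &out->offset, &out->size) != 3)
        return -1;
    return 0;
}

/* Strip the wrapper prefix and return the bare syscall name,
 * or NULL if the symbol is not one of the known wrappers. */
const char *syscall_name(const char *symbol)
{
    static const char *const prefixes[] = {
        "__arm64_sys_", "__se_sys_", "__do_sys_",
    };

    for (size_t i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++) {
        size_t n = strlen(prefixes[i]);

        if (strncmp(symbol, prefixes[i], n) == 0)
            return symbol + n;
    }
    return NULL;
}
```

Applied to the frames above, both __do_sys_set_mempolicy_home_node and __arm64_sys_set_mempolicy_home_node reduce to set_mempolicy_home_node. The parsed sizes (0x1e4 and 0x2c in the first build, 0x360 in the second) are a hint that the second build merged the functions.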
