Thursday, January 9, 2025

Security implications with printk

Introduction

Kernel debugging is inherently complex because kernel code is intricate and runs at a very low level. Surprisingly, one of the most useful tools for the job is the printk function. It may look like a trivial utility for printing debug and log messages, but it is a cornerstone of kernel debugging and one of the most intricate components of the kernel. Its complexity comes from the requirement to work reliably in every kernel context, including interrupt handlers, non-preemptible sections, and even kernel panics. This complexity has long made printk a major obstacle to merging the PREEMPT_RT (real-time preemption) patch set into the mainline kernel, because deterministic, low-latency logging is hard to achieve on real-time systems.

Kernel debugging often involves analyzing log messages to diagnose issues or understand system behavior. Among the formats the kernel uses for printing data, %pK and %pS serve specific purposes when dealing with pointers. Used together in the same message, however, they can introduce unintended information leaks and undermine Kernel Address Space Layout Randomization (KASLR).

This blog post explores the problem of combining %pK and %pS in a single message. We’ll start with an introduction to the problem, look at how these formats work, and discuss a specific scenario, involving kmemleak and module loading, where the issue arises.

Potential Information Leak from Combining %pK and %pS

The kernel uses %pK to mask sensitive pointer addresses based on the privileges of the task on whose behalf the message is formatted, which for a file such as a debugfs entry is the user reading it. This is particularly critical for keeping KASLR offsets secret, which modern system security depends on. On the other hand, %pS resolves a pointer to a symbol, printing the function name and offset, or falling back to the raw address if the symbol cannot be resolved. When %pK and %pS are applied to the same address in one message, the masking provided by %pK is defeated whenever %pS prints that address as a raw pointer. This creates a potential vector for leaking sensitive information, precisely when kallsyms fails to resolve the symbol and %pS defaults to showing the raw address.


Kernel Print Formats

To better understand this issue, it’s essential to look at the various print formats available in the kernel. The file Documentation/core-api/printk-formats.rst in the kernel source tree provides an in-depth guide to these formats.

Pointer Type Formats

The printk function offers a variety of powerful format specifiers for handling pointers, enabling developers to extract and display detailed information about kernel symbols, memory addresses, and resource ranges. Depending on the specifier, pointers passed to printk can be printed as raw addresses (%px), symbolic names with or without offsets (%pS, %ps), kernel or user memory strings (%pks, %pus), physical or DMA addresses (%pa[p], %pad), or even complex structures like resources (%pr) or ranges (%pra). Each of these formats is designed to provide flexibility and precision in debugging and introspection, often requiring integration with kernel features such as kallsyms or security mechanisms.

%pS: Symbolic Representation of Function Pointers

The %pS specifier prints the symbolic name of a function pointer together with its offset and the function size. For example, the kernel documentation shows output of the form versatile_init+0x0/0x110; the lowercase variant, %ps, prints the name alone. This feature relies on kallsyms, a kernel mechanism for resolving symbols, which must be enabled at build time. If kallsyms is disabled, or if the address does not resolve to any known symbol, %pS falls back to printing the raw address. This makes %pS an invaluable tool for debugging, providing human-readable insights into function pointers, especially in backtraces or dynamic kernel environments.
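To make the fallback concrete, here is a minimal userspace sketch of the behavior just described. It is not the kernel's implementation: struct sym_entry, toy_symtab, format_pS_model, and every address in the table are made up for illustration, and only the output shape (name+0xoffset/0xsize, or a raw address) follows the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one symbol-table entry. */
struct sym_entry {
    unsigned long long start;   /* first address covered by the symbol */
    unsigned long long size;    /* size of the function in bytes */
    const char *name;
};

/* Toy symbol table; the addresses and sizes are invented. */
static const struct sym_entry toy_symtab[] = {
    { 0x1000, 0x110, "versatile_init" },
    { 0x2000, 0x040, "do_one_initcall" },
};

/*
 * Userspace model of %pS: print "name+0xoffset/0xsize" when a symbol
 * covers the address, otherwise fall back to the raw address.
 */
static int format_pS_model(char *buf, size_t len, unsigned long long addr)
{
    size_t i;

    assert(buf != NULL && len > 0);
    for (i = 0; i < sizeof(toy_symtab) / sizeof(toy_symtab[0]); i++) {
        const struct sym_entry *s = &toy_symtab[i];

        if (addr >= s->start && addr < s->start + s->size)
            return snprintf(buf, len, "%s+0x%llx/0x%llx",
                            s->name, addr - s->start, s->size);
    }
    /* No symbol covers the address: only the raw value can be printed. */
    return snprintf(buf, len, "0x%llx", addr);
}
```

The point to notice is the last line: when resolution fails, the only thing left to print is the address itself.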

%pK: Security-Conscious Printing of Kernel Pointers

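The %pK specifier addresses the security implications of exposing kernel pointers. Its output depends on the kptr_restrict sysctl parameter. With kptr_restrict set to 0 (the default), the address is hashed before printing, as with plain %p. With kptr_restrict set to 1, the real address is printed only if the task has the CAP_SYSLOG capability and its effective user and group IDs equal its real IDs; otherwise the pointer is replaced with zeros (e.g., 00000000). With kptr_restrict set to 2, the pointer is always replaced with zeros, regardless of privileges. This behavior is essential for protecting kernel memory layout information, particularly against exploits that rely on bypassing kernel address space layout randomization (KASLR). The interaction with the Linux Security Module (LSM) subsystem, such as SELinux, adds another dimension of control. Because the privilege test is a capability check, additional access checks might apply when SELinux is active, ensuring that %pK output is aligned with the system's security policy. For instance, even privileged users may encounter restricted pointer output if SELinux policies enforce strict controls.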
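The decision %pK makes can be sketched in a few lines of userspace C. This is a simplified model, assuming the rules documented for the kptr_restrict sysctl (level 1: real address only with CAP_SYSLOG and matching real/effective IDs; level 2: always zeros). The names struct reader_model and pK_visible_value are hypothetical, level 0 (hashing) is deliberately left out, and LSM involvement is folded into the single has_cap_syslog flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of the task the pointer is formatted for. */
struct reader_model {
    bool has_cap_syslog;   /* models a successful CAP_SYSLOG check */
    bool ids_match;        /* models euid == uid and egid == gid */
};

/*
 * Returns the value a %pK-style specifier would show under this model:
 * the real address, or 0 when the address must be hidden.
 * Only kptr_restrict levels 1 and 2 are modelled.
 */
static unsigned long long pK_visible_value(unsigned long long addr,
                                           int kptr_restrict,
                                           const struct reader_model *r)
{
    assert(r != NULL);
    assert(kptr_restrict == 1 || kptr_restrict == 2);

    /* Level 2 hides the address from everyone. */
    if (kptr_restrict == 2)
        return 0;

    /* Level 1 hides it unless the task passes both checks. */
    if (!r->has_cap_syslog || !r->ids_match)
        return 0;

    return addr;
}
```

Under this model an unprivileged reader at level 1 always gets 0, which is what shows up as a run of zeros in the output.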

Complexity Behind a Simple printk

While printk appears to be a simple logging tool, passing a pointer to it can invoke deeply integrated kernel features. Printing with %pS may involve symbol resolution and handling optional features like kallsyms, while %pK necessitates checks against security configurations and LSM policies. This intricate interplay between debugging utility and security subsystem demonstrates how printk transcends its apparent simplicity to become a critical component of kernel functionality and protection.

Real-World Scenarios: kmemleak and Module Loading

There are practical cases where the combined usage of %pK and %pS manifests. A concrete example is kmemleak, the kernel memory leak detector, which reports memory allocations that are no longer referenced. Consider its debugging messages when kptr_restrict is set to 1. In this configuration, %pK masks kernel addresses for unprivileged readers to prevent leaking sensitive information. However, if %pK and %pS are used together on the same address, the masking becomes ineffective. For instance:

    unreferenced object 0xffff465a8eb90000 (size 2048):
      comm "insmod", pid 129, jiffies 4294953078
      hex dump (first 32 bytes):
        80 c0 5e 8e 5a 46 ff ff 01 00 00 00 62 00 3c 04  ..^.ZF......b.<.
        00 00 00 00 00 00 00 00 1c 02 b9 8e 5a 46 ff ff  ............ZF..
      backtrace (crc 2f5e480d):
        [<0000000000000000>] kmemleak_alloc+0xb4/0xc4
        [<0000000000000000>] __kmem_cache_alloc_node+0x23c/0x270
        [<0000000000000000>] kmalloc_trace+0x3c/0x90
        [<0000000000000000>] 0xffffac0d743b204c
        [<0000000000000000>] do_one_initcall+0x178/0xc90
        [<0000000000000000>] do_init_module+0x1d8/0x63c
        [<0000000000000000>] load_module+0x10a0/0x1670
        [<0000000000000000>] init_module_from_file+0xdc/0x130
        [<0000000000000000>] idempotent_init_module+0x2d8/0x534
        [<0000000000000000>] __arm64_sys_finit_module+0xb4/0x130
        [<0000000000000000>] invoke_syscall.constprop.0+0xd8/0x1d4
        [<0000000000000000>] do_el0_svc+0x158/0x1dc
        [<0000000000000000>] el0_svc+0x54/0x130
        [<0000000000000000>] el0t_64_sync_handler+0x134/0x150
        [<0000000000000000>] el0t_64_sync+0x17c/0x180

In this example, even setting aside the first line, where the object pointer is printed using %08lx, %pK masks the address on every line of the backtrace, but %pS exposes it in the fourth backtrace entry, where the symbol cannot be resolved. Pairing %pK and %pS on the same line can therefore undermine the protection that %pK is meant to provide.
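The effect can be reproduced with a small userspace model of a backtrace line shaped like "[<%pK>] %pS". This is a sketch under stated assumptions, not kmemleak's code: format_backtrace_line and line_exposes_address are hypothetical helpers, the reader_privileged flag stands in for the whole kptr_restrict = 1 check, and the caller supplies the resolved symbol text or NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Model of one backtrace line: a masked address field followed by a
 * symbol field. 'symbol' is the resolved "name+0xoff/0xsize" text, or
 * NULL when resolution failed.
 */
static int format_backtrace_line(char *buf, size_t len,
                                 unsigned long long addr,
                                 const char *symbol,
                                 bool reader_privileged)
{
    /* %pK half: zeroed for an unprivileged reader. */
    unsigned long long shown = reader_privileged ? addr : 0;

    assert(buf != NULL && len > 0);
    if (symbol != NULL)
        return snprintf(buf, len, "[<%016llx>] %s", shown, symbol);

    /* %pS half without a symbol: the raw address goes out unmasked. */
    return snprintf(buf, len, "[<%016llx>] 0x%llx", shown, addr);
}

/* Returns true when the hex digits of 'addr' appear anywhere in 'line'. */
static bool line_exposes_address(const char *line, unsigned long long addr)
{
    char needle[17];   /* up to 16 hex digits plus the terminator */

    snprintf(needle, sizeof(needle), "%llx", addr);
    return strstr(line, needle) != NULL;
}
```

With an unprivileged reader, a line whose symbol resolves hides the address completely, while a line with no symbol contains the full address even though the masked field next to it is all zeros.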

Is This Case Rare?

The line [<0000000000000000>] 0xffffac0d743b204c appears in the log when %pS is unable to resolve an address into a symbol and falls back to printing the raw address instead. This situation is not uncommon. In this case it occurs because the address belongs to the module's initialization function, which allocated the memory and is marked with the __init attribute. Functions marked as __init are automatically discarded once their execution is complete, freeing up memory. As a result, kallsyms cannot resolve the symbol, since it no longer exists in the symbol table, and %pS falls back to printing the raw address.
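A toy model shows why the lookup stops working once initialization is over. Everything here is hypothetical (struct mod_sym, discard_init_symbols, lookup_symbol, and the symbol names and addresses used with them); it only mirrors the sequence described above, in which an address recorded while the init function ran is looked up after the init code has been discarded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one module symbol. */
struct mod_sym {
    unsigned long long start;
    unsigned long long size;
    const char *name;
    bool in_init;    /* true for code marked __init */
    bool present;    /* false once the init code has been discarded */
};

/* Models the release of __init code after the init function returns. */
static void discard_init_symbols(struct mod_sym *syms, size_t n)
{
    size_t i;

    assert(syms != NULL);
    for (i = 0; i < n; i++)
        if (syms[i].in_init)
            syms[i].present = false;
}

/* Returns the name of the symbol covering 'addr', or NULL if none remains. */
static const char *lookup_symbol(const struct mod_sym *syms, size_t n,
                                 unsigned long long addr)
{
    size_t i;

    assert(syms != NULL);
    for (i = 0; i < n; i++)
        if (syms[i].present &&
            addr >= syms[i].start && addr < syms[i].start + syms[i].size)
            return syms[i].name;
    return NULL;
}
```

In this model the saved address itself never changes; what changes is whether anything in the table still claims it, and once nothing does, a symbol-printing format has only the raw value left to show.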

Conclusions

Combining %pK and %pS in kernel messages might seem like a harmless redundancy at first glance. However, this practice can introduce vulnerabilities by inadvertently exposing sensitive kernel information. Understanding the nuances of kernel print formats and their appropriate usage is essential for developers to maintain both system security and effective debugging capabilities.