Friday, January 5, 2024

Unveiling CPU Count in a System: A Dive into Linux Utilities and Functions


Introduction:

Understanding the computational capabilities of a system can be pivotal in optimizing software performance. The quest to determine the number of CPUs within a system often leads to exploration and evaluation of various methods. This pursuit becomes particularly critical when developing low-level tools for Linux, where having an accurate insight into the available CPU count is essential for efficient resource utilization and task allocation. In this article, I delve into different approaches to uncovering this crucial system attribute, aiming to provide insights for developers grappling with similar challenges.

Initial Proposition:

In my pursuit of determining the number of CPUs within a Linux system, my initial approach mirrored what I typically did from the command line: cat /proc/cpuinfo. It’s worth noting that while the command line utility nproc caters to the same query, my familiarity with the proc method predates my acquaintance with nproc. Hence, it was my instinctive choice for the initial implementation of a function to retrieve this vital information programmatically.

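#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Count the "processor" entries in /proc/cpuinfo. */
int nproc(void) {
        int cpu_count = 0;
        char line[100];
        FILE *file;

        file = fopen("/proc/cpuinfo", "r");
        if (file == NULL) {
                perror("Error opening file");
                exit(EXIT_FAILURE);
        }

        /* Each logical CPU has one line starting with "processor". */
        while (fgets(line, sizeof(line), file)) {
                if (strncmp(line, "processor", 9) == 0) {
                        cpu_count++;
                }
        }

        fclose(file);
        return cpu_count;
}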

Does this implementation work? Yes, it does provide the expected CPU count. However, despite its functionality, I find myself dissatisfied with this method for several reasons, which I’ll delve into shortly.

Concerns on the Solution:

While the initial /proc/cpuinfo approach effectively retrieves CPU information, several concerns arise, prompting dissatisfaction with this simplistic method:

  1. Oversimplicity: The code’s apparent simplicity, merely opening a file and parsing statements, raises skepticism. Its straightforward nature seems inadequate for a task as critical as determining CPU count within a system. This simplicity might overlook nuances or potential intricacies within different system environments.
  2. Uncertainty in /proc/cpuinfo Stability: /proc/cpuinfo provides information designed for human readability. This readability doesn’t guarantee immutable structure or content, leaving room for future modifications by kernel developers. While changes aren’t imminent, relying solely on this file may pose a risk of unexpected alterations in its format or content, impacting the function’s reliability.
  3. Dependency on Procfs: The reliance on procfs introduces a constraint, assuming its presence and accessibility, which isn’t universally guaranteed. In specific embedded Linux systems or constrained environments, procfs might not be mounted or accessible. Assumptions like these could jeopardize the function’s portability and reliability across diverse Linux distributions and setups.
  4. Preference for Syscall Stability: A preference emerges for utilizing syscalls, which are known for their stability and are less prone to alteration than file-based interfaces like procfs. Syscalls, by design, tend to remain more consistent across Linux distributions and versions, offering a more robust foundation for critical functionalities like retrieving CPU count.

The sysconf() Function:

Is there a standardized, reliable way to retrieve CPU count programmatically? After a comprehensive search, one function stood out as a prevailing best practice: sysconf(_SC_NPROCESSORS_CONF). The sysconf() function itself is part of the POSIX standard; the _SC_NPROCESSORS_CONF and _SC_NPROCESSORS_ONLN names are a widely supported extension rather than part of POSIX, and the Linux libcs examined below all provide them.

What is sysconf()?

At its core, sysconf() is a libc function, not a system call, that gives access to configurable system variables; how each value is obtained is left to the libc. Specifically, _SC_NPROCESSORS_CONF is a parameter used to query the number of processors configured in the system. Additionally, _SC_NPROCESSORS_ONLN exists, reporting the number of CPUs currently online.

Does it provide a solution for this?

Indeed, sysconf(_SC_NPROCESSORS_CONF) serves as a robust and standardized means to obtain CPU count programmatically. Its utilization ensures compatibility across various Linux distributions and versions, contributing to code portability.
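As a minimal sketch of what calling it looks like, the two wrappers below simply forward to sysconf(). The function names configured_cpus() and online_cpus() are mine, chosen for illustration only.

```c
#include <unistd.h>

/* Number of processors configured in the system, or -1 if the libc
 * cannot tell (sysconf() returns -1 for names it does not support). */
long configured_cpus(void)
{
        return sysconf(_SC_NPROCESSORS_CONF);
}

/* Number of processors currently online, or -1 on failure. */
long online_cpus(void)
{
        return sysconf(_SC_NPROCESSORS_ONLN);
}
```

A caller would typically treat any value below 1 as "unknown" and assume a single CPU.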

How sysconf() is implemented in the guts of libcs:

glibc implementation:

Navigating through the labyrinthine structure of glibc, a colossal library accommodating various systems, one discovers numerous implementations catering to different operating environments. Amidst this complexity, glibc’s handling of sysconf() for Linux systems proves enlightening.

In scrutinizing glibc’s codebase, particularly for Linux-specific implementations, it’s evident that glibc relies on sysfs as its primary source for retrieving CPU count information. The main method employed by glibc to provide this data involves accessing /sys/devices/system/cpu/possible. This file does not hold a plain number: it lists the CPU indices the kernel can address as a range list such as 0-7, from which the count is derived, and it serves as the cornerstone of glibc’s approach.
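To make the sysfs approach concrete, here is a minimal sketch of reading that file. It is my own illustration, not glibc’s code, and it assumes the file holds a kernel CPU list such as 0-7 or 0-3,8-11. The names count_cpu_list() and possible_cpus() are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Count the CPUs named by a kernel CPU list such as "0-7" or "0,2-3".
 * Returns -1 if the text does not look like a CPU list. */
int count_cpu_list(const char *list)
{
        int count = 0;
        const char *p = list;

        while (*p != '\0' && *p != '\n') {
                char *end;
                long first = strtol(p, &end, 10);
                long last = first;

                if (end == p || first < 0)
                        return -1;      /* no number where one was expected */
                p = end;
                if (*p == '-') {        /* a range: first-last */
                        last = strtol(p + 1, &end, 10);
                        if (end == p + 1 || last < first)
                                return -1;
                        p = end;
                }
                count += (int)(last - first + 1);
                if (*p == ',')
                        p++;            /* next element of the list */
                else if (*p != '\0' && *p != '\n')
                        return -1;      /* unexpected character */
        }
        return count > 0 ? count : -1;
}

/* Read /sys/devices/system/cpu/possible and count the CPUs it lists.
 * Returns -1 if sysfs is not mounted or the file cannot be parsed. */
int possible_cpus(void)
{
        char buf[256];
        int count = -1;
        FILE *file = fopen("/sys/devices/system/cpu/possible", "r");

        if (file == NULL)
                return -1;
        if (fgets(buf, sizeof(buf), file) != NULL)
                count = count_cpu_list(buf);
        fclose(file);
        return count;
}
```

Keeping the parser separate from the file access makes it easy to check against hand-written strings, without depending on the machine the code runs on.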

However, acknowledging the potential unpredictability of specialized file system resources, glibc developers have implemented a fallback system in case the primary method fails. The fallback function, get_nprocs_fallback(), comprises alternative methods to retrieve CPU count information:

  1. get_nproc_stat(): This method is close in spirit to my initial implementation but, as its name suggests, in the glibc sources I read it parses /proc/stat rather than /proc/cpuinfo, tallying the lines that begin with cpu followed by a digit. While functional, it remains subject to the same concerns about the stability of procfs file formats and the availability of procfs.

  2. __get_nprocs_sched(): Addressing concerns about specialized filesystem accessibility, this method adopts sched_getaffinity(), essentially a syscall, to deduce the number of CPUs. This syscall-based approach aligns with the desire for a more reliable, system-level means to retrieve CPU count information, bypassing potential dependencies on file system structures.

Despite glibc’s vastness and varied system support, its Linux implementation of sysconf() prioritizes sysfs as the primary resource for CPU count retrieval, supplemented by fallback methods for contingencies or specialized system scenarios.

musl implementation:

Developed with a distinct focus on constrained systems, musl libc is a common choice for environments with stringent resource limitations. Its methodology for retrieving the CPU count reveals a strategy tailored to such environments.

In contrast to glibc’s expansive versatility, musl prioritizes efficiency and minimalism, aligning with the requirements of resource-constrained systems. musl’s approach to CPU count retrieval centers around a singular method, reliant on the sched_getaffinity() syscall to infer CPU count.

This syscall-based approach avoids any dependency on sysfs or procfs being mounted, which fits musl’s ethos of simplicity and efficiency in resource-limited setups. One consequence is worth keeping in mind: sched_getaffinity() reports the CPUs the calling thread is allowed to run on, so the result can be lower than the number of CPUs in the machine when the process has been restricted, for example with taskset or a cpuset cgroup.

uClibc implementation:

uClibc, akin to musl, stands as a libc tailored explicitly for resource-constrained systems. Its implementation strategy for CPU count retrieval reflects a minimalistic approach, aligning with the requirements of such constrained environments.

Given the expectation of multiple methods to cater to contingencies, uClibc’s decision to rely solely on a sysfs-based approach might initially surprise. The implementation scans the sysfs directory /sys/devices/system/cpu and counts the directory entries whose names match the pattern cpu[0-9]+. This single method, with sysfs as the only source for the CPU count, aligns with uClibc’s ethos of minimalism and efficiency.

This reliance on sysfs, while limiting if sysfs isn’t mounted, is consistent with uClibc’s focus on highly constrained systems. The decision potentially avoids the overhead of a syscall-based method, which needs a buffer for the affinity mask whose footprint is not negligible in resource-limited environments.
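The directory scan is short enough to sketch. What follows is my own illustration of the technique rather than the libc’s source; is_cpu_entry() and count_cpu_entries() are hypothetical names. Note that the name check has to be strict, because the same directory also contains entries such as cpufreq and cpuidle.

```c
#include <ctype.h>
#include <dirent.h>
#include <string.h>

/* True if name is "cpu" followed by one or more digits and nothing else,
 * so "cpu0" and "cpu12" match but "cpufreq" and "cpuidle" do not. */
int is_cpu_entry(const char *name)
{
        const char *p;

        if (strncmp(name, "cpu", 3) != 0 || name[3] == '\0')
                return 0;
        for (p = name + 3; *p != '\0'; p++) {
                if (!isdigit((unsigned char)*p))
                        return 0;
        }
        return 1;
}

/* Count cpuN entries in dir, or return -1 if the directory cannot be
 * opened (for example because sysfs is not mounted). */
int count_cpu_entries(const char *dir)
{
        struct dirent *entry;
        int count = 0;
        DIR *d = opendir(dir);

        if (d == NULL)
                return -1;
        while ((entry = readdir(d)) != NULL) {
                if (is_cpu_entry(entry->d_name))
                        count++;
        }
        closedir(d);
        return count;
}
```

On a typical system the call would be count_cpu_entries("/sys/devices/system/cpu"); passing the path as a parameter keeps the function usable against a test directory.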

nproc Utility:

Examining the nproc utility, it’s pertinent to delve into two notable implementations: Coreutils and Busybox. Despite the varying nature of these utilities, the expectation leans toward both implementations relying on support from the underlying libc to provide CPU count information.

Coreutils Implementation:

Coreutils, a fundamental suite of Unix utilities including nproc, typically relies on libc support for system-specific information retrieval, so I expected nproc to call a libc function or a syscall to fetch the CPU count. Upon inspecting the nproc codebase, it becomes evident that the utility itself does not house specific code for CPU count retrieval. Instead, it first consults the environment variables used by OpenMP, OMP_NUM_THREADS and OMP_THREAD_LIMIT, which the user may set to override the count. In the absence of these variables, nproc falls back on a function named num_processors_ignoring_omp(), which is implemented in gnulib.

The gnulib implementation might initially appear complex, owing to its handling of the task across diverse platforms, including Windows. However, restricting the investigation to Linux unveils an implementation primarily reliant on the sched_getaffinity() syscall. Remarkably, this path does not go through the libc’s sysconf() at all: it needs only the libc’s thin wrapper around the syscall, offering an alternative method for CPU count retrieval.

By leaning on sched_getaffinity(), the gnulib implementation for Linux within nproc exemplifies a straightforward syscall-based approach for CPU count determination, suggesting a certain level of autonomy from standard libc functionalities.
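The environment override is easy to picture in code. The sketch below is a simplification of my own and not gnulib’s implementation: it assumes the variables involved are the standard OpenMP ones, OMP_NUM_THREADS and OMP_THREAD_LIMIT, that the first replaces the detected count and the second caps the result, and it accepts only plain positive numbers. The names parse_thread_count() and apply_omp_overrides() are hypothetical.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a positive thread count from an environment value.
 * Returns 0 if the value is missing or not a plain positive number. */
unsigned long parse_thread_count(const char *value)
{
        char *end;
        unsigned long n;

        if (value == NULL || !isdigit((unsigned char)*value))
                return 0;
        errno = 0;
        n = strtoul(value, &end, 10);
        if (errno != 0 || *end != '\0')
                return 0;
        return n;
}

/* Apply the environment overrides to a count detected by other means:
 * OMP_NUM_THREADS replaces it, OMP_THREAD_LIMIT caps the result. */
unsigned long apply_omp_overrides(unsigned long detected)
{
        unsigned long threads = parse_thread_count(getenv("OMP_NUM_THREADS"));
        unsigned long limit = parse_thread_count(getenv("OMP_THREAD_LIMIT"));
        unsigned long result = threads != 0 ? threads : detected;

        if (limit != 0 && result > limit)
                result = limit;
        return result;
}
```

The detected count is passed in as a parameter, so the override logic stays independent of whichever detection method is used.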

Busybox Implementation:

Busybox, renowned for its compactness and versatility, reimagines system utilities, often independent of direct dependencies on standard libraries like libc. The implementation of CPU count retrieval within Busybox resonates with this independent ethos, showcasing a self-contained logic mirroring methodologies akin to libc implementations.

Busybox’s CPU count retrieval method primarily involves scanning the sysfs directory /sys/devices/system/cpu, specifically enumerating directory items containing the substring cpu. However, this method assumes prominence only when a configuration symbol is enabled, signifying Busybox’s adaptability based on configuration choices.

The surprising element lies in Busybox’s ordering of preferences: when that configuration symbol is enabled, sysfs-based enumeration is designated the primary method and the sched_getaffinity() syscall the alternative; when it is not, sched_getaffinity() is the only method available. This differs from what I expected and reflects Busybox’s own perspective on reliability and system adaptability.

In other words, although the sysfs enumeration resembles the libc-based implementations, Busybox’s developers treat the syscall-based approach as the dependable one: an alternative in scenarios permitting multiple methods, and the preferred method when only one can be chosen.

Consideration about Portability:

Having explored various existing methods for accessing CPU count data, it’s apparent that despite the availability of libc-provided methods, many userspace utilities opt to implement their own functions for this purpose. Reflecting on this, in a new implementation, the recommendation leans toward leveraging the libc-provided method, specifically sysconf(_SC_NPROCESSORS_CONF). This approach ensures compatibility and adherence to standard interfaces across diverse Linux environments.

However, in specific scenarios where unique constraints exist, such as employing uClibc in an environment without accessible or mountable special filesystems, a custom implementation for CPU count retrieval may be necessary. In such cases, I would suggest basing the implementation on the sched_getaffinity() syscall, given its usability across varied contexts.

| Method | Description | Usable | Memory footprint |
|---|---|---|---|
| sysfs file /sys/devices/system/cpu/possible | The file is read; it holds the list of possible CPUs (for example 0-7), from which the count is derived. | When sysfs is mounted | Lowest |
| sysfs directory /sys/devices/system/cpu | The directory is scanned for cpu[0-9]+ subdirectories; counting them gives the number. | When sysfs is mounted | Low |
| procfs file /proc/cpuinfo | The file is read and parsed; counting the processor entries gives the number. | When procfs is mounted | Medium |
| sched_getaffinity() | The scheduler affinity mask is copied to userspace and its set bits are counted. | Always | Possibly high |

sched_getaffinity() Sample Implementation:

To end this article I want to provide a sched_getaffinity() based sample implementation.

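#define _GNU_SOURCE
#include <limits.h>
#include <sched.h>
#include <stddef.h>

#define MAX_CPUS 2048
#define BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))
#define MASK_WORDS (MAX_CPUS / BITS_PER_LONG)

/* Count the CPUs the calling thread may run on; returns 0 on failure. */
int nproc(void){
        /* One bit per CPU, zeroed so unused words never add to the count. */
        unsigned long mask[MASK_WORDS] = {0};
        unsigned long m;
        int count = 0;
        size_t i;

        if (sched_getaffinity(0, sizeof(mask), (cpu_set_t *)mask) == 0) {
                for (i = 0; i < MASK_WORDS; i++) {
                        m = mask[i];
                        /* Count the set bits of this word. */
                        while (m) {
                                if (m & 1) count++;
                                m >>= 1;
                        }
                }
        }
        return count;
}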
