Introduction:
Knowing how many CPUs a system has can be pivotal when optimizing software performance. This becomes particularly important when developing low-level tools for Linux, where an accurate CPU count is essential for efficient resource utilization and task allocation. In this article, I explore different approaches to determining this system attribute, aiming to provide insights for developers grappling with similar challenges.
Initial Proposition:
In my pursuit of determining the number of CPUs within a Linux system, my initial approach mirrored what I typically did from the command line: `cat /proc/cpuinfo`. It’s worth noting that while the command line utility `nproc` caters to the same query, my familiarity with the `proc` method predates my acquaintance with `nproc`. Hence, it was my instinctive choice for the initial implementation of a function to retrieve this vital information programmatically.
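    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Count the "processor" entries in /proc/cpuinfo. */
    int nproc(void) {
        int cpu_count = 0;
        int at_line_start = 1;
        char line[100];
        FILE *file;

        file = fopen("/proc/cpuinfo", "r");
        if (file == NULL) {
            perror("Error opening file");
            exit(EXIT_FAILURE);
        }
        while (fgets(line, sizeof(line), file)) {
            /* Long lines (e.g. "flags") arrive in several chunks;
               only a chunk that begins a real line may match. */
            if (at_line_start && strncmp(line, "processor", 9) == 0) {
                cpu_count++;
            }
            at_line_start = (strchr(line, '\n') != NULL);
        }
        fclose(file);
        return cpu_count;
    }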
Does this implementation work? Yes, it does provide the expected CPU count. However, despite its functionality, I find myself dissatisfied with this method for several reasons, which I’ll delve into shortly.
Concerns on the Solution:
While the initial `/proc/cpuinfo` approach effectively retrieves CPU information, several concerns arise, prompting dissatisfaction with this simplistic method:
- Oversimplicity: The code’s apparent simplicity, merely opening a file and parsing statements, raises skepticism. Its straightforward nature seems inadequate for a task as critical as determining CPU count within a system. This simplicity might overlook nuances or potential intricacies within different system environments.
- Uncertainty in `/proc/cpuinfo` Stability: `/proc/cpuinfo` provides information designed for human readability. This readability doesn’t guarantee immutable structure or content, leaving room for future modifications by kernel developers. While changes aren’t imminent, relying solely on this file may pose a risk of unexpected alterations in its format or content, impacting the function’s reliability.
- Dependency on Procfs: The reliance on `procfs` introduces a constraint, assuming its presence and accessibility, which isn’t universally guaranteed. In specific embedded Linux systems or constrained environments, `procfs` might not be mounted or accessible. Assumptions like these could jeopardize the function’s portability and reliability across diverse Linux distributions and setups.
- Preference for Syscall Stability: A preference emerges for utilizing syscalls, known for their stability and less prone to alterations compared to file-based interfaces like `procfs`. Syscalls, by design, tend to remain more consistent across Linux distributions and versions, offering a more robust foundation for critical functionalities like retrieving CPU count.
The `sysconf()` Function:
Is there a standardized, reliable way to retrieve the CPU count programmatically? After a comprehensive search, one function stood out as the prevailing best practice: `sysconf(_SC_NPROCESSORS_CONF)`. The `sysconf()` function itself is specified by POSIX, while the `_SC_NPROCESSORS_CONF` parameter began as a widely supported extension that the Linux libcs provide. Together they offer a promising solution for obtaining the number of processors in a system.
What is `sysconf()`?
At its core, `sysconf()` is a libc function, not a system call in its own right, that gives access to configurable system variables. Specifically, `_SC_NPROCESSORS_CONF` is a parameter used to query the number of processors configured in the system. Additionally, `_SC_NPROCESSORS_ONLN` exists, reporting the number of CPUs that are currently online.
Does it provide a solution for this?
Indeed, `sysconf(_SC_NPROCESSORS_CONF)` serves as a robust and widely supported means to obtain the CPU count programmatically. Because the libc hides how the value is actually obtained, using it keeps code portable across Linux distributions and versions.
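As a minimal sketch of the call described above, the following wraps both parameters in small functions. The names `configured_cpus`, `online_cpus` and `print_cpu_counts` are made up for this example; only `sysconf()` and the two `_SC_NPROCESSORS_*` constants come from the libc.

```c
#include <stdio.h>
#include <unistd.h>

/* Processors configured in the system, or -1 if the libc cannot tell. */
long configured_cpus(void) {
    return sysconf(_SC_NPROCESSORS_CONF);
}

/* Processors currently online, or -1 if the libc cannot tell. */
long online_cpus(void) {
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Hypothetical helper: print both values side by side. */
void print_cpu_counts(void) {
    printf("configured: %ld, online: %ld\n",
           configured_cpus(), online_cpus());
}
```

The two values can differ, for instance when CPUs have been taken offline, so which one to use depends on whether the tool cares about what exists or about what can run work right now.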
How `sysconf()` is implemented in the guts of libcs:
glibc implementation:
Navigating glibc, a colossal library that supports many systems, one finds numerous implementations catering to different operating environments. Amid this complexity, glibc’s handling of `sysconf()` for Linux systems proves enlightening.
In glibc’s Linux-specific code, `sysfs` is the primary source for the CPU count. The main method reads `/sys/devices/system/cpu/possible`. This file lists the possible CPUs as a range (for example `0-7`), from which the count can be derived, and it serves as the cornerstone of glibc’s approach.
However, acknowledging that specialized file systems may be unavailable, glibc developers have implemented a fallback in case the primary method fails. The fallback function, `get_nprocs_fallback()`, comprises alternative methods to retrieve the CPU count:
- `get_nproc_stat()`: This method is close in spirit to my initial implementation. As its name suggests, it parses `/proc/stat` rather than `/proc/cpuinfo` and tallies the lines that begin with `cpu` followed by a digit. While functional, it remains subject to the same concerns about format stability and `procfs` availability.
- `__get_nprocs_sched()`: Addressing concerns about specialized filesystem accessibility, this method uses `sched_getaffinity()`, essentially a syscall, to deduce the number of CPUs. This syscall-based approach aligns with the desire for a more reliable, system-level means to retrieve the CPU count, bypassing potential dependencies on file system structures.
Despite glibc’s vastness and varied system support, its Linux implementation of sysconf() prioritizes sysfs as the primary resource for CPU count retrieval, supplemented by fallback methods for contingencies or specialized system scenarios.
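To make the sysfs method more concrete, here is a hedged sketch of how a count can be derived from a kernel CPU list such as `0-7` or `0-3,8-11`. It is not glibc’s code; the function names `count_cpus_in_list` and `count_cpus_from_file` are invented for this example, and the parser assumes the comma-separated range format.

```c
#include <stdio.h>
#include <stdlib.h>

/* Count CPUs in a list such as "0-3,8-11" or "0".
   Returns -1 on empty or malformed input. */
int count_cpus_in_list(const char *list) {
    int count = 0;
    const char *p = list;

    while (*p != '\0' && *p != '\n') {
        char *end;
        long first = strtol(p, &end, 10);
        if (end == p || first < 0) return -1;
        long last = first;
        p = end;
        if (*p == '-') {            /* a range "first-last" */
            p++;
            last = strtol(p, &end, 10);
            if (end == p || last < first) return -1;
            p = end;
        }
        count += (int)(last - first + 1);
        if (*p == ',') p++;
        else if (*p != '\0' && *p != '\n') return -1;
    }
    return count > 0 ? count : -1;
}

/* Read the first line of path and count the CPUs it lists.
   The path is a parameter so that any list file can be used. */
int count_cpus_from_file(const char *path) {
    char buf[256];
    FILE *f = fopen(path, "r");
    if (f == NULL) return -1;
    char *ok = fgets(buf, sizeof(buf), f);
    fclose(f);
    return ok ? count_cpus_in_list(buf) : -1;
}
```

Calling `count_cpus_from_file("/sys/devices/system/cpu/possible")` would then give the count when sysfs is mounted, and -1 otherwise, which is the point where a fallback would take over.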
musl implementation:
Developed with a distinct focus on constrained systems, musl libc is a common choice for environments with stringent resource limitations, and its method for retrieving the CPU count is tailored to them.
In contrast to glibc’s expansive versatility, musl prioritizes efficiency and minimalism, in line with the requirements of resource-constrained systems. Its approach to CPU count retrieval centers on a single method, which relies on the `sched_getaffinity()` syscall to infer the CPU count.
musl does not consult sysfs or any other file system structure; it relies solely on the syscall for this critical system attribute. This fits musl’s ethos of simplicity and efficiency and avoids any dependency on mounted file systems.
This single reliance on `sched_getaffinity()` underscores musl’s commitment to efficiency and simplicity, reflecting its status as a libc tailored to resource-limited setups.
uClibc implementation:
uClibc, akin to musl, is a libc tailored explicitly for resource-constrained systems. Its strategy for CPU count retrieval is minimalistic, in line with the requirements of such environments.
One might expect multiple methods to cover contingencies, so uClibc’s decision to rely solely on a `sysfs`-based approach may initially surprise. The implementation scans the `sysfs` directory `/sys/devices/system/cpu` and counts the directory entries whose names match the pattern `cpu[0-9]`. This single method, with sysfs as the only source, aligns with uClibc’s ethos of minimalism and efficiency.
This reliance on sysfs is limiting if `sysfs` isn’t mounted, but it fits uClibc’s focus on highly constrained systems. The decision potentially avoids the overhead of a syscall-based method, which would require a supporting structure with a non-negligible footprint, ill-suited to resource-limited environments.
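The directory scan described above can be sketched roughly as follows. This is an illustration rather than uClibc’s actual code: the helper names `is_cpu_dir_name` and `count_cpu_dirs` are invented, and the strict rule that everything after `cpu` must be a digit is my assumption.

```c
#include <ctype.h>
#include <dirent.h>
#include <string.h>

/* Return 1 if name is "cpu" followed only by digits (cpu0, cpu12). */
int is_cpu_dir_name(const char *name) {
    if (strncmp(name, "cpu", 3) != 0 || name[3] == '\0') return 0;
    for (const char *p = name + 3; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p)) return 0;
    }
    return 1;
}

/* Count matching entries in dir; returns -1 if dir cannot be opened. */
int count_cpu_dirs(const char *dir) {
    DIR *d = opendir(dir);
    if (d == NULL) return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(d)) != NULL) {
        if (is_cpu_dir_name(entry->d_name)) count++;
    }
    closedir(d);
    return count;
}
```

With sysfs mounted, `count_cpu_dirs("/sys/devices/system/cpu")` skips siblings such as `cpufreq` and `cpuidle` and counts only the per-CPU directories.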
`nproc` Utility:
Turning to the `nproc` utility, two implementations are worth examining: Coreutils and Busybox. Although the two projects differ in nature, one would expect both to rely on the underlying libc for the CPU count.
Coreutils Implementation:
Coreutils, a fundamental suite of Unix utilities that includes `nproc`, typically relies on libc for system-specific information, so one would expect its `nproc` to use system calls or libc functions to fetch the CPU count. Inspecting the `nproc` source shows that the utility itself contains no code for CPU count retrieval. Instead, it first honors certain environment variables associated with OpenMP. In the absence of these variables, `nproc` falls back on a function named `num_processors_ignoring_omp()`, which is implemented in gnulib.
The gnulib implementation might initially appear complex, because it handles the task across diverse platforms, including Windows. Restricting the investigation to Linux, however, reveals an implementation that relies primarily on `sched_getaffinity()`. Remarkably, this implementation appears not to depend on the libc’s own CPU counting logic, offering an alternative method for CPU count retrieval.
By leaning on `sched_getaffinity()`, the gnulib code used by `nproc` on Linux exemplifies a straightforward syscall-based approach and suggests a certain level of autonomy from standard libc functionality.
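The environment override can be illustrated with a deliberately simplified sketch. It assumes the variables in question are `OMP_NUM_THREADS` and `OMP_THREAD_LIMIT`, which the coreutils documentation lists for `nproc`. The precedence rule used here (the first replaces the detected count, the second caps it) is an assumption for illustration and may not match gnulib’s exact behaviour; the function names are invented.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a positive decimal integer; returns 0 if unset or invalid. */
unsigned long parse_positive(const char *s) {
    if (s == NULL || !isdigit((unsigned char)*s)) return 0;
    char *end;
    errno = 0;
    unsigned long value = strtoul(s, &end, 10);
    if (errno != 0 || *end != '\0') return 0;
    return value;
}

/* Assumed, simplified rule (not gnulib's exact logic):
   OMP_NUM_THREADS replaces the detected count,
   OMP_THREAD_LIMIT caps whatever value results. */
unsigned long cpu_count_with_override(unsigned long detected) {
    unsigned long threads = parse_positive(getenv("OMP_NUM_THREADS"));
    unsigned long limit = parse_positive(getenv("OMP_THREAD_LIMIT"));
    unsigned long n = threads != 0 ? threads : detected;
    if (limit != 0 && n > limit) n = limit;
    return n;
}
```

The `detected` argument stands for whatever the syscall-based path reports, so the override logic stays separate from the detection itself.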
Busybox Implementation:
Busybox, renowned for its compactness and versatility, reimplements the system utilities while keeping its dependencies to a minimum. Its CPU count retrieval follows this independent ethos: the logic is self-contained and mirrors the methods seen in the libc implementations.
One of Busybox’s methods scans the `sysfs` directory `/sys/devices/system/cpu` and counts the directory entries containing the substring `cpu`. However, this method is compiled in only when a configuration symbol is enabled, which reflects Busybox’s adaptability to configuration choices.
The surprising element is how Busybox ranks the two methods. When the symbol is enabled, sysfs-based enumeration is designated the primary method and the `sched_getaffinity()` syscall serves as the alternative. When only one method can be chosen, however, `sched_getaffinity()` is the one that remains. This differs from conventional expectations and shows Busybox’s own perspective on reliability and system adaptability.
Although the `sysfs` enumeration resembles the libc-based implementations, Busybox’s developers treat the syscall-based approach as secondary yet dependable: `sched_getaffinity()` is the alternative when several methods are available, and the preferred method when only one can be built.
Consideration about Portability:
Having explored various existing methods for accessing CPU count data, it’s apparent that despite the availability of libc-provided methods, many userspace utilities implement their own functions for this purpose. For a new implementation, my recommendation nonetheless leans toward the libc-provided method, specifically `sysconf(_SC_NPROCESSORS_CONF)`. This approach ensures compatibility and adherence to standard interfaces across diverse Linux environments.
However, in specific scenarios with unique constraints, such as using uClibc in an environment where the special filesystems cannot be mounted or accessed, a custom implementation for CPU count retrieval may be necessary. In such cases, I suggest basing the implementation on the `sched_getaffinity()` syscall, given its usability across varied contexts.
Method | Description | Usable | Memory footprint |
---|---|---|---|
sysfs file /sys/devices/system/cpu/possible | The file is read; it holds the range of possible CPUs (for example 0-7), from which the count is derived. | When sysfs is mounted | Lowest |
sysfs directory /sys/devices/system/cpu | The directory is scanned for cpu[0-9]+ subdirectories. Counting the matching entries gives the number. | When sysfs is mounted | Low |
procfs file parsing /proc/cpuinfo | The file is read and its contents are parsed. Counting the CPU occurrences gives the number. | When procfs is mounted | Medium |
sched_getaffinity() | The calling thread’s affinity mask is copied to userspace and its set bits are counted. | Always | Possibly high |
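The recommendation above can be sketched as a small fallback chain: ask the libc first, and only fall back to the affinity mask when that fails. This is one possible arrangement under stated assumptions, not a reference implementation. The names `cpus_from_affinity` and `portable_nproc` and the `MASK_BITS` bound are invented, and the fallback uses the raw syscall through `syscall()` so that it does not depend on the libc wrapper.

```c
#include <sys/syscall.h>
#include <unistd.h>

#define MASK_BITS 8192  /* assumed upper bound on CPU ids */
#define BITS_PER_WORD (8 * sizeof(unsigned long))

/* CPUs in the calling thread's affinity mask, or -1 on failure. */
int cpus_from_affinity(void) {
    unsigned long mask[MASK_BITS / BITS_PER_WORD] = {0};

    /* The raw syscall returns the number of bytes written to mask. */
    long bytes = syscall(SYS_sched_getaffinity, 0, sizeof(mask), mask);
    if (bytes < 0) return -1;

    int count = 0;
    for (size_t i = 0; i < (size_t)bytes / sizeof(unsigned long); i++) {
        /* Clear the lowest set bit until the word is empty. */
        for (unsigned long m = mask[i]; m != 0; m &= m - 1) count++;
    }
    return count;
}

/* Prefer sysconf(); fall back to the affinity mask; finally assume 1. */
int portable_nproc(void) {
    long n = sysconf(_SC_NPROCESSORS_CONF);
    if (n >= 1) return (int)n;

    int from_mask = cpus_from_affinity();
    return from_mask >= 1 ? from_mask : 1;
}
```

Note that the two sources answer slightly different questions: the affinity mask only covers the CPUs the calling thread is allowed to run on, so the fallback may report fewer CPUs than are configured.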
`sched_getaffinity()` sample implementation:
To end this article, I want to provide a `sched_getaffinity()`-based sample implementation.
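    #define _GNU_SOURCE
    #include <sched.h>
    #include <stddef.h>

    #define MAX_CPUS 2048
    #define BITS_PER_LONG (8 * sizeof(unsigned long))

    /* Count the CPUs in the calling thread's affinity mask (0 on failure). */
    int nproc(void) {
        /* One bit per CPU: MAX_CPUS bits, zeroed so unused bits never count. */
        unsigned long mask[MAX_CPUS / BITS_PER_LONG] = {0};
        unsigned long m;
        int count = 0;
        size_t i;

        if (sched_getaffinity(0, sizeof(mask), (cpu_set_t *)mask) == 0) {
            for (i = 0; i < sizeof(mask) / sizeof(mask[0]); i++) {
                m = mask[i];
                while (m) {
                    if (m & 1) count++;
                    m >>= 1;
                }
            }
        }
        return count;
    }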