Friday, April 28, 2023

Is BIOS Self-Test Still Effective for Modern Motherboards?


Introduction

Out of nowhere, my system has stopped functioning. I'm not entirely sure why, but both the Gigabyte B450M DS3H V2 and the AMD Ryzen 5 5500, which had been working fine for a few days, are now experiencing issues. I want to mention that I used this motherboard in an unusual way - as a worker for compilations. As a kernel engineer, I have a lot of compiling to do in my daily work, so I decided to purchase a reasonably powerful system to speed up the process. Since a full computer with keyboard and monitor was unnecessary, I opted for just the motherboard. And, being a bit of a nerd, I decided to leave it exposed. To manage the setup, I created a self-made device that acted on ATX controls and interfaced with the serial port. This way, I could bring up the system remotely without leaving my chair. Although I took all possible precautions to make the control device safe, I cannot entirely rule out the possibility that it may be the root cause of the problem. However, I find this to be highly unlikely. In this post, I would like to delve deeper into the malfunctioning of this motherboard and hopefully determine if the cause is my device or something else.

Power-on self-test (POST)
It's been a while since I've had to troubleshoot this kind of problem. In the past, I would have reached for a POST card. However, it has been over 20 years since I last used one, and the POST card I made in my high school electronics lab is no longer usable. The real question is, given all the technological changes that have occurred, is POST still implemented in modern BIOSes?
And if so, how can it be used when the CPU bus is no longer exposed?
BTW, what is the POST?
Motherboard POST, or Power-On Self-Test, is a diagnostic mechanism that has been used for decades to detect hardware issues during the boot-up process of a computer.
When a computer is turned on, the BIOS runs a series of tests on the motherboard and connected components to ensure that they are functioning correctly.
POST is effective because it runs before the graphics and text systems are initialized, allowing the BIOS to emit information about hardware faults that prevent the system from booting.
Over the years, the concept of POST has remained largely the same, but the underlying technologies and buses that connect components in x86-based mainboards have changed significantly.
In the early days of computing, motherboards were relatively simple and relied on a standardized bus, the ISA bus.
The ISA bus consisted of a set of parallel signal lines that allowed data, addresses, and control signals to be transmitted between components.
One of the key features of the ISA bus was its simplicity. Unlike more modern buses, such as PCIe, the ISA bus did not use complex packet-based communication or advanced error detection mechanisms. Instead, it relied on a straightforward master-slave architecture, where the CPU was the master and all other devices on the bus were slaves.
This simplicity made it relatively easy to intercept signals from the bus and expose them for diagnostic purposes.
For example, it was common to use a logic analyzer or oscilloscope to capture signals on the ISA bus and analyze them to diagnose hardware faults.
Additionally, the ISA bus pinout was well-defined and standardized, making it easier for technicians to create diagnostic tools and adapters that could interface with the bus using discrete components such as logic gates.
Here is the ISA bus pinout:
Pin   Name         Dir            Description
A1    /I/O CH CK   INPUT          I/O channel check; active low=parity error
A2    D7           BIDIRECTIONAL  Data bit 7
A3    D6           BIDIRECTIONAL  Data bit 6
A4    D5           BIDIRECTIONAL  Data bit 5
A5    D4           BIDIRECTIONAL  Data bit 4
A6    D3           BIDIRECTIONAL  Data bit 3
A7    D2           BIDIRECTIONAL  Data bit 2
A8    D1           BIDIRECTIONAL  Data bit 1
A9    D0           BIDIRECTIONAL  Data bit 0
A10   I/O CH RDY   INPUT          I/O Channel ready, pulled low to lengthen memory cycles
A11   AEN          OUTPUT         Address enable; active high when DMA controls bus
A12   A19          OUTPUT         Address bit 19
A13   A18          OUTPUT         Address bit 18
A14   A17          OUTPUT         Address bit 17
A15   A16          OUTPUT         Address bit 16
A16   A15          OUTPUT         Address bit 15
A17   A14          OUTPUT         Address bit 14
A18   A13          OUTPUT         Address bit 13
A19   A12          OUTPUT         Address bit 12
A20   A11          OUTPUT         Address bit 11
A21   A10          OUTPUT         Address bit 10
A22   A9           OUTPUT         Address bit 9
A23   A8           OUTPUT         Address bit 8
A24   A7           OUTPUT         Address bit 7
A25   A6           OUTPUT         Address bit 6
A26   A5           OUTPUT         Address bit 5
A27   A4           OUTPUT         Address bit 4
A28   A3           OUTPUT         Address bit 3
A29   A2           OUTPUT         Address bit 2
A30   A1           OUTPUT         Address bit 1
A31   A0           OUTPUT         Address bit 0
B1    GND                         Ground
B2    RESET        OUTPUT         Active high to reset or initialize system logic
B3    +5V                         +5 VDC
B4    IRQ2         INPUT          Interrupt Request 2
B5    -5VDC                       -5 VDC
B6    DRQ2         INPUT          DMA Request 2
B7    -12VDC                      -12 VDC
B8    /NOWS        INPUT          No WaitState
B9    +12VDC                      +12 VDC
B10   GND                         Ground
B11   /SMEMW       OUTPUT         System Memory Write
B12   /SMEMR       OUTPUT         System Memory Read
B13   /IOW         OUTPUT         I/O Write
B14   /IOR         OUTPUT         I/O Read
B15   /DACK3       OUTPUT         DMA Acknowledge 3
B16   DRQ3         INPUT          DMA Request 3
B17   /DACK1       OUTPUT         DMA Acknowledge 1
B18   DRQ1         INPUT          DMA Request 1
B19   /REFRESH     BIDIRECTIONAL  Refresh
B20   CLOCK        OUTPUT         System Clock (120-125 ns period, 8-8.33 MHz, 50% duty cycle)
B21   IRQ7         INPUT          Interrupt Request 7
B22   IRQ6         INPUT          Interrupt Request 6
B23   IRQ5         INPUT          Interrupt Request 5
B24   IRQ4         INPUT          Interrupt Request 4
B25   IRQ3         INPUT          Interrupt Request 3
B26   /DACK2       OUTPUT         DMA Acknowledge 2
B27   T/C          OUTPUT         Terminal count; pulses high when DMA term. count reached
B28   ALE          OUTPUT         Address Latch Enable
B29   +5V                         +5 VDC
B30   OSC          OUTPUT         High-speed Clock (70 ns, 14.31818 MHz, 50% duty cycle)
B31   GND                         Ground

C1    SBHE         BIDIRECTIONAL  System bus high enable (data available on SD8-15)
C2    LA23         BIDIRECTIONAL  Address bit 23
C3    LA22         BIDIRECTIONAL  Address bit 22
C4    LA21         BIDIRECTIONAL  Address bit 21
C5    LA20         BIDIRECTIONAL  Address bit 20
C6    LA19         BIDIRECTIONAL  Address bit 19
C7    LA18         BIDIRECTIONAL  Address bit 18
C8    LA17         BIDIRECTIONAL  Address bit 17
C9    /MEMR        BIDIRECTIONAL  Memory Read (Active on all memory read cycles)
C10   /MEMW        BIDIRECTIONAL  Memory Write (Active on all memory write cycles)
C11   SD08         BIDIRECTIONAL  Data bit 8
C12   SD09         BIDIRECTIONAL  Data bit 9
C13   SD10         BIDIRECTIONAL  Data bit 10
C14   SD11         BIDIRECTIONAL  Data bit 11
C15   SD12         BIDIRECTIONAL  Data bit 12
C16   SD13         BIDIRECTIONAL  Data bit 13
C17   SD14         BIDIRECTIONAL  Data bit 14
C18   SD15         BIDIRECTIONAL  Data bit 15
D1    /MEMCS16     INPUT          Memory 16-bit chip select (1 wait, 16-bit memory cycle)
D2    /IOCS16      INPUT          I/O 16-bit chip select (1 wait, 16-bit I/O cycle)
D3    IRQ10        INPUT          Interrupt Request 10
D4    IRQ11        INPUT          Interrupt Request 11
D5    IRQ12        INPUT          Interrupt Request 12
D6    IRQ15        INPUT          Interrupt Request 15
D7    IRQ14        INPUT          Interrupt Request 14
D8    /DACK0       OUTPUT         DMA Acknowledge 0
D9    DRQ0         INPUT          DMA Request 0
D10   /DACK5       OUTPUT         DMA Acknowledge 5
D11   DRQ5         INPUT          DMA Request 5
D12   /DACK6       OUTPUT         DMA Acknowledge 6
D13   DRQ6         INPUT          DMA Request 6
D14   /DACK7       OUTPUT         DMA Acknowledge 7
D15   DRQ7         INPUT          DMA Request 7
D16   +5V                         +5 VDC
D17   /MASTER      INPUT          Used with DRQ to gain control of system
D18   GND                         Ground
In the early days of AT computers, when the ISA bus was the standard, it was relatively simple to intercept signals from the bus and expose them. This made it easy to create diagnostic devices such as the POST card, which could catch data written to an I/O port and display it on an LED readout.
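The decoding a POST card performs can be sketched in software. This is a simplified model rather than a hardware design: `BusCycle` and `PostCard` are hypothetical names invented for this illustration, and the model assumes one sample per bus cycle using the AEN, /IOW, address and data signals from the pinout above.

```python
from dataclasses import dataclass

POST_PORT = 0x80  # I/O port traditionally used for POST codes


@dataclass
class BusCycle:
    """One sampled ISA bus cycle (hypothetical, simplified model)."""
    address: int  # value on the address lines
    data: int     # value on D0-D7
    iow_n: int    # /IOW, active low: 0 means an I/O write is in progress
    aen: int      # AEN, high while DMA controls the bus


class PostCard:
    """Latches the last byte written to the POST port, like the LED readout."""

    def __init__(self, port: int = POST_PORT) -> None:
        self.port = port
        self.latched = None  # nothing captured yet

    def sample(self, cycle: BusCycle) -> None:
        # Only CPU-driven I/O writes count: /IOW asserted and AEN low.
        is_cpu_io_write = cycle.iow_n == 0 and cycle.aen == 0
        if is_cpu_io_write and cycle.address == self.port:
            self.latched = cycle.data & 0xFF

    def display(self) -> str:
        return "--" if self.latched is None else f"{self.latched:02X}"


card = PostCard()
card.sample(BusCycle(address=0x3F8, data=0x41, iow_n=0, aen=0))  # other port, ignored
card.sample(BusCycle(address=0x80, data=0x56, iow_n=0, aen=0))   # POST code write
print(card.display())  # prints 56
```

A real card does the same thing with an address comparator and an 8-bit latch driving two seven-segment displays.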

Is POST still implemented in BIOSes?
Before proceeding with any further investigations, it's important to confirm whether POST is still implemented in modern BIOSes. To do this, the first step would be to dump a BIOS and analyze it to see if POST is present.

sudo dd if=/dev/mem of=bios.bin bs=1M count=1
This command may fail because of security measures in modern Linux kernels. For instance, if the system was booted with Secure Boot enabled, the kernel's lockdown Linux Security Module (LSM) is typically active and blocks reads from /dev/mem. If the command does succeed, it produces a file containing the first megabyte of physical memory. BIOS mapping is quite intricate, but a good starting point is the reset vector, located at f000:fff0 (physical address 0xFFFF0, 16 bytes below the 1 MB boundary). This address dates back to the old i8086, the precursor of the entire x86 family, and is still honored by modern processors such as the AMD Ryzen over 40 years later.
Do you recall the famous quote often attributed to Bill Gates, "640K ought to be enough for anybody"?
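Before running dd, it can help to check whether lockdown is active. The sketch below assumes the kernel exposes the lockdown state at /sys/kernel/security/lockdown and marks the active mode with square brackets (for example "none [integrity] confidentiality"); the function names are hypothetical, and the file may be absent on kernels built without the lockdown LSM.

```python
from pathlib import Path

LOCKDOWN_FILE = Path("/sys/kernel/security/lockdown")


def parse_lockdown(text: str) -> str:
    """Return the active mode, which the kernel marks with square brackets."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return "unknown"


def lockdown_mode(path: Path = LOCKDOWN_FILE) -> str:
    try:
        return parse_lockdown(path.read_text())
    except OSError:
        # File missing (no lockdown LSM) or not readable by this user.
        return "unavailable"


mode = lockdown_mode()
print(f"lockdown mode: {mode}")
if mode in ("integrity", "confidentiality"):
    print("reads from /dev/mem are expected to be refused")
```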

This statement stemmed from the memory limitations of the old i8086 machines, whose address bus was only 20 bits wide, resulting in just one megabyte of addressable memory.

Because the upper part of that megabyte was reserved for ROM and memory-mapped hardware, the system could only use the lower 640 KB as RAM.
Because modern x86 systems start in real mode, which behaves like an i8086 with the same limitations, the dumped megabyte must contain both the reset vector and a portion of the BIOS, including at least the initial instructions.
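The arithmetic behind these numbers is short enough to check directly. The following sketch uses the standard real-mode rule (segment times 16 plus offset, wrapped to 20 bits); the function name is just an illustration.

```python
ADDRESS_LINES = 20  # width of the i8086 address bus


def real_mode_address(segment: int, offset: int) -> int:
    """Physical address of segment:offset on an i8086 (wraps at 1 MB)."""
    return ((segment << 4) + offset) & ((1 << ADDRESS_LINES) - 1)


addressable = 1 << ADDRESS_LINES       # 1048576 bytes, i.e. 1 MB
conventional = 640 * 1024              # the "640K" boundary, 0xA0000
reserved = addressable - conventional  # upper area kept for ROM and hardware
reset_vector = real_mode_address(0xF000, 0xFFF0)

print(hex(addressable), hex(conventional), reserved // 1024, hex(reset_vector))
# prints: 0x100000 0xa0000 384 0xffff0
```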
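A quick way to sanity-check the dump is to decode whatever sits at the reset vector, which is normally a jump into the BIOS proper. The sketch below is a minimal illustration, not a disassembler: it only recognizes the three common 16-bit jump encodings, the function name is hypothetical, and it assumes the image is the 1 MB file produced by the dd command above.

```python
import struct
from pathlib import Path

RESET_VECTOR = 0xFFFF0  # f000:fff0 as an offset into a 1 MB image


def describe_reset_vector(image: bytes) -> str:
    """Decode the jump found at the reset vector of a 1 MB memory image."""
    if len(image) < 0x100000:
        raise ValueError("expected a full 1 MB image")
    code = bytes(image[RESET_VECTOR:RESET_VECTOR + 16])
    opcode = code[0]
    if opcode == 0xEA:  # jmp far ptr16:16 (offset first, then segment)
        offset, segment = struct.unpack_from("<HH", code, 1)
        return f"jmp far {segment:04x}:{offset:04x}"
    if opcode == 0xE9:  # jmp near rel16, relative to the next instruction
        (rel,) = struct.unpack_from("<h", code, 1)
        return f"jmp near f000:{(0xFFF0 + 3 + rel) & 0xFFFF:04x}"
    if opcode == 0xEB:  # jmp short rel8, relative to the next instruction
        (rel,) = struct.unpack_from("<b", code, 1)
        return f"jmp short f000:{(0xFFF0 + 2 + rel) & 0xFFFF:04x}"
    return "no jump opcode at the reset vector: " + code.hex(" ")


dump = Path("bios.bin")  # file name taken from the dd command above
if dump.exists() and dump.stat().st_size >= 0x100000:
    print(describe_reset_vector(dump.read_bytes()))
```

If the dump is all zeros or all 0xFF at that offset, the read was most likely filtered by the kernel rather than taken from the BIOS region.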

Upon further investigation of the BIOS I dumped from my computer, I discovered that modern BIOSes still use the traditional method of sending POST codes through port 0x80. Additional tests on the BIOS update downloaded from the website of the malfunctioning motherboard's manufacturer showed that it emits POST codes the same way. Investing a few dollars in a modern POST card would be a sensible decision, but the choice of technology depends on the system being tested. Based on the code observed in the BIOS, a PCI POST card looks like a poor investment: the codes are plain I/O writes to port 0x80, and unless the PCI bridge is built to forward them, they never reach any PCI device. Since nothing can be assumed about the PCI bridge on a given board, alternative options must be explored.
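One naive way to look for such code in a BIOS image is to search for the raw opcode bytes. The sketch below is an assumption about how this could be done, not a record of the analysis described above: it looks for "mov al, imm8" (B0 ii) immediately followed by "out 0x80, al" (E6 80). It will miss codes loaded into AL some other way and can report false positives where the same bytes occur in data.

```python
import re

# mov al, imm8 (B0 ii) immediately followed by out 0x80, al (E6 80)
POST_WRITE = re.compile(rb"\xB0(.)\xE6\x80", re.DOTALL)


def find_post_codes(image: bytes) -> list[tuple[int, int]]:
    """Return (offset, code) pairs for each 'mov al, imm8; out 0x80, al'."""
    return [(m.start(), m.group(1)[0]) for m in POST_WRITE.finditer(image)]


# Hand-made sample: two POST writes separated by a NOP.
sample = bytes.fromhex("b056e680 90 b0a1e680")
for offset, code in find_post_codes(sample):
    print(f"offset {offset:#x}: POST code {code:02x}")
```

Running the same function over bios.bin or over an unpacked BIOS update gives a rough list of candidate POST codes to compare against the vendor's table.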

LPC bus
Given that ISA is no longer available and the CPU bus is no longer accessible, the question arises as to how the "out 80h, al" data can be retrieved.
One possible solution is to tap the same path the Super I/O chip relies on. The Super I/O still provides legacy functions, such as the floppy disk controller (FDC) and the serial and parallel ports, that are reached through the same kind of I/O port accesses a POST card needs to see.
Looking at how the Super I/O chip communicates with the PCH, you can see that a dedicated bus exists: the LPC bus.
The Low Pin Count (LPC) bus was introduced by Intel in the late 1990s as a replacement for the ISA bus in desktop and mobile computers.
The LPC bus was designed to be a low-cost, low-speed bus that could be used for simple I/O operations, system management functions, and firmware updates.
The LPC bus was integrated directly into the Southbridge chip, which was responsible for managing the I/O operations of the motherboard.
On modern motherboards, the LPC bus is typically located near the edge of the motherboard, close to the BIOS chip and the Super I/O chip.

The LPC bus can be used for a variety of system management functions, such as power management, temperature monitoring, and fan control. The Southbridge chip communicates with the BIOS and the Super I/O chip over the LPC bus, allowing it to monitor system activity and adjust settings as needed. Additionally, some modern motherboards include a TPM (Trusted Platform Module) chip, which is also connected to the LPC bus. The TPM chip provides hardware-based security functions, such as protected key storage for disk encryption and boot measurements, and is used to protect sensitive data on the system.

It is worth noting that in newer motherboard architectures, the Northbridge and Southbridge chips have been replaced by the Platform Controller Hub (PCH). The PCH is a single chip that integrates many of the functions that were previously handled by separate chips, including the Southbridge, PCIe controllers, and system management functions. Like the Southbridge, the PCH includes an LPC bus interface, allowing it to communicate with the BIOS and other components on the motherboard. Despite this change in architecture, the LPC bus remains a critical component of modern motherboards, providing important system management and firmware updating functions.
The only place I know of where the LPC bus is exposed is the TPM connector, so let's try to exploit it.
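To give an idea of what an LPC POST card has to decode, here is a simplified sketch of an LPC I/O write as it appears on the four LAD lines, one nibble per clock. The field layout (START, cycle type and direction, four address nibbles with the most significant first, two data nibbles with the least significant first) follows the Intel LPC Interface Specification as far as I understand it, so treat it as an assumption to verify against the specification. The function name and the sample frame are made up for illustration.

```python
START = 0x0     # start of a target cycle, sampled while LFRAME# is low
IO_WRITE = 0x2  # cycle type I/O (bits 3:2 = 00), direction write (bit 1 = 1)


def decode_io_write(nibbles):
    """Return (port, value) if the nibbles form an LPC I/O write, else None.

    `nibbles` holds the LAD[3:0] values sampled on consecutive clocks,
    with the START nibble first.
    """
    if len(nibbles) < 8 or nibbles[0] != START:
        return None
    if (nibbles[1] & 0xE) != IO_WRITE:  # bit 0 is reserved, so ignore it
        return None
    port = 0
    for nibble in nibbles[2:6]:  # address: most significant nibble first
        port = (port << 4) | (nibble & 0xF)
    value = (nibbles[6] & 0xF) | ((nibbles[7] & 0xF) << 4)  # data: low nibble first
    return port, value


# START, I/O write, address 0x0080, data 0x56, then turnaround and sync clocks.
frame = [0x0, 0x2, 0x0, 0x0, 0x8, 0x0, 0x6, 0x5, 0xF, 0xF, 0x0, 0xF, 0xF]
port, value = decode_io_write(frame)
print(f"port {port:#06x} <- {value:02x}")  # prints: port 0x0080 <- 56
```

A POST card on the TPM header only needs to watch for port 0x0080 in this stream and latch the data byte, exactly as the ISA version did.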



Epilogue
In summary, the same POST method used in the 1980s is still used in modern BIOSes today. The most dependable place to read POST codes is the TPM header, which is present on most modern motherboards and provides access to the LPC bus.
This means that any LPC-enabled POST card can provide the POST service and deliver the POST codes for analysis.
As for the mainboard in question, it appears to be stuck in a loop within the BIOS.

According to the GIGABYTE POST code list, my motherboard is emitting the codes de, ad, a1, 35, 02, 56 repeatedly, which indicates a problem with the CPU (56: Invalid CPU type or speed). Therefore, I can safely assume that the device I used to manage the mainboard is not responsible for this issue.
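Spotting such a loop by eye on a two-digit display is tedious, so a captured stream of codes can be checked mechanically. The sketch below is only an illustration: the capture is a hypothetical log built from the sequence above, the function name is made up, and the lookup table contains just the one description quoted in this post.

```python
KNOWN_CODES = {0x56: "Invalid CPU type or speed"}  # only entry quoted in this post


def shortest_cycle(codes):
    """Return the shortest prefix that, repeated, reproduces the whole capture."""
    for period in range(1, len(codes) + 1):
        if all(codes[i] == codes[i % period] for i in range(len(codes))):
            return codes[:period]
    return codes


capture = [0xDE, 0xAD, 0xA1, 0x35, 0x02, 0x56] * 3  # hypothetical capture
cycle = shortest_cycle(capture)
last = cycle[-1]
print(" ".join(f"{c:02x}" for c in cycle), "->", KNOWN_CODES.get(last, "unknown"))
# prints: de ad a1 35 02 56 -> Invalid CPU type or speed
```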