ECC Memory & AMD’s Ryzen – A Deep Dive
Turning to Linux
The latest Linux kernel at the time of writing, version 4.10, fully supports AMD’s new Ryzen processors, and arguably the easiest way to get that kernel is to install Ubuntu 17.04 (Zesty Zapus). We are once again fortunate to be using an ASRock X370 Taichi, since we were not able to load Ubuntu on a GIGABYTE AX370-Gaming 5.
Since we want to determine whether ECC is functional on this new AM4 platform, the next step was to install edac-util, an incredibly useful program that reads and reports error detection and correction (EDAC) information. Specifically, it can tell you whether ECC is enabled and, if it is, it will report any corrected errors (CE) or uncorrected errors (UE).
Let’s see what we can find:
Success! One memory controller with ECC functionality detected, and no errors to report. We can now confirm that ECC is enabled in Linux.
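For readers who want to check the same counters without edac-util, here is a minimal Python sketch. It assumes the kernel's EDAC driver is loaded and exposes its usual sysfs tree under /sys/devices/system/edac/mc, with one mcN directory per memory controller containing ce_count, ue_count and mc_name files. The function name read_edac_counts is our own invention for illustration and is not part of any tool.

```python
from pathlib import Path

# Usual location of the EDAC memory-controller tree (assumed to exist
# only when an EDAC driver is loaded).
EDAC_MC_ROOT = Path("/sys/devices/system/edac/mc")


def _read(path):
    """Return the stripped contents of a sysfs file, or None if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def _to_int(value):
    """Convert a counter string to int, or None if missing or malformed."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def read_edac_counts(root=EDAC_MC_ROOT):
    """Hypothetical helper: map each mcN directory to its error counters."""
    root = Path(root)
    results = {}
    if not root.is_dir():
        return results  # no EDAC tree: driver not loaded or ECC unsupported
    for mc in sorted(root.glob("mc[0-9]*")):
        results[mc.name] = {
            "name": _read(mc / "mc_name"),
            "ce": _to_int(_read(mc / "ce_count")),  # corrected errors
            "ue": _to_int(_read(mc / "ue_count")),  # uncorrected errors
        }
    return results


if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("No EDAC memory controllers found")
    for mc, info in counts.items():
        print(f"{mc}: {info['name']} CE={info['ce']} UE={info['ue']}")
```

An empty result here only means that no EDAC memory controller was registered. On its own it does not tell you whether the cause is non-ECC memory, a missing driver or a firmware setting.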
For more information, we decided to use the dmesg (display message or driver message) command to see what the kernel and/or kernel modules had to say about ECC:
The log states it plainly: “DRAM ECC enabled”. While some of those RAM parameters are obviously not being read correctly, the mention of “x8 syndromes” is another confirmation, since syndromes are computed from the eight extra check bits stored with each 64-bit word and drive the error detection and correction process.
Just to be thorough, we installed non-ECC memory, ran the same command, and got an ECC disabled message.
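The enabled/disabled comparison above can be automated with a short Python sketch. It assumes the kernel log contains a line with the phrase "ECC enabled" or "ECC disabled", as in the messages we saw. The exact wording may differ between kernel versions and drivers, so the pattern is deliberately loose. The helper names ecc_state_from_log and read_kernel_log are ours, and reading dmesg may require root on some systems.

```python
import re
import subprocess

# Loose pattern for the enabled/disabled wording (an assumption; the
# exact message text can vary by kernel version and EDAC driver).
_ECC_STATE = re.compile(r"\bECC\s+(enabled|disabled)\b", re.IGNORECASE)


def ecc_state_from_log(log_text):
    """Return 'enabled', 'disabled', or None, using the last matching line."""
    state = None
    for line in log_text.splitlines():
        match = _ECC_STATE.search(line)
        if match:
            state = match.group(1).lower()  # later lines override earlier ones
    return state


def read_kernel_log():
    """Return the output of dmesg (may be empty without sufficient privileges)."""
    proc = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    return proc.stdout


if __name__ == "__main__":
    state = ecc_state_from_log(read_kernel_log())
    print(f"ECC state reported by kernel log: {state}")
```

A result of None means no matching line was found, which is not the same as ECC being disabled. The message may simply have scrolled out of the kernel ring buffer.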
We also tried the popular dmidecode utility (sudo dmidecode -t memory), but since it reads the firmware’s DMI tables and is not EDAC-aware, it did not detect ECC.
While this all looks incredibly positive, the next step is to determine whether error detection and correction is actually working, and to what extent. Will the operating system detect and log errors? Will the hardware correct single-bit errors and, ideally, halt the system when there’s a two-bit error? That is what we are going to find out next.