ECC Memory & AMD's Ryzen - A Deep Dive Comment Thread

diversity

Member
Joined
Apr 8, 2020
Messages
6
Humbly, I barely understand half of what you guys are talking about, however, if I'm to build a video editing rig, ECC is better even if I'm not getting reports?
IMHO, if one can't verify that it's working, then it's worthless. I have yet to find or read about a recent Ryzen setup that reports ECC errors.
 

diversity

Member
Joined
Apr 8, 2020
Messages
6
Am I interpreting it correctly that PassMark MemTest86 Pro 8.3 lists ECC error injection support only for older processors that are hardly available anymore and would not be an ideal choice today?

  • AMD Bulldozer (15h)
  • AMD Steamroller (15h)
  • AMD Jaguar (16h)
  • AMD Ryzen (17h) [Note: Injection is disabled in most AMD retail CPUs. To enable, please consult the Processor Programming Reference document]
  • AMD Steppe Eagle SoC
  • AMD Merlin Falcon SoC
  • Intel Nehalem
  • Intel Lynnfield
  • Intel Westmere
  • Intel Xeon E3 family (Sandy Bridge)
  • Intel Xeon E3 v2 family (Ivy Bridge)
  • Intel Xeon E3 v3 family (Haswell)
  • Intel Xeon E3 v4 family (Broadwell)
  • Intel Xeon E3 v5 family (Skylake)
  • Intel Xeon E3 v6 family (Kaby Lake)
  • Intel Atom C2000 SoC
  • Intel Broadwell-H SoC
  • Intel Apollo Lake SoC
To test support for Ryzen 9, I was given 8.4 RC2 build 1000. No luck with that one, and none yet with build 1001 either.
I have ordered a few different X570 boards to try again:

ASRock X570 Pro 4
Gigabyte X570 Aorus Elite
ASRock X570M Pro4
ASRock X570 Extreme4
ASROCK X570 PHANTOM
ASUS Prime X570-P

I'll try to add the ASRock X570 Creator as well.
 

Mastakilla

Member
Joined
Oct 22, 2019
Messages
13
@Mastakilla, exactly what server grade x470 asrock rack mobo did you mention in your first message in this thread?

I have been trying now with an ASrock Rack X470D4U (latest bios with ECC enabled, ECC injection enabled, Platform First error handling disabled) with a Ryzen 9 3950x with no luck.
I am using Passmark Memtest pro 8.4 rc2 Build 1001 but Passmark is still troubleshooting on the basis of my debug logs.

AMD is assuring me that a previous setup I tried
Asrock x570 (AMD AM4 socket, AMD X570 Chipset ) Creator with a Ryzen 9 3950x
does support ECC correction and reporting but I started using memtest86 pro after I had returned that setup.
I am willing to try that setup again in case the audience here will find that useful.

Anyway, could it be that the problem is not the CPU or BIOS per se, but perhaps other components?
For one, my X470D4U board has the following:
Processor System
CPU - AMD AM4 Socket Ryzen™ PRO/ Ryzen™ 2nd and 3rd generation series processors
Socket - AM4 PGA 1331
Chipset - AMD Promontory X470
I have the ASRock Rack X470D4U2-2T, which should be the same as far as ECC is concerned. ECC indeed depends on the combination of RAM, CPU (which contains the memory controller), and motherboard (whose BIOS determines how all of this plays together).

Humbly, I barely understand half of what you guys are talking about, however, if I'm to build a video editing rig, ECC is better even if I'm not getting reports?
ECC without reporting is better than no ECC in my opinion. And I do have "the feeling" that ECC corrections do happen / work, because when I overclock, it is usually rock solid stable or not booting at all. Finding something in between (that throws errors) is pretty hard (undervolting did the trick for me).
When overclocking with non-ECC memory, that is completely different...

But diversity is also right: that is just a feeling, and I cannot be sure of it...
 

diversity

Member
Joined
Apr 8, 2020
Messages
6
I am happy to report that ECC reporting is functional on at least 2 boards.
Setups tested so far:
* AMD Ryzen 9 3950X & ASUS Prime X570-P
* AMD Ryzen 9 3950X & ASRock Rack X470D4U
For me the X470D4U was the most important, so I don't think I will continue checking the other boards.

The method I used was to take the inner wires of some electrical cable and stick them into the memory slot that held an 8GB ECC UDIMM.
Using Proxmox:
root@pve:~# pveversion
pve-manager/6.1-8/806edfe1 (running kernel: 5.4.24-1-pve)
I saw corrected errors being reported. Of the 2 boards, only the X470D4U has IPMI as far as I could tell, but the IPMI log is not showing any ECC errors :( which is a major oversight by the manufacturer if you ask me. Let's hope they can fix that in an update.
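
For anyone who wants to check the same counters on Linux, here is a minimal Python sketch. It assumes an EDAC driver (on Ryzen that would typically be amd64_edac) is loaded and exposes the standard sysfs counters ce_count (corrected errors) and ue_count (uncorrected errors) per memory controller. The function name read_edac_counts is made up for this example, and this is not necessarily how the result above was observed.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller_name: (corrected, uncorrected)} from the EDAC sysfs tree.

    Assumption: the kernel's EDAC subsystem is active and exposes one
    directory per memory controller (mc0, mc1, ...) under `root`.
    """
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers without readable counters
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("No EDAC memory controllers found (driver not loaded or ECC inactive?)")
    for name, (ce, ue) in counts.items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

An empty result only means the kernel is not exposing EDAC counters; it does not by itself prove that ECC correction is off in hardware.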
 

rake

New member
Joined
Jun 8, 2020
Messages
1
I have ordered a few different x570 boards to try again:

ASRock X570 Pro 4
Gigabyte X570 Aorus Elite
ASRock X570M Pro4
ASRock X570 Extreme4
ASROCK X570 PHANTOM
ASUS Prime X570-P

Did you ever test the Gigabyte X570 board? Looking into new workstation and server stuff, and I have an allergy to ASUS kit after past issues.

Cheers
 

diversity

Member
Joined
Apr 8, 2020
Messages
6
Did you ever test the Gigabyte X570 board? Looking into new workstation and server stuff, and I have an allergy to ASUS kit after past issues.

Cheers
No, and I don't think I ever will as far as X570 is concerned. Gigabyte's official statement via 1st line support (the way us mere mortals have to gather information) was something along the lines of: "ECC works by default, no BIOS settings. Single bit errors are not reported (detected and corrected only) and multi bit errors are not detected and reported."

I will give Gigabyte one more try with a sTRX40 board and a Threadripper. If that also fails, then it's game over for Gigabyte as far as I am concerned.

I have great news though. You should go with an ASUS Prime X570-P board or, if you would like IPMI, the ASRock Rack X470D4U2-2T (X470D4U2 confirmation still in progress).

Those 2 boards have proven, demonstrable support for ECC single bit detection, correction and reporting, as well as multi bit detection and reporting. In both cases ASRock Rack still does not report to IPMI, only to the OS; they are still working on it.
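
To illustrate the difference between "single bit: detect and correct" and "multi bit: detect only", here is a toy SECDED (single error correct, double error detect) sketch in Python. It uses an extended Hamming(8,4) code on 4 data bits purely for illustration. Real ECC DIMMs are 72 bits wide (64 data bits plus 8 check bits), and the exact code a given memory controller uses may differ, so treat the functions encode and decode as hypothetical teaching helpers, not as what any board or CPU actually implements.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit SECDED codeword (list of 0/1 bits).

    Index 0 holds the overall parity bit; indices 1..7 form a Hamming(7,4)
    code with check bits at positions 1, 2 and 4.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the total number of 1 bits even
    return code


def decode(code):
    """Return (nibble, status); nibble is None when the error is uncorrectable."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of the positions of all set bits
    odd_parity = sum(code) % 2 == 1
    if syndrome == 0 and not odd_parity:
        status = "ok"
    elif odd_parity:
        # Odd number of flips: treat as a single bit error at `syndrome`
        # (syndrome 0 means the overall parity bit itself was hit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of flips with a nonzero syndrome: detected, not fixable.
        return None, "uncorrectable"
    d = [code[3], code[5], code[6], code[7]]
    return sum(bit << i for i, bit in enumerate(d)), status


if __name__ == "__main__":
    word = encode(0b1011)
    word[5] ^= 1              # simulate one flipped bit
    print(decode(word))
    word[2] ^= 1              # simulate a second flipped bit
    print(decode(word))
```

A "corrected" result is what an OS would log as a corrected error, and "uncorrectable" corresponds to an uncorrected error; a board that corrects silently would behave like decode with the status thrown away.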
 

Summerbreeze

New member
Joined
May 8, 2020
Messages
2
The method I used was to use the inner wires of some electrical cable and stick it in the memory bank with 8GB ECC UDIMM in it.
using
Hi.
What contacts did you connect?

I found something else yesterday.
The "AMD Generic Encapsulated Software Architecture (AGESA™) Interface Specification" from 2017 describes something interesting.
Page: 200
BLDCFG_ENABLE_ECC_FEATURE
This turns on the correction action and enables the ability of the MCA subsystem to report errors. It does not activate the MCA error report interrupts.
If the build option is set to include the code for the ECC feature, then this setting activates the feature. If the feature code is removed from the build, then this element has no affect.
Otherwise, you can also read through the error handling starting from chapter 3.1.3
3.1.3 Error Reporting Two methods of error reporting are provided: return codes and error logs. The return codes are described in “4.3.1 Returned Status Codes” on page 41 and provide enough detail for most host environments. For more detailed error reporting, the error log system is provided (see section “4.3.2 Error Logging” on page 42 for more details).
Maybe this was simply overlooked by the programmers, and error reporting just needs to be activated additionally?

Otherwise it's probably worth looking for "MCA", "ECC" in the "Revision Guide for AMD Family 17h Models 00h-0Fh Processors".
Since I'm not a programmer, I cannot judge this properly. In any case, it looks like it's worth having a look in there.
Unfortunately, I cannot say whether these documents represent the most recent edition.

A web search turns up other, possibly interesting documents from AMD on this matter.

P.S. My English may sound a little bumpy sometimes. That's because I come from Germany. ;)
 