ECC Memory & AMD's Ryzen - A Deep Dive Comment Thread

Vittra

Well-known member
Joined
Oct 30, 2009
Messages
1,094
Location
Ontario
L1 Tech's Wendell has tested and reported findings in 2 videos I've seen. I wanted more information from another source to corroborate or refute the findings, so I'm glad to see a written article from HWC that covers this thoroughly. Thanks!

Edit - To be clear, this review is far more comprehensive and conclusive, and the information is easy to find. I see Wendell already retweeted it instantly :thumb:
 

ZZLEE

Well-known member
Joined
May 31, 2009
Messages
2,443
Location
KANATA
Makes a build for something like video encoding look promising.
 

EmptyMellon

Well-known member
Joined
Nov 6, 2010
Messages
522
Informative and useful, thank you. Looking forward to seeing what AMD has in store for us with their HEDT processors/platform.
 

chithanh

New member
Joined
Mar 31, 2017
Messages
4
HardwareCanucks said:
That is an uncorrected error (UE), otherwise known as a multi-bit error or a hard error. Multi-bit errors cannot be corrected by ECC memory. What is supposed to happen when they occur is that they should be detected, logged and the system should be immediately halted.
I disagree. An uncorrectable error does not mean that the system must be halted.

A machine check exception (MCE) will be raised and the operating system will be informed about the error. The operating system can then look up how the affected memory region is used, and take action based on this.

If it is kernel memory or dirty cache, then usually the system must be halted to prevent further corruption.
If it is unused memory, then nothing needs to happen.
If the memory belongs to a process, then that process can be terminated.
If it is non-dirty cache memory, then the cache page can be discarded.
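
To make the four cases above concrete, here is a minimal Python sketch of the decision. It is only an illustration of the logic I described, not how any real kernel implements it; the names `PageUse`, `Action` and `handle_uncorrectable_error` are made up for this example.

```python
from enum import Enum, auto


class PageUse(Enum):
    """How the memory region hit by the uncorrectable error is being used (hypothetical names)."""
    KERNEL = auto()
    DIRTY_CACHE = auto()
    UNUSED = auto()
    PROCESS = auto()
    CLEAN_CACHE = auto()


class Action(Enum):
    """What the operating system does in response (hypothetical names)."""
    HALT_SYSTEM = auto()
    DO_NOTHING = auto()
    TERMINATE_PROCESS = auto()
    DISCARD_CACHE_PAGE = auto()


def handle_uncorrectable_error(use: PageUse) -> Action:
    """Map the usage of the affected memory to a response, following the four cases above."""
    if use in (PageUse.KERNEL, PageUse.DIRTY_CACHE):
        # The only copy of the data is corrupt, so halt to prevent further corruption.
        return Action.HALT_SYSTEM
    if use is PageUse.UNUSED:
        # Nobody depends on the contents, so nothing needs to happen.
        return Action.DO_NOTHING
    if use is PageUse.PROCESS:
        # Only the owning process is affected, so it can be terminated.
        return Action.TERMINATE_PROCESS
    # Clean (non-dirty) cache: the page can be discarded and re-read from disk later.
    return Action.DISCARD_CACHE_PAGE


if __name__ == "__main__":
    for use in PageUse:
        print(f"{use.name:12} -> {handle_uncorrectable_error(use).name}")
```

The point is that a system halt is only one of several possible outcomes, and which one applies depends on what the affected memory was being used for.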
 

MAC

Associate Review Editor
Joined
Nov 8, 2006
Messages
1,141
Location
Montreal
chithanh said:
I disagree. An uncorrectable error does not mean that the system must be halted.
I can certainly check my documentation again, but multiple sources listed a kernel panic / system halt as the default (or just most basic) response to a UE. There might certainly be other alternatives, but considering how rare a UE is - and how it's mostly associated with hardware failure - a system halt was highlighted as the optimal response.
 

MAC

Associate Review Editor
Joined
Nov 8, 2006
Messages
1,141
Location
Montreal
Interesting, although even when I set the stress test to use all available memory, or caused enough instability to start corrupting Ubuntu, nothing happened aside from the UEs being detected. So based on my understanding of ECC, I drew the cautious conclusion that nothing was being done behind the scenes in response to multi-bit errors.
 

heisryzen

New member
Joined
Mar 31, 2017
Messages
1
Nice article on using ECC w/Ryzen. While I wasn't expecting much in the way of actual support for ECC yet, I was initially curious about the existence of 16GB single rank chips such as this Kingston ECC module. Would be nice to see an evaluation done on those, from a system performance standpoint.
 

MAC

Associate Review Editor
Joined
Nov 8, 2006
Messages
1,141
Location
Montreal
heisryzen said:
Nice article on using ECC w/Ryzen. While I wasn't expecting much in the way of actual support for ECC yet, I was initially curious about the existence of 16GB single rank chips such as this Kingston ECC module. Would be nice to see an evaluation done on those, from a system performance standpoint.
Hey, regrettably that module is Registered ECC, so it won't work on this platform.
 
