
ECC Memory & AMD's Ryzen - A Deep Dive Comment Thread

L1 Tech's Wendell has tested this and reported his findings in two videos I've seen. I wanted more information from another source to corroborate or refute those findings, so I'm glad to see a written article from HWC that covers this thoroughly. Thanks!

Edit - To be clear, this review is far more comprehensive and conclusive, and the information is easy to find. I see Wendell already retweeted it instantly :thumb:
 
Informative and useful, thank you. Looking forward to seeing what AMD has in store for us with their HEDT processors/platform.
 
HardwareCanucks said:
That is an uncorrected error (UE), otherwise known as a multi-bit error or a hard error. Multi-bit errors cannot be corrected by ECC memory. What is supposed to happen when they occur is that they should be detected, logged and the system should be immediately halted.
I disagree. An uncorrectable error does not mean that the system must be halted.

A machine check exception (MCE) will be raised and the operating system will be informed about the error. The operating system can then look up how the affected memory region is used, and take action based on this.

If it is kernel memory or dirty cache, then usually the system must be halted to prevent further corruption.
If it is unused memory, then nothing needs to happen.
If the memory belongs to a process, then that process can be terminated.
If it is non-dirty cache memory, then the cache page can be discarded.
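To make the four cases above concrete, here is a rough sketch of that decision logic in Python. It is only an illustration of the reasoning in this post, not how any real kernel implements it; the names `PageUse`, `Action` and `respond_to_uncorrectable_error` are made up for this example.

```python
from enum import Enum, auto


class PageUse(Enum):
    """Hypothetical classification of the memory hit by the error."""
    KERNEL = auto()
    DIRTY_CACHE = auto()
    CLEAN_CACHE = auto()
    PROCESS = auto()
    UNUSED = auto()


class Action(Enum):
    """Hypothetical responses the operating system could choose from."""
    HALT_SYSTEM = auto()
    KILL_PROCESS = auto()
    DISCARD_PAGE = auto()
    IGNORE = auto()


def respond_to_uncorrectable_error(use: PageUse) -> Action:
    """Map the use of the affected memory to a response, as listed above."""
    if use in (PageUse.KERNEL, PageUse.DIRTY_CACHE):
        # The only copy of the data is corrupt, so halt to avoid spreading it.
        return Action.HALT_SYSTEM
    if use is PageUse.PROCESS:
        # Only one process is affected, so it can be terminated.
        return Action.KILL_PROCESS
    if use is PageUse.CLEAN_CACHE:
        # A clean copy still exists on disk, so the page can be dropped.
        return Action.DISCARD_PAGE
    # Unused memory: nothing depends on the corrupted contents.
    return Action.IGNORE
```

For example, `respond_to_uncorrectable_error(PageUse.CLEAN_CACHE)` returns `Action.DISCARD_PAGE`, so a halt is only one of several possible outcomes.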
 
I disagree. An uncorrectable error does not mean that the system must be halted.

I can certainly check my documentation again, but multiple sources listed a kernel panic / system halt as the default (or at least the most basic) response to a UE. There may well be other alternatives, but considering how rare a UE is, and how it is mostly associated with hardware failure, a system halt was highlighted as the best response.
 
Interesting, although even when I set the stress test to use all available memory, or caused enough instability to start corrupting Ubuntu, nothing happened aside from the UEs being detected. So, based on my understanding of ECC, I drew the cautious conclusion that nothing was being done behind the scenes in response to multi-bit errors.
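For anyone who wants to check what Linux itself has logged, a small sketch like the one below reads the corrected (CE) and uncorrected (UE) error counters that the kernel's EDAC subsystem exposes in sysfs. It assumes an EDAC driver is loaded for the memory controller and that the counters sit under `/sys/devices/system/edac/mc/mc*/`; the function name `read_edac_counts` is just made up for this example.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller_name: (ce_count, ue_count)} for each EDAC controller.

    Returns an empty dict when the directory is missing, which usually
    means no EDAC driver is loaded for this platform.
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts
    for mc in sorted(base.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            # Skip controllers whose counters are missing or unreadable.
            continue
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    for name, (ce, ue) in read_edac_counts().items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

A non-zero uncorrected count only shows that the error was detected and logged; it says nothing about what action, if any, the kernel took afterwards.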
 
Nice article on using ECC w/Ryzen. While I wasn't expecting much in the way of actual support for ECC yet, I was initially curious about the existence of 16GB single rank chips such as this Kingston ECC module. Would be nice to see an evaluation done on those, from a system performance standpoint.
 
Nice article on using ECC w/Ryzen. While I wasn't expecting much in the way of actual support for ECC yet, I was initially curious about the existence of 16GB single rank chips such as this Kingston ECC module. Would be nice to see an evaluation done on those, from a system performance standpoint.

Hey, regrettably that module is Registered ECC, so it won't work on this platform.
 
