Intel’s Optane & DC P4800X: A Deep Dive
Memory Drive Technology: Solving the Latency Issue
When dealing with a massive number of I/O requests per second, the largest bottleneck is not the CPU or even the size of the bus; it is the latency of the memory being used. Historically, the latency of non-volatile memory (aka ‘NAND storage’) has been measured in microseconds (µs), whereas the latency of volatile memory (aka ‘RAM’) is measured in nanoseconds (ns). For example, a top-of-the-line Intel Data Center P3700 NVMe SSD has a real-world latency of about two hundred thousand nanoseconds (200µs), whereas ECC RAM takes about two hundred nanoseconds to complete a simple memory access request at low I/O levels and about one thousand nanoseconds under heavy load.
In basic terms, this means that RAM has a latency about 1,000 times lower than NAND-based solid state storage. Intel’s Optane DC P4800X has a latency of under 10,000 nanoseconds (10µs), which makes it only 10 to 50 times slower than RAM, depending on how heavily the RAM is loaded. On its own that is still a massive difference. However, latency is only half the equation. The other half is memory management and load management. This is where the true secret to Optane’s revolutionary leap forward lies.
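The ratios above are easy to check. The following is a minimal sketch that uses only the latency figures quoted in this article; the dictionary keys and the slowdown() helper are names made up for illustration, and the Optane number is the quoted upper bound rather than a measured value.

```python
# Latency figures quoted in the article, all in nanoseconds.
# The Optane figure is the "under 10 µs" upper bound, not a measurement.
LATENCY_NS = {
    "P3700 NAND SSD": 200_000,
    "Optane P4800X": 10_000,
    "ECC RAM (heavy load)": 1_000,
    "ECC RAM (light load)": 200,
}


def slowdown(device_ns: int, baseline_ns: int) -> float:
    """Return how many times slower a device is than the baseline."""
    return device_ns / baseline_ns


ram_light = LATENCY_NS["ECC RAM (light load)"]
ram_heavy = LATENCY_NS["ECC RAM (heavy load)"]

nand_vs_ram = slowdown(LATENCY_NS["P3700 NAND SSD"], ram_light)
optane_vs_ram_light = slowdown(LATENCY_NS["Optane P4800X"], ram_light)
optane_vs_ram_heavy = slowdown(LATENCY_NS["Optane P4800X"], ram_heavy)
nand_vs_optane = slowdown(
    LATENCY_NS["P3700 NAND SSD"], LATENCY_NS["Optane P4800X"]
)

print(f"NAND vs lightly loaded RAM:   {nand_vs_ram:.0f}x slower")
print(f"Optane vs lightly loaded RAM: {optane_vs_ram_light:.0f}x slower")
print(f"Optane vs heavily loaded RAM: {optane_vs_ram_heavy:.0f}x slower")
print(f"NAND vs Optane:               {nand_vs_optane:.0f}x slower")
```

The 10 to 50 times range comes from comparing the same Optane figure against loaded and unloaded RAM respectively.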
The typical Integrated Memory Controller (IMC) inside a processor has not changed all that much from generation to generation. The reason is that CPU engineers did not need to refine this controller, as nothing else was able to even come close to its speed. This means its load balancing and out-of-order execution are rather primitive and basic compared to modern ‘storage’ controllers. Once again, this is because it did not need to be refined, and the largest performance boost in recent memory came from placing it directly on the CPU. RAM is simply the fastest external memory going and only on-core memory (L1 at ~0.5ns or L2 at ~7ns) is faster, so improving this area of the CPU was low on the list of priorities. Intel’s Optane engineers, on the other hand, did have to expend the effort and make it a priority in order to cope properly with Optane.
In the most simplistic terms possible, a typical Optane implementation does more with less thanks in part to its next-generation NVMe controller and the more elegant Non-Volatile Memory Host Controller Interface communications protocol. This protocol is cutting edge and has boosted NAND performance to heights Intel never foresaw when they started to talk about 3D XPoint. However, NVMe is not the main reason Intel can claim that the Optane DC P4800X offers 80 percent or more of the performance of RAM without even being close to it in latency. Rather, it is because of Intel Memory Drive Technology (MDT).
MDT is a brand-new technology that resides between the BIOS and the OS and can make the OS believe that a drive like the DC P4800X is indeed memory and not storage. This, however, is only the tip of the iceberg. In addition to presenting the drive as ‘RAM’, this new technology can load balance and even reorganize incoming I/O requests in real time, so that the CPU is not waiting as long as it would when accessing the IMC and the RAM bus. This is what allows the Intel Optane DC P4800X’s real-world performance to be very similar to that of RAM.
During the recent Intel Optane conference, Intel used the example of a 675GB MySQL database. Maximizing the performance of such a database usually means loading the entirety of it into memory and letting the server churn through the requests without ever accessing the ‘slower’ non-volatile storage. To show this level of performance, Intel equipped a dual Xeon E5-2699 v4 server with 768GB of DDR4 ECC RAM and DC P3700 SSDs and ran the test. The result was 1,077 transactions per second. They then removed 512GB of RAM, replaced this expensive memory with four 375GB Optane DC P4800X drives, and reran the test.
Since the database could no longer reside entirely in RAM, it had to use the Optane drives. The end result? 870 transactions per second, or 80.8% of the performance. If this reflects actual real-world performance, that is indeed not too bad when you consider that high-density ECC DDR4 RAM costs over ten US dollars per gigabyte and the Intel Optane DC P4800X will be in the four US dollars per gigabyte range.
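For anyone who wants to check the arithmetic behind Intel’s example, here is a rough sketch using only the numbers quoted above. It takes the Optane result as 870 transactions per second (the value consistent with the quoted 80.8%), treats the RAM price as exactly ten dollars per gigabyte even though the article says "over ten", and uses variable names chosen purely for illustration.

```python
# Figures quoted in the article for Intel's MySQL demonstration.
RAM_ONLY_TPS = 1_077        # all-RAM configuration
OPTANE_TPS = 870            # RAM + Optane configuration

RAM_TOTAL_GB = 768
RAM_REMOVED_GB = 512
OPTANE_DRIVES = 4
OPTANE_DRIVE_GB = 375
DATABASE_GB = 675

# Assumed prices: the article says "over ten" and "in the four dollar
# range", so these are round numbers, not quoted list prices.
RAM_USD_PER_GB = 10.0
OPTANE_USD_PER_GB = 4.0

relative_performance = OPTANE_TPS / RAM_ONLY_TPS
ram_left_gb = RAM_TOTAL_GB - RAM_REMOVED_GB
optane_raw_gb = OPTANE_DRIVES * OPTANE_DRIVE_GB
fits_in_remaining_ram = DATABASE_GB <= ram_left_gb
price_ratio = OPTANE_USD_PER_GB / RAM_USD_PER_GB

print(f"Relative performance: {relative_performance:.1%}")
print(f"RAM left in server:   {ram_left_gb} GB")
print(f"Raw Optane capacity:  {optane_raw_gb} GB")
print(f"Database fits in RAM: {fits_in_remaining_ram}")
print(f"Optane price per GB:  {price_ratio:.0%} of RAM")
```

With only 256GB of RAM left, the 675GB database cannot fit, which is why the second run has to lean on the Optane drives. The raw Optane capacity shown here is the sum of the drive sizes; how much of that is actually exposed to the OS as memory is not stated in the article.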
There is of course another caveat. Intel MDT will only work with Xeon CPUs and requires a motherboard capable of doing this magic. It is also most likely very limited in the operating systems it can work with. Thus older-generation hardware need not apply, nor should home enthusiasts get their hopes up about it working with vanilla Windows 10. Linux/Unix, on the other hand, may indeed gain a few converts if Intel MDT works out in the real world as well as it did in Intel’s testbeds.