Intel’s Optane & DC P4800X: A Deep Dive
Say Hello to Intel’s Optane
Optane Technology is Intel’s answer to four issues it has identified with existing enterprise systems. These issues are loosely grouped as ‘Endurance’, ‘Latency’, ‘IOPS Performance’, and ‘Consistency’. Some of the four overlap, as performance and consistency go hand in hand, but Optane Technology sets out to answer all four at the same time. To do this Intel has leveraged existing standards and combined them with new technology to create something entirely new.
That is the background to Optane Technology, and it sets the lens through which this technology has to be viewed and judged. So exactly what is Optane Technology and what does it consist of?
At its heart resides NVMe and its greatly improved Host Controller Interface communications protocol. This PCIe-based interface removes the ‘middleman’, the Platform Controller Hub (PCH), by the simple expedient of moving part of the storage controller duties onto the non-volatile memory device and the rest directly to the CPU.
In the simplest terms, NVMe devices talk directly to the CPU via the PCIe bus and need not take any side trips through secondary controllers. As we have shown numerous times in the past, removing both the SATA communication protocol and the PCH dramatically improves overall performance, reduces latency and generally makes a system much more responsive. Put another way, it allows a server to get more done in less time by making the CPU wait less on I/O requests to and from the non-volatile memory. This is why NVMe based storage has become the de facto standard for high performance servers.
Upon this solid foundation, Intel has built the ground floor of their Optane Technology by using a second generation NVMe controller. At this time details on exactly what this controller consists of are few and far between. What is known is what it can accomplish via Intel’s upcoming Optane DC P4800X drive. However, its true performance may indeed be bottlenecked by the four-lane PCIe interface, and future NVDIMM form-factor iterations could provide much higher performance.
To harness the full potential of the NVMe interface and their next generation controller, Intel has paired this duo with truly next generation non-volatile memory. Intel was extremely coy on specifics, but what is known is that Optane Technology doesn’t use typical NAND solid state memory and uses 3D XPoint non-volatile memory technology instead.
3D XPoint does not rely upon NAND floating gate transistors and instead uses an entirely new way of storing data in a 3D matrix. Most likely, instead of using a transistor gate, the properties of the cell material itself change and remain changed long enough to be considered non-volatile. How long they will stay changed before ‘reverting’ to their base state is unknown, but the NAND we are all familiar with is currently rated in mere months for data stability, so to be merely as good as TLC 3D NAND Intel does not have to clear that high a bar.
Much like 3D NAND, 3D XPoint builds ‘up’ and not just in two dimensions like old-school planar NAND. This allows the density of a next generation 3D XPoint IC to be downright massive in comparison to what was available back in early 2015. The recent advent of 3D NAND is, however, why Intel was unable to keep their promise of improved density and ‘1000x the endurance’ of NAND. Both have continued to evolve and use smaller fabrication nodes, so that was impossible. Basically, when comparing SLC 3D XPoint density to 3D TLC NAND density, 3D XPoint is not going to offer the improvement it could have offered had it been launched before 3D TLC NAND memory.
What 3D XPoint does do, however, is improve endurance well beyond what modern NAND can offer. Some of this is due to the memory technology used, but a lot has to do with the fact that 3D XPoint is SLC in nature. Put another way, instead of trying to cram two or three bits of data inside one cell, 3D XPoint only stores one bit. This means a cell is either ‘on’ or ‘off’. This alone allows for much greater endurance compared to common NAND implementations, and also greater responsiveness.
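To make the SLC point concrete, here is a minimal sketch of how quickly the number of states a cell must distinguish grows with the bits stored in it. The function name is our own, and the only rule it encodes is that n bits need 2^n distinguishable states; it says nothing about how Intel’s cells sense those states.

```python
def states_per_cell(bits_per_cell: int) -> int:
    """Distinct states one cell must hold reliably to store this many bits."""
    return 2 ** bits_per_cell


# SLC, MLC and TLC store one, two and three bits per cell respectively.
for name, bits in [("SLC", 1), ("MLC", 2), ("TLC", 3)]:
    print(f"{name}: {bits} bit(s) per cell -> {states_per_cell(bits)} states")
```

With only two states to tell apart, an SLC cell has far more margin for wear and drift than a TLC cell that has to separate eight.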
As for specifics on endurance, Intel is rather tight lipped as 3D XPoint memory is immature right now. However, simple extrapolation from the Intel Optane DC P4800X specifications nets us a ballpark number. Specifically, the 3D XPoint that will be available to enterprise consumers can handle 30 drive writes per day, every day, for five years and then some.
A conservative extrapolation would place 3D XPoint endurance at approximately 50,000 cycles versus the 3,000 to 5,000 cycles modern MLC NAND is rated for, or a ‘mere’ 10x endurance improvement. This is a far cry from the 3 to 5 million cycles it would need to meet those initial claims, but Intel is adamant that as the technology matures this endurance will increase rather than decrease as it has with NAND memory.
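The extrapolation can be reproduced with a few lines of arithmetic. This is a back-of-the-envelope sketch only: it assumes each full drive write costs every cell exactly one write cycle, ignoring write amplification, over-provisioning and wear levelling, none of which Intel has detailed. The constant and function names are ours.

```python
DRIVE_WRITES_PER_DAY = 30        # rating quoted for the DC P4800X
RATED_YEARS = 5
DAYS_PER_YEAR = 365

MLC_NAND_CYCLES = (3_000, 5_000)  # range quoted in the text
CLAIMED_MULTIPLIER = 1_000        # Intel's original '1000x' endurance claim


def full_drive_writes(dwpd: int, years: int) -> int:
    """Total full drive writes over the rated lifetime (assumed = cycles per cell)."""
    return dwpd * DAYS_PER_YEAR * years


xpoint_cycles = full_drive_writes(DRIVE_WRITES_PER_DAY, RATED_YEARS)
print(xpoint_cycles)  # 54750, which rounds down to the ~50,000 figure

for nand_cycles in MLC_NAND_CYCLES:
    ratio = xpoint_cycles / nand_cycles
    needed = nand_cycles * CLAIMED_MULTIPLIER
    print(f"vs {nand_cycles} cycle NAND: {ratio:.1f}x, 1000x would need {needed:,} cycles")
```

Against 5,000 cycle NAND the ratio comes out at roughly 11x, and against 3,000 cycle NAND at roughly 18x, so the ‘mere’ 10x figure is the pessimistic end of the range.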
Future predictions aside, this is a first-generation device, so Intel is most likely being incredibly conservative in their estimates. A good portion of Intel’s claimed ‘1000x’ endurance improvement likely rests on real world usage scenarios rather than the unrealistic method used to rate NAND.
This is a plausible explanation, as the controller is going to be incredibly gentle on the 3D XPoint memory compared to NAND thanks to what Intel calls ‘Write in Place Technology’. Write in Place is a non-destructive write process that takes a page from RAM: it is byte addressable and not block addressable. Thus a mere eight cells need to change state to update a byte, instead of an entire block being written, or erased, at a time.
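A rough sketch shows why this matters for wear. It compares how many one-bit cells are touched when a single byte changes on a byte addressable device versus a device that must rewrite whole blocks. The 256 KiB block size is a hypothetical round number, not a figure for any specific NAND part, and the block case is a worst-case read-modify-write; real NAND controllers remap pages to soften it.

```python
BITS_PER_BYTE = 8
HYPOTHETICAL_BLOCK_BYTES = 256 * 1024  # assumed erase-block size, illustration only


def cells_touched_byte_addressable(bytes_changed: int) -> int:
    """Write in place: only the cells holding the changed bytes are rewritten."""
    return bytes_changed * BITS_PER_BYTE


def cells_touched_block_device(bytes_changed: int, block_bytes: int) -> int:
    """Worst case for a block device: every affected block is rewritten in full.

    Assumes one bit per cell so the two devices are compared like for like.
    """
    blocks = -(-bytes_changed // block_bytes)  # ceiling division
    return blocks * block_bytes * BITS_PER_BYTE


in_place = cells_touched_byte_addressable(1)
whole_block = cells_touched_block_device(1, HYPOTHETICAL_BLOCK_BYTES)
print(in_place)                  # 8
print(whole_block)               # 2097152
print(whole_block // in_place)   # 262144
```

Under these assumptions a one-byte update wears eight cells instead of roughly two million, which is the kind of gap that could separate a real world endurance figure from a raw cycle count.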
Could this actually be the secret to how 3D XPoint can claim such low latency? Possibly, since each cell does not need a slow transistor and can be written to or read from almost directly. This is where the ‘XPoint’ in the name comes into play, as woven through the entire ‘3D block’ of memory are cross point wire connectors. Above and below each cell is a positive and a negative wire. In each cell is a selector that controls the cell state. These wires talk directly to each cell’s connector and allow for much finer grained control than NAND, which only reads at a large block cluster level. This difference improves overall data transmission efficiency, which in turn nets a higher internal communication speed and a lower latency for completing I/O requests.
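The addressing idea can be pictured with a toy model: a grid in which every cell sits where one ‘row’ wire crosses one ‘column’ wire, so picking a pair of wires picks exactly one cell. This is purely illustrative. The class and method names are hypothetical and nothing here reflects Intel’s actual circuit design, which has not been disclosed.

```python
class CrossPointArray:
    """Toy cross-point grid: one single-bit cell at each row/column crossing."""

    def __init__(self, rows: int, cols: int = 8) -> None:
        self.rows = rows
        self.cols = cols
        self.cells = [[0] * cols for _ in range(rows)]

    def write_bit(self, row: int, col: int, value: int) -> None:
        # Selecting one row wire and one column wire addresses a single cell.
        self.cells[row][col] = 1 if value else 0

    def read_bit(self, row: int, col: int) -> int:
        return self.cells[row][col]

    def write_byte(self, row: int, value: int) -> None:
        # A byte occupies eight cells along one row, most significant bit first.
        for col in range(8):
            self.write_bit(row, col, (value >> (7 - col)) & 1)

    def read_byte(self, row: int) -> int:
        value = 0
        for col in range(8):
            value = (value << 1) | self.read_bit(row, col)
        return value


array = CrossPointArray(rows=4)
array.write_byte(2, 0xA5)
print(hex(array.read_byte(2)))  # 0xa5
print(array.read_byte(1))       # 0, neighbouring rows are untouched
```

The point of the sketch is that no erase step and no neighbouring data are involved in the update, in contrast to a block-oriented device.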
Utilizing this structure, performance not only skyrockets but internal housekeeping is greatly simplified. It allows the NVMe controller to waste fewer cycles on these necessary tasks while still ensuring that there are always free cells ready to be written to. This incredible granularity is also why the Intel Optane DC P4800X can boast such massive improvements in shallow and deep queue depth performance, as each write request requires so little overhead compared to NAND.
In this, Intel has set expectations high, claiming a 35x improvement in deep queue depth performance over the monstrously powerful DC P3700 and up to a 10x improvement in shallow queue depth performance over that same model. That is rather impressive when you consider the DC P3700 was arguably the best enterprise storage device up until now.
Now to throw a little bit of cold water on the parade. This improvement in overall performance is not 1000x that of the best NAND based storage devices, nor is it a 1000x improvement in latency. Once again Intel was a tad late in bringing 3D XPoint to market, and in all likelihood the NVMe protocol was never designed with NAND in mind. Rather, it was meant for 3D XPoint from the get go. This does explain some of the more curious details in the NVMe protocol, and many an expert did wonder aloud how NAND would ever be able to fully harness such specifications without running into massive engineering issues. In either case, it is why the real world latency improvement is ‘only’ 20 to 25 times that of, say, an NVMe NAND based DC P3700, and the real world performance is ‘only’ up to 77 times that of the DC P3700. Yes, that is massive, and we doubt many will be overly upset with such generational improvements, but it is not a thousand times greater.
That, however, is not being entirely fair to Intel’s Optane Technology, as we have saved the best for last. This new technology may use NVMe as its foundation, and its ground floor may be 3D XPoint, but Intel has added an entirely new level to the Optane infrastructure. Unlike past generations of hardware that were developed largely independently of one another, Intel’s Optane Technology department worked hand in hand with other Intel development teams. The end result of this co-operation is Intel Memory Drive Technology. This new technology is the secret weapon in the Optane Technology arsenal and solves not only the latency issue but also the performance issue. The end result is a device that can take on DDR4 RAM roles. We will go over this in the next section.