Reimagining HPC with Intel Optane DC SSD

HPC has traditionally been about “more”, because more resources let you finish calculations and data processing faster. More CPU cores mean you can process faster. More RAM means you can work on larger data sets at low latency. More storage means you can store (you guessed it…) “more”.

CPU technology (and, more recently, GPU technology) is advancing rapidly. Similarly, in storage, solid state drives are playing an increasingly important role compared to hard disks. DRAM is improving too, mainly in capacity and speed, but at a much slower rate than CPUs and storage. As a result, DRAM is becoming something of a bottleneck. It is extremely expensive, so just adding a whole lot “more” isn’t always an option, and the number of memory slots on a motherboard sets a physical limit.

Enter, stage right, Intel’s Optane P4800X data center SSD family. By combining non-volatile capacity with latency far closer to DRAM than conventional SSDs can offer, Intel has produced an SSD capable of bridging the gap between volatile memory such as DRAM and non-volatile storage such as SSDs and HDDs. For HPC, this changes the conversation.

 

Silos, Fat Memory Nodes, I/O Nodes, Bottlenecks: The Old Way

The FAT memory Node

Storage I/O nodes

A standard HPC cluster is built from silos of workload-specific nodes, which can be broadly split into compute nodes (CPU intensive), fat memory nodes (DRAM intensive) and storage nodes (storage intensive). The cluster as a whole benefits from the combined performance of these node classes. However, this design doesn’t allow for much flexibility, especially if the cluster must run varied workloads. In a variable-workload scenario, users can easily run into bottlenecks if a particular job is more RAM intensive or storage intensive than the cluster was originally designed for.

 

Flex Nodes based on Intel Optane DC SSD: The New Way

One of the unique features of Intel Optane P4800X SSDs is that, in combination with Intel Memory Drive Technology (IMDT), the system can recognize the capacity of the SSD as if it were DRAM. It doesn’t have the same latency as DRAM, but it is many times more responsive than normal NAND-based NVMe SSDs. This is due to 3D XPoint, the memory technology co-developed by Intel and Micron, which sets it apart from NAND.

For HPC, this is game-changing. Rather than having workload-specific memory and storage nodes, you can create a “flex node” whose function is defined by software. Need additional DRAM-like capacity? No problem: define the Optane SSDs as memory drives and the cluster suddenly sees hundreds of GB of “RAM”. Need more storage? Then simply run the P4800X SSDs as high-throughput, ultra-low-latency, high-endurance storage devices that can be written to many times without fear of significant wear.
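To make the flex-node idea concrete, here is a minimal planning sketch in Python. It is only an illustration under stated assumptions: the `FlexNode` class, the `plan_roles` function and the capacity figures are hypothetical, it does not call IMDT or any Intel tool, and it counts each drive's raw capacity as fully usable in either role, which a real deployment may not achieve.

```python
from dataclasses import dataclass


@dataclass
class FlexNode:
    """Hypothetical flex node: fixed DRAM plus equal-sized Optane SSDs."""
    dram_gb: int
    optane_drives: int
    optane_drive_gb: int


def plan_roles(node: FlexNode, mem_needed_gb: int, storage_needed_gb: int) -> dict:
    """Decide how many Optane drives extend memory and how many act as storage.

    Memory demand is served from DRAM first; any shortfall is covered by
    whole drives in the "memory" role. The remaining drives become storage.
    This is planning logic only (an assumption of this sketch), not an API.
    """
    shortfall_gb = max(0, mem_needed_gb - node.dram_gb)
    # Ceiling division: whole drives needed to cover the memory shortfall.
    as_memory = -(-shortfall_gb // node.optane_drive_gb)
    if as_memory > node.optane_drives:
        raise ValueError("memory demand exceeds DRAM plus Optane capacity")
    as_storage = node.optane_drives - as_memory
    if as_storage * node.optane_drive_gb < storage_needed_gb:
        raise ValueError("not enough Optane capacity left for storage demand")
    return {"memory_drives": as_memory, "storage_drives": as_storage}


if __name__ == "__main__":
    # Example values only: 192 GB DRAM and four 375 GB drives.
    node = FlexNode(dram_gb=192, optane_drives=4, optane_drive_gb=375)
    # A memory-hungry job pulls drives into the memory role...
    print(plan_roles(node, mem_needed_gb=800, storage_needed_gb=700))
    # ...while a storage-heavy job leaves every drive as storage.
    print(plan_roles(node, mem_needed_gb=128, storage_needed_gb=1400))
```

The same node serves both jobs; only the split between the two roles changes, which is the point of defining the node's function in software rather than in hardware.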

For those looking for a flexible HPC solution, this is something of a holy grail, with one important caveat: despite its impressive performance, Optane is still significantly slower than actual DRAM. But for a mixed-workload cluster, it represents a major step forward.

To learn more about this technology and mixed-workload HPC cluster designs based on Optane, please contact info@clustervision.com