SNIA Persistent Memory And Computational Storage Summit, Part 2
SNIA held its Persistent Memory and Computational Storage Summit virtually this year, as it did last year. The Summit explored some of the latest developments in these topics. Let's look at some of the insights from the second day of that virtual conference.
Gary Grider from Los Alamos National Labs spoke about computation near storage for high performance computing (HPC) applications. In 2023 they will run HPC simulation systems with 10PB of DRAM, 100PB of flash and 500PB of HDDs just to do one job with 10PB files and 200PB campaigns. Computing near storage makes routine but intensive data management and data analysis tasks more efficient and cost effective.
They are also changing the way they do data analysis, focusing on records rather than files. Computational storage enables somewhat higher compression rates, offloads the expensive encoding/decoding that protects against correlated failures, achieves higher per-server and per-device bandwidths, and lowers server costs and quantities. The image below shows a computational storage array with several partners.
Among the improvements with computational storage are near-device indexing and analytics and reduced reliance on a massive compute tier as a large merge-sort space. Speeding up indexing resulted in a 1,000X speed-up in analysis steps.
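To see why near-device analytics cuts so much work out of the pipeline, consider a simple sketch (illustrative only, not LANL's actual software): a predicate pushed down to a computational storage device means only matching records cross the interconnect, instead of every record. The record size and data here are assumptions for illustration.

```python
# Model a "device" holding fixed-size records and compare bytes moved when
# the host pulls everything versus when the device filters first.
RECORD_SIZE = 64  # assumed bytes per record, for illustration

records = [{"id": i, "temp": i % 100} for i in range(10_000)]

def host_side_scan(device_records):
    # Host pulls every record over the interconnect, then filters locally.
    moved = len(device_records) * RECORD_SIZE
    hits = [r for r in device_records if r["temp"] > 95]
    return hits, moved

def near_device_scan(device_records):
    # Device applies the predicate itself; only matches cross the wire.
    hits = [r for r in device_records if r["temp"] > 95]
    moved = len(hits) * RECORD_SIZE
    return hits, moved

h_hits, h_moved = host_side_scan(records)
d_hits, d_moved = near_device_scan(records)
assert h_hits == d_hits  # same answer, very different data movement
print(f"host scan moved {h_moved} bytes; near-device scan moved {d_moved} bytes")
```

With a 4% hit rate, the near-device scan moves 25X less data; the larger the campaign, the more this dominates.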
Charles Fan, CEO and founder of MemVerge spoke about Big Memory in Multi-Cloud. He spoke about multi-cloud memory-as-a-service and composable memory plus an intelligent memory services roadmap—see below.
MemVerge’s Memory Machine enables higher virtual machine density and lower space, power and cooling requirements, and thus a lower TCO. In-memory snapshots on CXL provide protection and faster recovery. They report a 60%+ shorter execution time for scientific computing applications. In the cloud, loading a running application into DRAM from these snapshots improves performance. The snapshots also allow working with PMem and bursting to the cloud or moving between cloud instances.
Scott Shadley from NGD Systems and Jason Molgaar from AMD spoke about the latest efforts in the SNIA Computational Storage Technical Working Group, which they co-chair. Currently 51 companies are involved with 158 individual members. They have various webinars, blogs and events. They have also a SDXI sub-group collaboration and are ensuring alignment with the Security TWG. In addition, they collaborate with NVM Express and xPU.
Alan Benjamin, CEO of GigaIO, spoke about CXL Advancing the Next Generation of Data Centers. He said that CXL 2.0 introduces memory QoS, which can be used to prevent head-of-line blocking in heterogeneous memory systems (e.g. with both DRAM and PMem). It also adds memory interleaving (2-, 4- and 8-way), a standardized register interface and global persistent flush. There is a request to add 3-, 6-, 12- and 16-way interleaving options. He also had a slide with information on the not-yet-released CXL 3.0 specification, see below.
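The memory interleaving Benjamin mentioned can be pictured with a small sketch. This is an illustrative modulo mapping, not the CXL decoder algorithm, and the granule size is an assumption: consecutive granules of the host physical address space rotate across N memory targets so that bandwidth aggregates across devices.

```python
# Hedged illustration of N-way memory interleaving: which target device
# serves a given host physical address, and where on that device it lands.
GRANULE = 256  # assumed bytes per interleave granule

def interleave_target(addr: int, ways: int) -> tuple[int, int]:
    """Map a host physical address to (target index, local byte offset)."""
    granule_no = addr // GRANULE
    target = granule_no % ways            # which device serves this granule
    local_granule = granule_no // ways    # granule index within that device
    return target, local_granule * GRANULE + addr % GRANULE

# With 4-way interleaving, four consecutive granules land on four devices,
# and the fifth wraps back to device 0 at its next granule.
for addr in (0, 256, 512, 768, 1024):
    print(addr, interleave_target(addr, ways=4))
```

Note that 2-, 4- and 8-way interleaving make this mapping a cheap bit-field extraction in hardware, which is part of why non-power-of-two ways (3, 6, 12) are a separate request.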
Rob David, VP from NVIDIA spoke about GPU+DPU for Computational Storage. He pointed out that a DPU can accelerate data center workloads and it is more compact than conventional storage arrays or JBOF. NVIDIA is offering a combined GPU and DPU PCIe card, shown below, which is especially useful for computational storage AI solutions.
GPU plus DPU combined computational storage solutions allow analyzing in place, extracting, loading and transforming upon demand.
Shyam Iyer, from Dell and Chair of the SNIA SDXI Technical Work Group, spoke about SDXI and standardizing data movement. This is particularly important with the proliferation of accelerator and memory technologies. SDXI stands for Smart Data Accelerator Interface, and the SNIA SDXI working group is working towards a 1.0 specification. A memory-to-memory data movement interface is seen as an enabler for persistent memory technologies and computational storage use cases. The v0.9-rev1 version of the specification is out for public review.
Software memcpy is the current data movement standard, but it takes away from application performance, incurs software overhead, and is not standardized for user-level software. The SDXI approach avoids creating buffer copies in order to move data; data movement is direct between memories. Other important principles for SDXI are data-in-use memory expansion that can support different tiers of memory, and support for a diversity of accelerator programming methods. The figure below shows different types of accelerators and memories where data movement will be required.
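The difference between memcpy and the SDXI model can be sketched conceptually. The structures below are hypothetical, not the specification's actual descriptor format: software fills in a descriptor naming source and destination address spaces, and a data mover copies directly between them with no intermediate bounce buffer.

```python
# Conceptual sketch of descriptor-driven, memory-to-memory data movement
# in the SDXI style (hypothetical field names, for illustration only).
from dataclasses import dataclass

@dataclass
class Descriptor:
    src_space: int   # handle identifying the source address space
    src_off: int
    dst_space: int   # handle identifying the destination address space
    dst_off: int
    length: int

# Two "address spaces", e.g. belonging to a CPU process and an accelerator,
# or to two different VMs. Here they are just bytearrays.
spaces = {0: bytearray(b"persistent-memory-payload" + bytes(7)),
          1: bytearray(32)}

def data_mover(desc: Descriptor) -> None:
    # One direct copy between the named spaces: no staging buffer,
    # and no privileged software on the data path once set up.
    src = spaces[desc.src_space]
    dst = spaces[desc.dst_space]
    dst[desc.dst_off:desc.dst_off + desc.length] = \
        src[desc.src_off:desc.src_off + desc.length]

data_mover(Descriptor(src_space=0, src_off=0, dst_space=1, dst_off=0, length=25))
print(bytes(spaces[1][:25]))  # b'persistent-memory-payload'
```

A software memcpy between two processes would instead stage the data through a shared or kernel buffer, which is exactly the extra copy the descriptor model is meant to eliminate.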
In SDXI, data movement is between different address spaces and isn’t mediated by privileged software (once a connection has been established). It allows abstraction or virtualization by privileged software. It has the capacity to quiesce, suspend and resume the architectural state of a per-address-space data mover. It also enables forward and backward compatibility across future specification revisions and can incorporate additional offloads in the future. It uses a concurrent DMA model.
Andy Walls, IBM Fellow, spoke about Computational Storage for Storage Applications. IBM uses processing built into its flash core modules (FCMs) to offload storage application processing, such as compression, to the SSD. This work can also be distributed among several FCMs. He noted that the aggregate NAND flash bandwidth is close to 4.6X the IBM storage system bandwidth.
This extra bandwidth can be used for storage functions, such as compression, but it can also be used for other purposes, such as monitoring real-time changes in entropy, heat of access and how data is changing. It can be used for filtering, searching and scanning as well. This can support various AI operations and can also be used for malware detection. He said that work is being done by standards bodies to define the APIs for host file systems or databases to scan volumes.
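The entropy monitoring mentioned above can be illustrated with a minimal sketch (plain Python here, not IBM's FCM firmware; the blocks and threshold are assumptions): a device computes Shannon entropy per block, and a sudden jump toward 8 bits/byte on previously low-entropy data is one signal of encryption in progress, e.g. ransomware.

```python
# Shannon entropy per block, the kind of statistic a computational storage
# device could track inline as data is written.
import math
from collections import Counter

def shannon_entropy(block: bytes) -> float:
    """Entropy in bits per byte (0 = constant data, 8 = fully random)."""
    counts = Counter(block)
    n = len(block)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

text_block = b"the quick brown fox jumps over the lazy dog " * 100
uniform_block = bytes(range(256)) * 16  # stand-in for encrypted/random data

print(f"text:    {shannon_entropy(text_block):.2f} bits/byte")
print(f"uniform: {shannon_entropy(uniform_block):.2f} bits/byte")
```

Natural-language and database data sit well below 8 bits/byte, so a volume whose blocks suddenly all read near 8 looks very different from its historical profile, without the host ever reading the data back.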
At the end of the second day, Dave Eggleston had another speaker panel discussion and David McIntyre from Samsung gave a recap of the day.
On day 2 of the SNIA PM and CS Summit, computation near storage and big memory management in multi-clouds were discussed, as well as the latest from the SNIA Computational Storage Technical Working Group. CXL and PMem were explored, as were combinations of GPUs and DPUs. Data movement standardization work is ongoing in the SDXI TWG, and new computational approaches were explored.