By John Edwards, Contributing writer

Startup UniFabriX uses CXL memory technology to boost rack density

Feature
Jul 17, 2023 | 6 mins
Data Center | Generative AI | Servers

Smart memory node device from UniFabriX is designed to accelerate memory performance and optimize data-center capacity for AI workloads.


Israeli startup UniFabriX is aiming to give multi-core CPUs the memory and memory bandwidth needed to run compute- and memory-intensive AI and machine-learning workloads.

UniFabriX is pitching its Smart Memory Node technology as an alternative to socket-connected DRAM, which restricts memory capacity and bandwidth in CPUs. UniFabriX’s technology is based on CXL (Compute Express Link), an industry-supported interconnect for processors, memory expansion, and accelerators. CXL technology maintains memory coherency between the CPU memory space and memory on attached devices, which allows resource sharing for higher performance, reduced software stack complexity, and lower overall system cost.

The company was launched in 2020 by Ronen Hyatt and Danny Volkind. Hyatt formerly worked as a platform architect in Intel’s data center group and subsequently joined Huawei as Smart Platforms CTO in 2018. Volkind served as a system architect at Intel and was later employed by Huawei as a chief architect. Both partners began their careers at the Technion-Israel Institute of Technology.

Recognizing the data center market’s challenges and potential opportunities, the duo sought solutions to address a significant hurdle: memory limitations, says Micha Risling, UniFabriX’s co-founder and chief business officer. “They identified that DRAM acts as a barrier to scaling compute operations.”

UniFabriX in the data center

When deployed inside a data center, UniFabriX’s Smart Memory Node is housed in a 2RU chassis containing 32TB of DDR5 DRAM. “The concept is that attached servers require less local DRAM because they can utilize the shared memory from the Smart Memory Node more efficiently,” Risling explains. “This approach eliminates the issue of local stranded DRAM that’s not being utilized effectively.” UniFabriX estimates that DRAM constitutes approximately 50% of server costs.

The typical use case, Risling says, is a data center that acquires servers at the best possible price with just enough capacity to run basic applications. When deployed at the center of a cluster, UniFabriX’s Smart Memory Node acts as a resource pool that servers can draw from when they run out of memory capacity or bandwidth.

Sharing resources within a cluster offers several advantages, including reduced energy consumption, a smaller physical footprint, and increased flexibility. As a result, even with the same total amount of memory, a cluster of 12 servers with 8TB each plus three 32TB Smart Memory Nodes (192TB of RAM in total) would be roughly 25% more cost-effective in operating expenses than a cluster of 24 servers with 8TB of RAM each, Risling explains.
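The arithmetic behind that comparison is easy to verify. Below is a minimal sketch; the server and node counts come from Risling’s example, while the roughly 25% opex saving is UniFabriX’s own estimate and is not derived here.

# Back-of-the-envelope check of the two cluster configurations Risling describes.
# Total memory is identical; the pooled design simply uses fewer servers.
POOLED_SERVERS = 12          # servers with 8TB of local DRAM each
SMART_MEMORY_NODES = 3       # Smart Memory Nodes with 32TB of DDR5 each
CONVENTIONAL_SERVERS = 24    # servers with 8TB of local DRAM each

pooled_total_tb = POOLED_SERVERS * 8 + SMART_MEMORY_NODES * 32
conventional_total_tb = CONVENTIONAL_SERVERS * 8

print(f"Pooled cluster:       {pooled_total_tb} TB across {POOLED_SERVERS + SMART_MEMORY_NODES} chassis")
print(f"Conventional cluster: {conventional_total_tb} TB across {CONVENTIONAL_SERVERS} chassis")
# Both configurations total 192TB, but the pooled layout gets there with 15 chassis
# instead of 24, which is where the claimed opex saving comes from.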

UniFabriX’s potential clients, running general-purpose business servers, rarely pay much attention to memory utilization, says John Schick, a consultant with technology research and advisory firm ISG. “But niche workloads like HPC, AI, ML, and in-memory database management systems use much more memory, and clients do pay attention in those environments.”

Schick believes that UniFabriX’s technology should be able to improve throughput without increasing CPU capacity and associated software licensing costs. Risling notes that for cloud service providers (CSPs), the technology will allow doubling the number of servers per rack, from 12 to 24.

Potential benefits of CXL memory

To date, discussions around CXL memory have mostly focused on memory expansion and memory pooling, both centered on increasing capacity and reducing total cost of ownership. UniFabriX, however, offers a different perspective, one that prioritizes performance. “As processor core counts have increased, the memory channel bandwidth per core has decreased over time,” Risling notes. “This means that adding more memory to a server reaches a point where it doesn’t improve performance, because the cores are unable to fully utilize it, resulting in stranded compute.”
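To see why per-core bandwidth shrinks as core counts climb, consider a rough illustration. The channel count, transfer rate, and core counts below are generic assumptions for a modern DDR5 socket, not figures supplied by UniFabriX.

# Rough illustration of DRAM bandwidth per core as core counts grow.
# Assumes a socket with 8 channels of DDR5-4800 (38.4 GB/s per channel);
# real platforms vary, and these numbers are not UniFabriX's.
CHANNELS = 8
GB_PER_SEC_PER_CHANNEL = 4800 * 8 / 1000   # 4800 MT/s x 8 bytes per transfer = 38.4 GB/s

socket_bw = CHANNELS * GB_PER_SEC_PER_CHANNEL   # roughly 307 GB/s per socket

for cores in (16, 32, 64, 96):
    print(f"{cores} cores -> {socket_bw / cores:.1f} GB/s of DRAM bandwidth per core")
# The socket's channel bandwidth is fixed, so every added core dilutes the per-core
# share -- the "stranded compute" effect Risling describes.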

To demonstrate the Smart Memory Node’s impact on performance, UniFabriX used the High Performance Conjugate Gradients (HPCG) benchmark, which stresses the memory subsystem and internal interconnect of a supercomputer by running an application entirely within the machine’s DRAM. According to Risling, the company’s researchers observed that when the Smart Memory Node was activated and its CXL memory was interleaved with the server’s local memory, HPCG performance “significantly improved,” with all cores fully utilized.

Despite CXL memory’s somewhat lower bandwidth of 256GB/sec, compared with DDR5 DRAM’s 300GB/sec, the Smart Memory Node effectively measured and addressed shortfalls in local DRAM bandwidth, dynamically provisioning additional bandwidth to the socket. “This scaling capability enables enhanced performance,” Risling says. The Smart Memory Node can independently provision capacity and bandwidth based on real-time monitoring of the system and workload performance, optimizing overall system performance by tuning the interleaving ratio between local DDR and external CXL memory.
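One way to see why interleaving helps is a simple bandwidth model: if a fraction of a socket’s memory traffic is steered to the CXL pool, the socket draws on both pools at once, up to whichever one saturates first. The sketch below reuses the 300GB/sec and 256GB/sec figures quoted above; it is an illustrative model, not UniFabriX’s actual provisioning logic.

# Simple model of aggregate bandwidth when interleaving local DDR5 with CXL memory.
# f is the fraction of accesses routed to CXL; achievable throughput is capped by
# whichever pool saturates first. Illustrative only -- not UniFabriX's algorithm.
LOCAL_BW = 300.0   # GB/s, the DDR5 DRAM figure quoted above
CXL_BW = 256.0     # GB/s, the CXL memory figure quoted above

def aggregate_bandwidth(f: float) -> float:
    """Maximum total throughput when fraction f of traffic goes to CXL memory."""
    if f <= 0.0:
        return LOCAL_BW
    if f >= 1.0:
        return CXL_BW
    return min(LOCAL_BW / (1.0 - f), CXL_BW / f)

for f in (0.0, 0.25, 0.46, 0.75):
    print(f"CXL share {f:.2f} -> about {aggregate_bandwidth(f):.0f} GB/s available to the socket")
# Near f = 256 / (300 + 256), roughly 0.46, both pools run flat out and the socket
# sees close to 556 GB/s -- far more than local DRAM alone can deliver.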

What makes UniFabriX different?

Risling believes that the Smart Memory Node’s key differentiating factor lies in its use of CXL. “CXL represents the latest and most significant advancement among various standards developed to enable memory composition,” he says. “It’s the first open standard universally adopted by major CPU vendors, facilitating concurrent transactions of memory semantics and cache semantics alongside the existing I/O semantics of PCIe.”

Risling believes that CXL marks a pivotal milestone in the architecture of compute and data center infrastructures, unlocking a range of new disruptive applications that were previously unattainable. Among these applications, he notes that memory pooling stands out as the most valuable, offering substantial returns on investment by unleashing the full performance potential of the underlying compute, reducing power consumption and total cost of ownership while eliminating the inefficiencies caused by memory and compute stranding.

Within the evolving CXL ecosystem, several firms are pursuing similar objectives, offering a range of solutions that span from line drivers, memory controllers, and memory expansion cards to CXL switches and memory pools. “While some solutions strictly adhere to the implementation of the CXL standard, others incorporate additional layers of innovation,” Risling says.

UniFabriX envisions a substantial market opportunity within the CXL memory space, estimating a total addressable market (TAM) of $20 billion by 2030. “The TAM specifically related to memory pooling is anticipated to fall within the range of $14 billion to $17 billion,” Risling says.

Looking ahead, UniFabriX aims to enhance its solution by incorporating new features and expanding its capabilities to accommodate diverse workloads, Risling says. “The company has a strategic vision to extend its influence and promote the adoption of CXL and its products across the entire industry.”