Israeli startup UnifabriX is aiming to give multi-core CPUs the memory and memory bandwidth needed to run compute- and memory-intensive AI and machine-learning workloads.
UnifabriX is pitching its Smart Memory Node technology as an alternative to socket-connected DRAM, which restricts the memory capacity and bandwidth available to CPUs.
UnifabriX's technology is based on CXL (Compute Express Link), an industry-supported interconnect for processors, memory expansion, and accelerators.
CXL technology maintains memory coherency between the CPU memory space and memory on attached devices, which allows resource sharing for higher performance, reduced software stack complexity, and lower overall system cost.
The company was launched in 2020 by Ronen Hyatt and Danny Volkind. Hyatt formerly worked as a platform architect in Intel’s data centre group. He subsequently joined Huawei as Smart Platforms CTO in 2018. Volkind served as a system architect at Intel and was later employed by Huawei as a chief architect. Both partners began their careers at Israel’s Technion Institute of Technology.
Recognising the data centre market's challenges and potential opportunities, the duo sought solutions to address a significant hurdle: memory limitations, says Micha Risling, UnifabriX's co-founder and chief business officer. "They identified that DRAM acts as a barrier to scaling compute operations."
UnifabriX in the data centre
When deployed inside a data centre, UnifabriX's Smart Memory Node is housed inside a 2RU chassis containing 32TB of DDR5 DRAM.
"The concept is that attached servers require less local DRAM because they can utilise the shared memory from the Smart Memory Node more efficiently," Risling explains. "This approach eliminates the issue of local stranded DRAM that's not being utilised effectively."
UnifabriX estimates that DRAM constitutes approximately 50% of server costs.
The typical use case, Risling says, is a data centre that acquires servers at the best possible price with just enough capacity to run basic applications. When deployed at the centre of a cluster, UnifabriX's Smart Memory Node acts as a resource pool that servers can draw from when they run out of memory capacity or bandwidth.
Sharing resources within a cluster offers several advantages, including reduced energy consumption, a smaller physical footprint, and increased flexibility.
Risling gives an example in which the total amount of memory is the same: a cluster of 12 servers with 8TB each plus three Smart Memory Nodes of 32TB each, for a total of 192TB of RAM, would be 25% more cost-effective in terms of operating expenses than a cluster of 24 servers with 8TB of RAM each.
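The capacity arithmetic behind that comparison can be checked with a short sketch. The function name and parameters are illustrative only and do not come from UnifabriX. The 25% operating-expense figure is Risling's claim and cannot be derived from the capacities alone, so it is not modelled here.

```python
def cluster_memory_tb(servers, tb_per_server, nodes=0, tb_per_node=0):
    """Total DRAM in a cluster: local server memory plus any pooled memory nodes."""
    return servers * tb_per_server + nodes * tb_per_node


# Pooled layout from the article: 12 servers with 8TB each, plus three 32TB nodes.
pooled_tb = cluster_memory_tb(servers=12, tb_per_server=8, nodes=3, tb_per_node=32)

# Conventional layout from the article: 24 servers with 8TB each, no pooled memory.
conventional_tb = cluster_memory_tb(servers=24, tb_per_server=8)

print(pooled_tb, conventional_tb)  # 192 192
```

Both layouts hold 192TB, so the comparison is between two ways of placing the same amount of memory rather than between clusters of different capacity.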
UnifabriX's potential clients, running general-purpose business servers, rarely pay much attention to memory utilisation, says John Schick, a consultant with technology research and advisory firm ISG.
"But niche workloads like HPC, AI, ML, and in-memory database management systems use much more memory, and clients do pay attention in those environments."
Schick believes that UnifabriX's technology should be able to improve throughput without increasing CPU capacity and associated software licensing costs. Risling notes that for cloud service providers (CSPs), the technology will allow the number of servers on a rack to double, from 12 to 24.
Potential benefits of CXL memory
To date, discussions around CXL memory have mostly focused on memory expansion and memory pooling, both centred on increasing capacity and reducing total cost of ownership. UnifabriX, however, offers a different perspective, one that prioritises performance.
"As processor core counts have increased, the memory channel bandwidth per core has decreased over time," Risling notes. "This means that adding more memory to a server reaches a point where it doesn't improve performance because the cores are unable to fully utilise it, resulting in stranded compute."
To demonstrate the Smart Memory Node's impact on performance, UnifabriX used the High Performance Conjugate Gradients (HPCG) benchmark, which stresses the memory subsystem and internal interconnect limitations of a supercomputer by running an application entirely within the computer's DRAM.
According to Risling, the company's researchers observed that when the Smart Memory Node was activated, and its CXL memory was interleaved with the server's local memory, HPCG performance "significantly improved" with all cores fully utilised.
Although the CXL memory's bandwidth of 256GB/sec is slightly lower than the 300GB/sec of the DDR5 DRAM, the Smart Memory Node measured and addressed issues with local DRAM memory bandwidth, dynamically provisioning additional bandwidth to the socket.
"This scaling capability enables enhanced performance," Risling says.
The Smart Memory Node can independently provision capacity and bandwidth based on real-time monitoring of the system and workload performance, optimising and maximising overall system performance through an interleaving ratio between local DDR and external CXL memory.
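The article does not describe how UnifabriX chooses its interleaving ratio. As a rough illustration only, the sketch below assumes a static, bandwidth-proportional policy: pages are spread across local DDR and CXL memory in proportion to the 300GB/sec and 256GB/sec figures quoted above. The function names are hypothetical, and the real product is said to adjust the ratio dynamically from live monitoring, which this sketch does not model.

```python
from fractions import Fraction


def interleave_ratio(local_gb_per_sec, cxl_gb_per_sec):
    """Smallest whole-number local:CXL ratio proportional to the two bandwidths."""
    ratio = Fraction(local_gb_per_sec, cxl_gb_per_sec)
    return ratio.numerator, ratio.denominator


def place_pages(n_pages, local_weight, cxl_weight):
    """Weighted round-robin: in each period, the first local_weight pages go to
    local DDR and the remaining cxl_weight pages go to CXL memory."""
    period = local_weight + cxl_weight
    return ["local" if i % period < local_weight else "cxl" for i in range(n_pages)]


# Bandwidth figures quoted in the article (GB/sec).
local_weight, cxl_weight = interleave_ratio(300, 256)
placement = place_pages(local_weight + cxl_weight, local_weight, cxl_weight)

print(local_weight, cxl_weight)                         # 75 64
print(placement.count("local"), placement.count("cxl"))  # 75 64
```

Under this idealised assumption, a workload that streams evenly over the interleaved pages could draw on both paths at once, which is the intuition behind adding CXL bandwidth to a socket even though the CXL path on its own is the slower of the two.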
What makes UnifabriX different?
Risling believes that the Smart Memory Node's key differentiating factor lies in its utilisation of CXL.
"CXL represents the latest and most significant advancement among various standards developed to enable memory composition," he says.
"It's the first open standard universally adopted by major CPU vendors, facilitating concurrent transactions of memory semantics and cache semantics alongside the existing I/O semantics of PCIe."
Risling believes that CXL marks a pivotal milestone in the architecture of compute and data centre infrastructures, unlocking a range of new disruptive applications that were previously unattainable.
Among these applications, he notes that memory pooling stands out as the most valuable, offering substantial returns on investment by unleashing the full performance potential of the underlying compute, reducing power consumption and total cost of ownership while eliminating the inefficiencies caused by memory and compute stranding.
Within the evolving CXL ecosystem, several firms are pursuing similar objectives, offering a range of solutions that span from line drivers, memory controllers, and memory expansion cards to CXL switches and memory pools.
"While some solutions strictly adhere to the implementation of the CXL standard, others incorporate additional layers of innovation," Risling says.
UnifabriX envisions a substantial market opportunity within the CXL memory space, estimating a total addressable market (TAM) of $20 billion by 2030.
"The TAM specifically related to memory pooling is anticipated to fall within the range of $14 billion to $17 billion," Risling says.
Looking ahead, UnifabriX aims to enhance its solution by incorporating new features and expanding its capabilities to accommodate diverse workloads, Risling says.
"The company has a strategic vision to extend its influence and promote the adoption of CXL and its products across the entire industry."