High-flying Nvidia widens reach into enterprise data centres

Acquisitions bolster Nvidia's position in the data centre and set the stage for more widespread use of GPUs for AI and machine learning workloads

Neal Weinberg (Network World)
12 November, 2020 05:30

Jensen Huang (Nvidia)

Nvidia's plan to buy British chip powerhouse Arm for a cool $40 billion is just the latest move in the company's evolution from a gaming chip maker to a game changer in enterprise data centres.

Nvidia's goal is to take its high-powered processor technology and, through innovation, high-profile acquisitions such as Mellanox, Cumulus and Arm - in addition to strategic alliances with VMware, Check Point and Red Hat - provide a full-stack hardware and software offering that brings the power of artificial intelligence (AI) to companies that are modernising their data centres.

The company's charismatic and visionary founder and CEO Jen-Hsun (Jensen) Huang "is betting that AI is not really niche, that it's something that's going to be a big part of most companies' data centres in the future," says Linley Gwennap, president of the Linley Group, a research and analysis firm that specialises in the semiconductor market.

While Nvidia's intentions around AI are clear, the company does face challenges related to integrating all of its acquisitions. Plus, the proposed Arm deal comes with additional complexity because of regulatory issues, as well as concerns over the degree to which Nvidia will allow Arm, which licenses its market-dominant technology to all comers, to continue to operate independently.

Nvidia GPUs go from gaming to supercomputers

Nvidia has a long history of innovation. Patrick Moorhead, founder and principal analyst at Moor Insights and Strategy, has been tracking Nvidia since the '90s, when Huang essentially created the consumer graphics industry. One of Nvidia's early wins was when Microsoft decided to use its graphics hardware in the Xbox.

Moorhead calls Huang, often seen at industry events sporting a black leather jacket, "one of the few remaining founder/CEO rock stars."

Nvidia quickly became the leader in high-speed chips for gaming devices, with its graphics processing units or GPUs. Around 10 years ago, Nvidia began to investigate how GPUs could be deployed in other markets, Moorhead says.

One clear opportunity was to use GPUs for specialised applications that required exponentially higher performance than what was available in standard CPUs (central processing units) from traditional chip vendors like Intel or AMD.

Nvidia's GPUs found a home in niche applications like high-performance computers (HPC), supercomputers and machine learning applications, and they were adopted early on by the hyper-scale cloud service providers.

The next obvious target was to expand from specialised high-end applications to the broader enterprise data centre market, where the type of acceleration and AI-based data processing and analytics that GPUs deliver can be applied to storage, networking and security functions.

Huang likes to say that Nvidia isn't trying to take market share away from any incumbent vendors; it's all about creating new markets based on AI technology. That may be a tad disingenuous, but Gwennap agrees that Nvidia isn't specifically targeting traditional data centre incumbents.

GPUs aren't going to replace Intel Xeon x86 chips in the data centre for most applications, and Mellanox high-performance switches and smartNICs aren't going to push out legacy Cisco or Broadcom gear, Gwennap says.

Nvidia also makes its own DGX line of powerful servers specifically designed for data centre AI applications, and the company can even deliver the servers as a turnkey hardware, software and services offering.

Still, Moorhead says that Nvidia does not want to be a box maker and is fully aware that it would not make good business sense to start competing with the same server OEMs that are its main customers.

The Nvidia servers are suited primarily for rarefied academic and scientific research projects that require an extraordinary amount of processing power. They are also being deployed as the "tip of the spear" at advanced research groups within some of the most innovative enterprises that want to get a feel for what AI systems could do if widely deployed across the data centre.

"If you're just running a bunch of traditional applications, you probably want to keep using traditional suppliers. But lots of companies are looking at AI as transformative, they're looking at more intensive research type apps, and Nvidia offers solutions more powerful than what you can get from traditional suppliers," Gwennap adds.

How Nvidia's acquisitions fit together

In the first fiscal quarter of 2020, Nvidia reported data centre revenue of $1.1 billion, up 80% from the prior year, primarily through the sale of GPUs and PCI cards to server OEMs. But to create a full stack offering, it needed to go outside the company.

In April, Nvidia closed on the $7 billion purchase of Mellanox, which makes high-speed Ethernet and InfiniBand switches, as well as smartNICs. In May, Nvidia announced plans to purchase Cumulus Networks, which offers a Linux-based network operating system.

The Mellanox deal is important because it gives Nvidia a networking play. Mellanox smartNIC technology enables enterprises to offload data-intensive functions, such as storage, security and networking, from the server CPU to the smartNIC, so the server CPU can focus on application processing.

High-speed networking and CPU-offload are particularly critical in today's data centres, where applications are being broken up into smaller increments through virtualisation and containerisation, companies are adopting a scale-out architecture, and east-west traffic is increasing exponentially.

The smartNIC not only performs processing faster but also adds programmability and intelligence that plays into the concept of software-defined networking (SDN) and software-defined data centres.

It didn't take long for the Mellanox deal to bear fruit. In October, Nvidia announced a new kind of processor that it calls a data processing unit or DPU, which is based on the Mellanox smartNIC technology.

Under the brand name BlueField, Nvidia has integrated a traditional Arm CPU, the DPU (smartNIC) and a powerful GPU into what it calls a "data centre on a chip" architecture. In addition, Nvidia announced a software architecture to run on top of the hardware and an SDK to enable developers to create new apps designed to run on the platform.

With the Cumulus acquisition, Nvidia gets a network operating system, which fills in another piece of the puzzle. Cumulus also offers a network troubleshooting and management tool called NetQ. The deal solidifies Nvidia's longstanding relationship with Cumulus and will make it easier to more deeply integrate technologies among players who already work together well. For example, Mellanox switches currently ship with Cumulus Linux.

Kevin Deierling, senior vice president of networking at Nvidia, says that in today's data centres it's no longer practical for data centre managers to go around programming individual boxes through a command line interface.

"With new modern workloads everything is moving at such velocity that there's no way for humans to keep up." Everything has to be automated and integrated, and that's where Cumulus comes in, with an open networking framework that companies can use across hybrid cloud environments.

Moorhead adds, "It's about making the on-premises infrastructure more cloud like. It's the notion that one operator can manage tens of thousands of servers from one console," or that security teams can automate functions like deep packet inspection, encryption or anomaly detection in order to more efficiently and intelligently mitigate threats.

Uncertainty surrounds Arm acquisition

The blockbuster Arm acquisition isn't a done deal, cautions IDC analyst Shane Rau, and there is a chance that it could be blocked. The purchase needs to be approved by regulators in the U.S., where Nvidia is headquartered; the U.K., where Arm is headquartered; and China. The seller is the Japanese company SoftBank.

Even if the deal goes through, it will take 18 months to clear regulatory hurdles, and it will likely be another year after that before the effects are felt in the marketplace.

"The mismatch between the two businesses will make it a challenge to pull off," Gwennap adds. He points out that Nvidia sells chips and cards, while Arm licenses intellectual property. Nvidia generates most of its revenue from PC and data centre customers, while Arm gets its revenue mainly from smartphones, consumer electronics, and microcontrollers. In fact, 90 per cent of smartphones on the market run Arm processors.

Nvidia will need to walk a fine line, Rau says. If Arm customers, like Apple, Qualcomm, Samsung or Huawei, get the impression that Nvidia, also an Arm customer, is getting special treatment, such as advance notice of Arm's technology roadmap, they might consider looking elsewhere for a chip supplier.

When the deal was announced, Huang pledged that Arm will remain independent. But Rau points out that Nvidia and Arm already have two close partnerships, one that enabled Nvidia GPUs to be compatible with the Arm ecosystem, and one that brought together Nvidia's deep learning accelerator architecture with Arm's AI platform.

"Despite Nvidia's promise to keep Arm neutral, IDC believes that Nvidia will be tempted to take advantage of Arm IP and technology pipeline where it can for its core businesses," Rau says. "Nvidia's data centre business, which, post the Mellanox acquisition, became Nvidia's largest business in Nvidia's previous fiscal quarter, is synergistic with Arm's emerging server CPU IP business."

The most likely scenario is that Nvidia takes a two-pronged approach, keeping the mobile side of the business independent in order to keep Arm's smartphone customers happy, while at the same time investing heavily into research and development on the data centre side.

From an enterprise customer perspective, additional technology integration between Nvidia and Arm and the innovation that can result from the combined engineering talent can only be a positive.

Nvidia allies with key industry players

Most enterprise IT execs will never sit face to face with an Nvidia sales rep or sign a direct deal with Nvidia. The company will remain a provider of enabling technology that will be embedded into systems sold by OEMs. But Nvidia is starting to work with mainstream networking companies to accelerate existing processes and to enable new AI functionality.

Check Point is partnering with Nvidia to help companies shore up their security defences and deploy zero-trust architectures. Specifically, Check Point is using the power of Nvidia's DPUs to offload the security processing workload from server CPUs and distribute that processing load to a powerful network card that can be deployed closer to the network edge.

The additional processing power enables customers to load Check Point agents on IoT devices, retail point-of-sale systems, and bank ATMs, for example. And it makes it easier for companies to implement zero trust with microsegmentation to protect against east-west traffic attacks.

VMware and Nvidia recently announced a collaboration that would integrate Nvidia's DPUs into VMware's vSphere, Cloud Foundation and Tanzu platforms. The goal is to accelerate AI adoption, enable enterprises to manage applications with a single set of tools, and deploy AI-ready infrastructure across the data centre, cloud and edge.

The partnership is also expected to deliver expanded application acceleration to all enterprise workloads and provide an extra layer of security through SmartNICs and programmable DPUs.

Red Hat is also strengthening its longstanding alliance with Nvidia "to accelerate the enterprise adoption of AI, machine learning and data analytics workloads in production environments."

By combining Red Hat's open source software with Nvidia's GPU hardware and acceleration libraries, the companies are offering new capabilities for running GPU-accelerated workloads across hybrid cloud architectures.

Betting on Nvidia

While AI is certainly more of a concept than a reality for most enterprise data centres, the potential is tantalising. AI can help companies across nearly all verticals. Think natural language processing, robotics, data analysis in healthcare, recommendation engines for retailers, etc.

The path for most enterprises that are looking to add AI capabilities to their data centre is to build "islands of AI," as Deierling puts it, as well as to backfill existing servers with GPUs. Nvidia's big bet is that enterprises will start their AI deployments sooner rather than later.

Gwennap says he doesn't recommend betting against Nvidia. He sums up the company's journey this way: "Jensen started Nvidia when a GPU wasn't even a thing. He created a market for graphics accelerators in the '90s. He clearly saw the opportunity for AI ahead of most other people and was able to build up Nvidia's capabilities before others jumped in.

"He's very bullish on the future of AI and rightly so; there's tremendous momentum for AI technology in the market right now, we're seeing more apps every day and people finding new ways to employ AI."