ARN

How to back up hyper-converged infrastructure

Backing up hyper-converged infrastructure means backing up virtual machines

Enterprises running hypervisors on hyper-converged infrastructure (HCI) systems typically have back-up options available to them that are not available to those running on generic hardware.

Such customers may also have additional back-up challenges depending on the HCI vendor and hypervisor they have chosen. Let’s take a look.

Traditional hypervisor back-up

When backing up physical servers, CIOs might run a full back-up on all of them at one time, and that’s fine. It’s an entirely different situation if all of those servers are actually virtual machines sharing the same physical server.

Even running several full-file incremental back-ups (backing up the entire file even if only one byte has changed) at the same time can significantly affect the performance of a hypervisor. That’s why most customers using server-virtualisation products such as VMware or Hyper-V have switched to more hypervisor-friendly methods.

Options include block-level incremental and source-side deduplication methods. While they’re not designed specifically for VM back-ups, they can be very helpful because both reduce the I/O requirements of a back-up by an order of magnitude or more.
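To make the difference concrete, here is a minimal sketch of block-level change detection compared with a full-file incremental. The 4 KiB block size, the function names, and the 1 MiB test file are illustrative assumptions, not taken from any back-up product.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative block size, not taken from any product


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return indexes of blocks in `new` that differ from `old` or did not exist."""
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    return [
        i for i, digest in enumerate(new_h)
        if i >= len(old_h) or digest != old_h[i]
    ]


# A 1 MiB "file" in which a single byte changes between back-ups.
previous = bytes(1024 * 1024)
modified = bytearray(previous)
modified[500_000] = 1
current = bytes(modified)

dirty = changed_blocks(previous, current)
full_file_bytes = len(current)               # full-file incremental resends the whole file
block_level_bytes = len(dirty) * BLOCK_SIZE  # block-level resends only the dirty blocks

print(dirty, full_file_bytes, block_level_bytes)
```

In this toy case one changed byte costs a full-file incremental 1,048,576 bytes but a block-level incremental only 4,096 bytes. Real savings depend on block size and on how changes are spread across the file.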

They also make it possible to run VM-level back-ups without impacting the overall performance of the hypervisor. One downside is that this approach reduces the efficiency of virtualisation because it requires the installation and maintenance of client software on each VM. That’s why most people backing up VMs opt for hypervisor-level back-ups.
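For the VM-level approach, source-side deduplication can be sketched roughly as follows. The client inside each VM hashes its data and sends only chunks the back-up server has not already stored. `BackupServer`, `client_backup`, and the fixed 4 KiB chunk size are hypothetical names and values chosen for illustration. Real products typically use more sophisticated chunking.

```python
import hashlib

CHUNK_SIZE = 4096  # illustrative fixed chunk size


class BackupServer:
    """Hypothetical back-up target that stores each unique chunk once."""

    def __init__(self) -> None:
        self.chunks: dict[str, bytes] = {}

    def missing(self, digests: list[str]) -> set[str]:
        """Tell the client which digests the server has not seen yet."""
        return {d for d in digests if d not in self.chunks}

    def store(self, digest: str, chunk: bytes) -> None:
        self.chunks[digest] = chunk


def client_backup(data: bytes, server: BackupServer) -> int:
    """Back up `data` from inside the VM and return the bytes actually sent."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    digests = [hashlib.sha256(c).hexdigest() for c in chunks]
    needed = server.missing(digests)
    sent = 0
    for digest, chunk in zip(digests, chunks):
        if digest in needed:
            server.store(digest, chunk)
            needed.discard(digest)  # a chunk repeated within one job is sent once
            sent += len(chunk)
    return sent


server = BackupServer()
os_image = b"A" * CHUNK_SIZE * 100   # stands in for a shared OS image
vm1 = os_image + b"B" * CHUNK_SIZE   # VM 1: OS image plus one unique chunk
vm2 = os_image + b"C" * CHUNK_SIZE   # VM 2: same OS image, different data

sent_vm1 = client_backup(vm1, server)
sent_vm2 = client_backup(vm2, server)
print(sent_vm1, sent_vm2, len(server.chunks))
```

The hashing happens inside the VM, which is why this method needs client software on every guest.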

Hypervisor-level back-ups utilise software or APIs at the hypervisor level. Each major hypervisor offers such an API. Back-up systems interfacing with these APIs are typically able to ask the hypervisor for the blocks that have changed since the last successful back-up.

The back-up system backs up only those changed blocks, which significantly reduces both the I/O requirement and the amount of CPU needed to identify and locate changed blocks. The combined effect of these two features significantly reduces the impact of back-ups on the hypervisor.
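The idea of asking the hypervisor for changed blocks can be illustrated with a toy model. `TrackedDisk` and `incremental_backup` are hypothetical names, and this is a sketch of the general technique rather than any vendor's API (VMware's Changed Block Tracking is one real-world implementation of the concept).

```python
class TrackedDisk:
    """Toy virtual disk that records which blocks change, as a hypervisor might."""

    def __init__(self, num_blocks: int, block_size: int = 4096) -> None:
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]
        self.change_id = 0                    # increases on every write
        self.last_changed = [0] * num_blocks  # change_id of the last write per block

    def write(self, index: int, data: bytes) -> None:
        self.change_id += 1
        # Pad or truncate the payload to exactly one block.
        self.blocks[index] = data.ljust(self.block_size, b"\0")[:self.block_size]
        self.last_changed[index] = self.change_id

    def changed_since(self, change_id: int) -> list[int]:
        """Blocks written after `change_id`; block contents are never scanned."""
        return [i for i, c in enumerate(self.last_changed) if c > change_id]


def incremental_backup(disk: TrackedDisk, repo: dict, since: int) -> int:
    """Copy only changed blocks into `repo`; return the change ID to use next time."""
    upto = disk.change_id
    for i in disk.changed_since(since):
        repo[i] = disk.blocks[i]
    return upto


disk = TrackedDisk(num_blocks=1000)
repo: dict[int, bytes] = {}

checkpoint = incremental_backup(disk, repo, since=-1)  # first run copies every block
disk.write(7, b"new data")
disk.write(42, b"more data")

to_copy = disk.changed_since(checkpoint)               # what the next run will read
checkpoint = incremental_backup(disk, repo, since=checkpoint)
print(to_copy, checkpoint)
```

Because the disk records changes at write time, the back-up system reads two blocks instead of comparing all 1,000, which is where both the I/O and the CPU savings come from.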

Snapshot-based back-up

Some storage products have connected hypervisor-back-up APIs with their snapshot capability as a back-up methodology. All customers need to do is to put their datastore on the storage system in question and provide the appropriate level of authentication into the hypervisor.

At the agreed-upon schedule, the snapshot system interfaces with the hypervisor, places the various VMs in the appropriate back-up mode, and takes a storage-level snapshot. The snapshot takes only a few seconds to make, after which the VMs can be taken out of back-up mode. This is faster than the previous back-up method and has a lower impact on hypervisor performance.
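The sequence described above (enter back-up mode, snapshot, leave back-up mode) might be orchestrated along these lines. `VM`, `StorageArray`, `backup_mode`, and `scheduled_snapshot` are hypothetical stand-ins. Real hypervisor and storage APIs differ and are not modelled here.

```python
from contextlib import contextmanager


class VM:
    """Hypothetical VM handle; real hypervisor APIs differ."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.in_backup_mode = False


@contextmanager
def backup_mode(vms):
    """Put VMs in back-up mode and always release them, even if an error occurs."""
    for vm in vms:
        vm.in_backup_mode = True
    try:
        yield
    finally:
        for vm in vms:
            vm.in_backup_mode = False


class StorageArray:
    """Hypothetical storage system holding the datastore."""

    def __init__(self) -> None:
        self.snapshots: list[dict] = []

    def snapshot(self, label: str, vms) -> dict:
        snap = {
            "label": label,
            "vms": [vm.name for vm in vms],
            # Record whether every VM was in back-up mode when the snapshot was taken.
            "consistent": all(vm.in_backup_mode for vm in vms),
        }
        self.snapshots.append(snap)
        return snap


def scheduled_snapshot(array: StorageArray, vms, label: str) -> dict:
    """Keep the back-up-mode window as short as the snapshot itself."""
    with backup_mode(vms):
        return array.snapshot(label, vms)


vms = [VM("web01"), VM("db01")]
array = StorageArray()
snap = scheduled_snapshot(array, vms, "nightly")
print(snap, [vm.in_backup_mode for vm in vms])
```

The context manager guarantees that VMs leave back-up mode even if the snapshot call fails, so a storage error cannot leave guests stuck in that state.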

The snapshots would need to be replicated to another storage system in order to be considered a valid back-up. Such replication typically requires very little bandwidth and CPU and is relatively easy to accomplish. This allows enterprises using this back-up method to have both an on-premises and off-premises copy without ever having to perform what most people consider to be a back-up.
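Replication is cheap because only the difference between consecutive snapshots has to cross the wire. The following sketch models snapshots as dictionaries mapping block numbers to contents. That representation and the function names are assumptions for illustration, and the model ignores deleted blocks for brevity.

```python
BLOCK = 4096  # illustrative block size


def snapshot_delta(previous: dict[int, bytes], current: dict[int, bytes]) -> dict[int, bytes]:
    """Blocks that are new or different in `current` relative to `previous`."""
    return {i: b for i, b in current.items() if previous.get(i) != b}


def replicate(delta: dict[int, bytes], replica: dict[int, bytes]) -> int:
    """Apply a delta to the remote copy and return the bytes sent."""
    replica.update(delta)
    return sum(len(b) for b in delta.values())


snap_monday = {i: bytes(BLOCK) for i in range(1000)}
snap_tuesday = dict(snap_monday)
snap_tuesday[3] = b"\x01" * BLOCK      # one existing block modified
snap_tuesday[1000] = b"\x02" * BLOCK   # one new block written

remote: dict[int, bytes] = {}
first_sync = replicate(snapshot_delta({}, snap_monday), remote)             # initial seed
second_sync = replicate(snapshot_delta(snap_monday, snap_tuesday), remote)  # delta only
print(first_sync, second_sync, remote == snap_tuesday)
```

After the initial seed, the second replication sends two blocks rather than the full 1,001, yet the remote copy matches the latest snapshot.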

Snapshot back-ups and HCI

Snapshot-based back-ups – as long as they are replicated to another location – offer some of the fastest recovery time objectives and tightest recovery point objectives in the industry. One downside to using them is that they traditionally require a separate storage product, one that might be quite expensive.

Many hyper-converged infrastructure systems take care of this downside by bundling compute, network, and storage in a single package that also typically includes snapshot-based data-protection mechanisms.

They use the snapshot-based back-up method but without requiring a separate storage system. This single, integrated system makes it easier to create and manage VMs, while the HCI product’s integrated snapshot-based back-up system makes sure those VMs are also being backed up.

Instead of compute, networking, storage, and back-up systems from four different vendors, the HCI world offers a single vendor that accomplishes all of that. This is a contributing reason why many companies, especially smaller ones, have really taken to HCI.

Some take integrated data protection even further, and integrate these back-ups into the cloud, providing a DR function as well. This allows CIOs to recover an entire data centre to the cloud, without ever running a traditional back-up or replicating data the way they would in a typical disaster recovery scenario.

What about lesser-used hypervisors?

Some HCI vendors do not use Hyper-V or VMware. For example, Scale Computing uses the KVM hypervisor and Nutanix uses the Acropolis Hypervisor (AHV), although Nutanix also supports VMware. The potential concern here is whether these hypervisors have the same level of data-protection APIs offered by VMware and Hyper-V and whether back-up vendors write to those APIs.

Customers using HCI vendors that use other-than-mainstream hypervisors have two basic choices for data protection: find a back-up-software vendor that supports the hypervisor or use the integrated data protection features available in the HCI product.

A few back-up vendors address the needs of this market. The integrated snapshot-based back-up systems available in both Scale Computing and Nutanix are on par with the snapshot-based back-up systems found in other HCI platforms.

The integrated data-protection and disaster recovery features from some HCI vendors meet or exceed what is possible using third-party tools. Such companies argue that it’s simply one more thing they are simplifying, and that’s a solid argument.

If a single product can meet compute, networking, and storage needs while also making sure a business is protected in case of an outage or disaster, that’s a compelling offer.