NextGen Storage Virtualization

June 29th, 2020

Robert Young
Sr. Systems Engineer
Data Storage Services

Computing has evolved from Mainframe to Client-Server to the Third Platform, or Platform 3 as a co-worker (shout-out to Cuz!) described this computing paradigm. Basically, Platform 3 is mobile and cloud analytics with a touch of IoT.

Storage virtualization has followed a similar evolutionary trail. Initial virtualization went from “simple” RAID arrays (two-member hardware mirrors) to appliance-based virtualization products, bringing additional RAID array capabilities to the end user. These capabilities included increased performance, replication, migration, and the de-coupling of the host from the storage frame itself. By front-ending the storage frame, the pain of migrations when the frame lease ran down or the frame aged off was greatly lessened. In fact, when IBM studied how customers were using its SAN Volume Controller (SVC), it was quite surprised that many purchases were being used for just that purpose: storage migrations. Keep in mind, this survey was conducted at a time when storage migrations were quite painful due to Windows physical machine domination, among other challenges. This article examines recent introductions to storage virtualization technology and peers into near-term futures that will provide much greater autonomy. At some point, the machine must run itself.

First, let’s take a walk through some recent virtualization improvements.

Storage Virtualization Platform 2.5

In the last few years, several vendors have introduced new virtualization capabilities, IBM HyperSwap among them. With HyperSwap, the master UID (the unique identifier, a serial number or WWN as seen in the storage array, by which a host OS registers a LUN or disk) floats between storage arrays, and if the array, its storage, or its connection goes offline, the other array takes over as the volume master. Other mechanisms can also move the master. For example, when the write IO load exceeds 75% for a period, that array essentially routes write IO to the other array. There are several failure scenarios; for instance, the auxiliary volume may not come into play if the primary volume is still accessible via private array-to-array SAN channels. Essentially, HyperSwap mirrors a primary volume to an auxiliary volume on a second storage array, so the volume does not go away in the event of a primary storage array outage, adding another level of redundancy and availability to the virtualization mix.
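To make the floating-master idea concrete, here is a minimal Python sketch. It is a toy model, not IBM's implementation: the class name, the sampling scheme, and the "three consecutive samples" definition of "a period" are all assumptions made for illustration. It also assumes the 75% rule refers to the share of writes arriving at the non-master array.

```python
from dataclasses import dataclass, field

# Illustrative assumptions only; none of these names or values come from IBM.
WRITE_SHARE_THRESHOLD = 0.75  # the "exceeds 75%" figure mentioned in the article
SUSTAINED_SAMPLES = 3         # hypothetical stand-in for "a period"


@dataclass
class MirroredVolume:
    uid: str                   # what the host sees; it never changes
    arrays: tuple = ("A", "B")
    master: str = "A"
    online: dict = field(default_factory=dict, init=False)
    streak: int = field(default=0, init=False)

    def __post_init__(self):
        self.online = {name: True for name in self.arrays}

    def other(self, array):
        """Return the partner of the given array."""
        return self.arrays[1] if array == self.arrays[0] else self.arrays[0]

    def fail(self, array):
        """Mark an array offline; a surviving partner takes over as master."""
        self.online[array] = False
        partner = self.other(array)
        if array == self.master and self.online[partner]:
            self.master = partner
            self.streak = 0

    def record_writes(self, writes_by_array):
        """Feed one sample of write counts per array and return the master.

        The master moves when the non-master array has received more than
        the threshold share of writes for enough consecutive samples.
        """
        total = sum(writes_by_array.values())
        peer = self.other(self.master)
        share = writes_by_array.get(peer, 0) / total if total else 0.0
        self.streak = self.streak + 1 if share > WRITE_SHARE_THRESHOLD else 0
        if self.streak >= SUSTAINED_SAMPLES and self.online[peer]:
            self.master = peer
            self.streak = 0
        return self.master
```

Feeding three samples such as `{"A": 10, "B": 90}` moves the master from A to B, and calling `fail("B")` afterwards moves it back, all without the UID changing.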

Likewise, a similar high availability solution is found in Pure Storage’s Purity ActiveCluster. In an ActiveCluster, the same volume (UID) is online and active in two separate arrays. Describing the advantages in uniform access mode, the cited knowledge base article states:

“Other non-Pure solutions manage optimized paths on a per-volume basis where each volume can only have optimized paths on one array or the other, but not both at the same time. With Pure Storage Purity ActiveCluster there are no such management headaches. ActiveCluster makes use of ALUA protocol to expose paths to local hosts as active/optimized paths and expose paths to remote hosts as active/non-optimized. However, there are advantages in the ActiveCluster implementation. In ActiveCluster, volumes in stretched pods are read/write on both arrays. There is no such thing as a passive volume that cannot service both reads and writes.” [1]
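The ALUA behavior in that quote can be sketched in a few lines of Python. This is an illustration of the idea only, not Pure Storage or VMware code: the `Path` class, the site labels, and the path-selection helper are hypothetical. The only things taken from the quote are the two path states and the rule that local paths are optimized while remote paths are not.

```python
from dataclasses import dataclass

ACTIVE_OPTIMIZED = "active/optimized"
ACTIVE_NON_OPTIMIZED = "active/non-optimized"


@dataclass(frozen=True)
class Path:
    """One host-to-array path for a stretched volume (hypothetical model)."""
    host: str
    host_site: str
    array: str
    array_site: str


def alua_state(path):
    """Local paths are advertised as optimized, remote paths as non-optimized."""
    if path.host_site == path.array_site:
        return ACTIVE_OPTIMIZED
    return ACTIVE_NON_OPTIMIZED


def preferred_paths(paths):
    """Use optimized paths when any survive, otherwise fall back to the rest.

    Both states are active, so the volume stays readable and writable
    through whichever array is still reachable.
    """
    optimized = [p for p in paths if alua_state(p) == ACTIVE_OPTIMIZED]
    return optimized or list(paths)
```

A host at site 1 with paths to an array at each site would prefer the site 1 array, and would keep doing IO through the site 2 array if the local paths disappeared.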

Here are links to two videos that highlight how ActiveCluster works: here and here.

Storage Virtualization Platform 3

What is Storage Virtualization Platform 3 (SVP3)? It is where virtualization is essentially zero-touch transparent. It is just there, it works, and you forget about the sophistication.

Here are three things I propose that identify an SVP3 solution:

  1. Totally host transparent. Volume moves/migrations require no changes at the host level short of automated pathing cleanups.
  2. No in-band virtualization appliances. They add latency and are bandwidth constrained. More on latency in a follow-up article.
  3. Sophisticated, policy-driven autonomic volume moves. Expect to see considerable maturity in this space in the next year. More than a wish list item, the idea here is the volume goes to where it belongs based on utilization and performance needs.
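Point 3 is easier to picture with a small example. The Python sketch below is a hypothetical policy engine, not any vendor's product: the `Array` and `Volume` classes, the 80% utilization ceiling, and the "lowest latency wins" tie-break are all assumptions. It shows a volume being sent to where it belongs based on utilization and performance needs.

```python
from dataclasses import dataclass

MAX_UTILIZATION = 0.80  # hypothetical capacity ceiling per array


@dataclass
class Array:
    name: str
    capacity_tb: float
    used_tb: float
    latency_ms: float      # observed latency on this array


@dataclass
class Volume:
    uid: str               # unchanged by any move
    size_tb: float
    max_latency_ms: float  # the volume's performance policy
    array: str             # name of the array currently holding it


def fits(volume, array):
    """True if the array could host the volume without breaking either policy."""
    projected = (array.used_tb + volume.size_tb) / array.capacity_tb
    return projected <= MAX_UTILIZATION and array.latency_ms <= volume.max_latency_ms


def plan_move(volume, arrays):
    """Return the name of a better array, or None if no move is needed or possible."""
    current = arrays[volume.array]
    utilization = current.used_tb / current.capacity_tb
    if utilization <= MAX_UTILIZATION and current.latency_ms <= volume.max_latency_ms:
        return None  # current placement already satisfies the policy
    candidates = [a for a in arrays.values()
                  if a.name != volume.array and fits(volume, a)]
    if not candidates:
        return None
    return min(candidates, key=lambda a: a.latency_ms).name
```

In an SVP3 world, the result of `plan_move` would be handed to a transparent move mechanism, so the host never notices the relocation.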

What does this look like? Really soon now, an InfiniBox array will act as a “node” in a federated cluster of InfiniBoxes that Infinidat is calling an Availability Zone (AZ). The AZ will be managed by a cloud-based control plane called InfiniVerse that will treat the AZ as a single, unified storage cloud. Transparent Data Mobility (TDM) will allow volumes to be moved from InfiniBox to InfiniBox with no downtime and with no involvement in, or awareness of, the relocation at the server level.

Describing the value-add of TDM, Infinidat states: “Data placement and mobility decisions within the cluster can be executed manually or by an AI agent that monitors and automatically responds to quality of service policy compliance, all controlled and monitored by InfiniVerse’s simple and intuitive UI and REST API” and “it enables on-demand workload mobility from rack to rack within an AZ, with no downtime and no involvement or awareness at the server level. TDM replicates the capabilities of a virtualization appliance.” [2]

The UID will not change; the volume is the same. This will make storage migration exercises a thing of the past. By imitating the capability of a virtualization appliance, TDM will eliminate the cost and complexity of the appliance itself. Additionally, arrays are becoming far more bandwidth-capable, so an appliance sitting in the storage path runs a considerable risk of being overrun (point 2 above). Also, as Storage Class Memory (SCM) SSDs, Optane for example, become more commonplace in the arrays, the additional “IO hop” of an appliance will add impactful latency to IO requests.
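A little arithmetic shows why the extra hop matters more as media gets faster. The latencies in this Python sketch are round, hypothetical numbers chosen for illustration, not measurements of any product. The point is only that a fixed per-IO hop cost grows as a fraction of total latency when the media itself speeds up.

```python
def added_latency_pct(media_latency_us, appliance_hop_us):
    """Percentage increase in IO latency caused by one extra in-band hop."""
    return 100.0 * appliance_hop_us / media_latency_us


# Hypothetical values in microseconds, for illustration only.
HOP_US = 50.0
MEDIA_US = {
    "spinning disk": 5000.0,
    "NAND SSD": 200.0,
    "SCM SSD": 20.0,
}

for name, latency_us in MEDIA_US.items():
    pct = added_latency_pct(latency_us, HOP_US)
    print(f"{name}: +{pct:.0f}% from the appliance hop")
```

With these assumed values, the same 50 microsecond hop adds about 1% to a spinning-disk IO, 25% to a NAND SSD IO, and 250% to an SCM IO.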

Recently, Dell introduced its PowerStore array [3], which uses a similar approach to move volumes from one array to a federated array member. At initial roll-out, the two supported volume migration methods are manual volume migrations and assisted volume migrations. Initially, the technology is not automatic in nature, and the best practice is to stop IO prior to migration. We expect those capabilities to grow as the product matures.

Some competitors downplay federated storage clustering, but maintaining cache consistency across arrays introduces latency. As mentioned, a follow-up article will speak to latency and present additional thoughts around its role in changing what next-gen storage virtualization looks like. That post will also cover what autonomic volume moves look like, and why.

Finally, SVP3 will make more sense once faster and more expensive hardware shows up in the next few months. A TDM-like capability will add an interesting wrinkle to how this plays out.

More Information

Mainline works with all the major vendors and has deep knowledge across the entire spectrum of storage options, enabling clients to select the best offerings to suit their business needs and budget. Contact your Mainline Account Executive directly, or contact us with any questions.

[1] https://kb.vmware.com/s/article/51656

[2] https://infinidat.com/sites/default/files/resource-pdfs/Infinidat%20Elastic%20Data%20Fabric.pdf

[3] https://www.delltechnologies.com/en-us/storage/powerstore-storage-appliance.htm
