Cloud storage, in both public and private configurations, has been a boon for organizations that are looking to transform their data management capabilities. Affordable, resilient, scalable storage is available both on-premises (the private cloud) and off-premises (the public cloud). The public cloud storage market has been dominated by Amazon, Microsoft, Google and IBM.
Public Cloud for Backup and Archive
When it comes to public cloud storage, data protection, namely backup and archive jobs, is a logical workload. Of the two, archiving is the better fit because of the characteristics of public cloud storage. Public cloud storage providers offer different options depending on the use case. Amazon Web Services (AWS), for example, offers S3-Standard, S3-Standard Infrequent Access (IA), S3-One Zone IA, and Glacier. Each storage service has a different price point, usually quoted per GB per month, and a corresponding performance profile. For instance, it can take hours to get data back from Glacier, which makes it suitable for archiving but not for backup workloads. S3-Standard is a better choice for backups and is often the only class that backup applications support. Google Cloud Platform (GCP) offers storage for frequently accessed data, in both multi-regional and regional configurations for availability purposes, as well as Nearline and Coldline storage for archive data. GCP's storage model changes the cost of the storage, but not the underlying performance. You still need to be careful when selecting a storage tier, however. For example, if you put all your data in Coldline and access it frequently, it will cost more than Nearline.
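To make the tier trade-off concrete, here is a minimal sketch of how a frequently read archive tier can end up costing more than a warmer one. The prices are hypothetical placeholders, not any provider's actual rates, and the `Tier`, `monthly_cost`, and `cheaper_tier` names are made up for illustration. Substitute the numbers from your provider's current price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    storage_per_gb_month: float  # hypothetical price, USD
    retrieval_per_gb: float      # hypothetical price, USD


def monthly_cost(tier: Tier, stored_gb: float, read_gb: float) -> float:
    """Storage charge plus retrieval charge for one month."""
    return stored_gb * tier.storage_per_gb_month + read_gb * tier.retrieval_per_gb


# Placeholder numbers chosen only to illustrate the shape of the trade-off:
# the colder tier is cheaper to store in but more expensive to read from.
NEAR = Tier("nearline", storage_per_gb_month=0.010, retrieval_per_gb=0.01)
COLD = Tier("coldline", storage_per_gb_month=0.004, retrieval_per_gb=0.05)


def cheaper_tier(stored_gb: float, read_gb: float) -> str:
    """Name of the tier with the lower monthly bill for this access pattern."""
    near = monthly_cost(NEAR, stored_gb, read_gb)
    cold = monthly_cost(COLD, stored_gb, read_gb)
    return NEAR.name if near <= cold else COLD.name


if __name__ == "__main__":
    # 10 TB stored, read rarely versus read heavily each month.
    print(cheaper_tier(10_000, 100))
    print(cheaper_tier(10_000, 5_000))
```

With these placeholder rates, the colder tier wins when only a small fraction of the data is read each month, and the warmer tier wins once reads become frequent.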
Public Cloud Egress Charges
Another cost consideration with public cloud providers is egress charges. Some storage classes charge for data that is read out of cloud storage; AWS S3-Standard-IA, for example, has a per-GB retrieval cost. More importantly, there are networking egress charges to send the data from AWS storage back to on-prem storage. Much like unexpected items on a cell phone bill, these egress charges can become “hidden charges” that are not accounted for when planning a move to public cloud storage.
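A rough way to surface these charges during planning is to estimate the bill for a restore before it happens. The sketch below is illustrative only: the `restore_cost` function is hypothetical, and the per-GB rates passed to it are placeholders rather than actual AWS prices.

```python
def restore_cost(restore_gb: float,
                 retrieval_per_gb: float,
                 network_egress_per_gb: float) -> dict:
    """Break a restore bill into its retrieval and network egress parts."""
    retrieval = restore_gb * retrieval_per_gb
    egress = restore_gb * network_egress_per_gb
    return {
        "retrieval": retrieval,
        "network_egress": egress,
        "total": retrieval + egress,
    }


if __name__ == "__main__":
    # Placeholder rates: restoring 2 TB from cloud storage back to on-prem.
    bill = restore_cost(2_000, retrieval_per_gb=0.01, network_egress_per_gb=0.09)
    for item, amount in bill.items():
        print(f"{item}: ${amount:,.2f}")
```

Keeping the two components separate makes it easier to see which one dominates. With these placeholder rates, it is the network egress charge rather than the retrieval charge.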
Public Cloud Class of Storage
Furthermore, data in public cloud storage can be life-cycled to different classes of storage based on policies. Data that is initially sent to S3-Standard IA could later be moved by AWS to Glacier. If the archive application is not aware of this data movement, it will not be able to locate the data when it is requested. The retrieval time will also be different, and additional egress charges might apply.
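One way an archive application could stay aware of this movement is to mirror the lifecycle policy and work out which class an object should be in before requesting it. The following is a minimal sketch under stated assumptions: the day thresholds and the `expected_storage_class` and `needs_slow_restore` helpers are hypothetical, and this is not how any particular product tracks lifecycle transitions.

```python
from typing import List, Tuple

# Hypothetical lifecycle policy: (minimum age in days, storage class).
# The thresholds are illustrative, not a recommendation.
TRANSITIONS: List[Tuple[int, str]] = [
    (0, "STANDARD_IA"),
    (90, "GLACIER"),
]


def expected_storage_class(age_days: int,
                           transitions: List[Tuple[int, str]] = TRANSITIONS) -> str:
    """Return the class an object of this age should have been moved to."""
    current = None
    # Walk the rules from youngest to oldest; the last one that applies wins.
    for min_age, storage_class in sorted(transitions):
        if age_days >= min_age:
            current = storage_class
    if current is None:
        raise ValueError("no transition applies to this age")
    return current


def needs_slow_restore(age_days: int) -> bool:
    """True when the object has likely moved to the archive class."""
    return expected_storage_class(age_days) == "GLACIER"


if __name__ == "__main__":
    for age in (10, 90, 400):
        print(age, expected_storage_class(age), needs_slow_restore(age))
```

An application that runs a check like this can warn the user about a longer retrieval time, and the likely extra charges, before the request is submitted.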
IBM Spectrum Protect
A popular backup and archive application is IBM’s Spectrum Protect (SP). SP can back up and archive to the public cloud. It has native REST API integration with AWS and Azure, and GCP is on the roadmap. The native integration is important because it means SP does not require a gateway device to communicate with public cloud storage. The public cloud storage can be used for archiving, which is the best use case because archived data is usually accessed infrequently, minimizing those egress charges. Additionally, SP can tier to the cloud by making the backup pool “spill over” to public cloud storage. This tiering keeps the latest data on-prem to minimize recovery times, while older backups are stored in lower-cost public cloud storage.
Public Cloud for Disaster Recovery
What we haven’t discussed is how a backup application like SP can use the public cloud for disaster recovery purposes. I’ll address that use case in a future blog post.
Mainline offers cloud storage solutions to suit your business needs and budget. Contact your Mainline Account Executive, or contact Mainline directly, for more information on storage systems for your data center, the cloud, or remote sites.
Additional Information:
IBM Spectrum Copy Data Management – Building a Private Cloud