Reverse Archiving: How to Mitigate Risk and Reduce Costs From Long Term Back-Up Storage

December 21st, 2015
Chris Dedham
Business Continuity Specialist and Senior Storage Solutions Architect

 

Have you ever heard the expression, “You can have cheap, fast and reliable: pick two”? What if you could pick all three? That is possible if an organization makes the effort to separate the process of backing up data from the process of archiving it. The problem is that the line between backups and archives is blurred in most organizations. To illustrate this point, it may be time to proclaim a new storage management discipline called “Archups” (aka “Backives”). Archups are backups that have been attached to long-term retention policies. They are created when backups are retained for longer than 30 to 45 days.

Backups are used for data protection, and they exist so that restores can be launched. Restores, for the most part, require the latest copy of data; restoring data that is more than one week old is unusual, because data of that age is rarely current enough for the environment. When data is restored, it is expected to be copied back as fast as possible, since one or more systems may be down or only partially functioning. Data that is older than 30 to 45 days is usually recalled for legal, compliance or regulatory reasons. The speed of retrieving older data is not as critical as the speed of a restore, but if a policy says the data exists and it is requested, it must be accessible.
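The retention rule above can be expressed as a simple check. The following is a minimal sketch, assuming the upper bound of the 30 to 45 day range as the cutoff; the `BackupJob` type, the job names and the retention values are hypothetical and only serve to illustrate the idea.

```python
from dataclasses import dataclass

# Assumption: use the upper end of the article's 30-45 day range as the cutoff.
ARCHUP_THRESHOLD_DAYS = 45


@dataclass
class BackupJob:
    name: str            # hypothetical job label
    retention_days: int  # how long the backup copy is kept


def is_archup(job: BackupJob, threshold_days: int = ARCHUP_THRESHOLD_DAYS) -> bool:
    """Return True when a backup is retained past the restore-oriented window."""
    return job.retention_days > threshold_days


# Hypothetical jobs: a two-week operational backup and a seven-year retention job.
jobs = [
    BackupJob("daily-fileserver", 14),
    BackupJob("monthly-finance", 7 * 365),
]
archups = [job.name for job in jobs if is_archup(job)]
print(archups)
```

Only the long-retention job is flagged here; in a real environment the retention values would come from the backup software's policy definitions.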

The difference between Archups and Archives is that Archups represent what can be defined as Passive Archiving, in contrast to Active Archiving, which is conducted from the system of record. Active Archives, by definition, are single copies. Archups represent multiple copies of the same data, generated by the backup process that runs every day. Active Archiving is triggered by a time threshold or an event, at which point only one unique copy of the data is placed in the proper storage repository. The benefits of Active Archiving are significant:

  1. The size of the production applications is kept smaller, which increases their performance and reduces their backup windows.
  2. Production storage is consumed at a slower rate, providing cost avoidance of a more expensive storage tier. A lower-cost storage tier can be substituted for archiving.
  3. Active Archives are discovered by intelligent search expressions, making the search process less time consuming.
  4. The Active Archived objects can be kept on immutable storage to guarantee chain of custody.
  5. The lifecycle of the Active Archive can be managed to ensure proper disposal of the data, when necessary, which mitigates risk.
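To make the time-threshold trigger concrete, here is a minimal sketch of Active Archiving under stated assumptions: the system of record and the archive are modeled as plain dictionaries, and `archive_by_age`, the record keys and the 90-day threshold are all hypothetical. Real Active Archiving products integrate with the application itself, which this sketch does not attempt to show.

```python
from datetime import date, timedelta


def archive_by_age(production: dict, archive: dict, today: date, max_age_days: int) -> list:
    """Move records older than max_age_days from production to the archive.

    Each record is stored as key -> (last_modified, payload).
    """
    cutoff = today - timedelta(days=max_age_days)
    # Collect keys first so the dictionary is not modified while iterating over it.
    moved = [key for key, (last_modified, _) in production.items() if last_modified < cutoff]
    for key in moved:
        # pop() removes the record from production, so exactly one copy remains.
        archive[key] = production.pop(key)
    return sorted(moved)


# Hypothetical records in the system of record.
production = {
    "inv-001": (date(2015, 1, 5), "old invoice"),
    "inv-002": (date(2015, 12, 1), "recent invoice"),
}
archive = {}
moved = archive_by_age(production, archive, today=date(2015, 12, 21), max_age_days=90)
print(moved, sorted(production), sorted(archive))
```

The record that crosses the threshold leaves production and exists once in the archive, which is what keeps the production footprint, and therefore the backup window, smaller.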

Archups, on the other hand, are stored within the backup storage pools designated for data protection; consequently, they can create a number of challenges to the data protection environment.

  1. Archups can consume high performance backup storage that is in place to maximize restore throughput.
  2. Archups can inflate the size of the backup catalog because of high object counts. The increase in metadata management slows down the backup server, which may unnecessarily create the need for additional backup servers.
  3. The ability to search Archups is limited to the search capability of the backup catalog, which is typically not indexed for high performance, intelligent search inquiries.
  4. Backups are often copied for DR, so Archups could inadvertently be included in the DR copies, thereby consuming additional storage media and increasing offsite storage costs, or replication costs.
  5. The danger of not being able to discover data is greater with Archups than with Active Archives, which puts organizations at risk.

The argument for doing Active Archiving, instead of Archups, is strong. Fortunately, there are tools available today that can convert Archups to Active Archives, a process that could be called reverse archiving. Reverse archiving removes Archups and their pernicious side effects. The challenge with doing Active Archiving from the outset is that IT has to work with the application owners, since Active Archiving often requires the implementation of middleware that touches the production applications, also known as the system of record. But in the end, Active Archiving is faster, cheaper and more reliable.
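The core idea behind reverse archiving, collapsing the many copies held in Archups into single copies, can be sketched as follows. This is an illustration only and not a description of how any particular reverse archiving tool works: the `reverse_archive` function, the in-memory backup sets and the use of a SHA-256 content hash to detect identical copies are all assumptions.

```python
import hashlib


def reverse_archive(backup_sets: list) -> dict:
    """Collapse repeated copies across backup sets into one copy per unique object.

    Each backup set maps a path to its content as bytes. The result is keyed by
    (path, content digest), so distinct versions of a file are kept separately.
    """
    archive = {}
    for backup in backup_sets:
        for path, content in backup.items():
            digest = hashlib.sha256(content).hexdigest()
            # setdefault keeps the first copy seen and ignores later identical ones.
            archive.setdefault((path, digest), content)
    return archive


# Hypothetical nightly backups: the file is unchanged twice, then modified once.
nightly = [
    {"contract.pdf": b"version 1"},
    {"contract.pdf": b"version 1"},
    {"contract.pdf": b"version 2"},
]
archive = reverse_archive(nightly)
copies_in_backups = sum(len(backup) for backup in nightly)
print(copies_in_backups, len(archive))
```

Three stored copies reduce to two unique versions in this example; over years of daily backups the same principle is what frees backup storage and shrinks the catalog.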

Mainline