“Which one has my car title?”
I wondered, as the warehouse guy and I sifted through four different wooden storage vaults, everything wrapped in brown packing paper (they had done the packing for me). I had moved recently and allowed the moving company to keep my things in their warehouse while I looked for a more permanent home. Meanwhile, I decided to sell one of my vehicles and needed the title, which was packed in a file cabinet… buried somewhere in those vaults. Finding it among my things seemed like an easy task; there was a numbered inventory that clearly showed a file cabinet. As it turns out, though, the wooden vaults get packed by whatever fits, not in manifest numerical order. Packing in order would have wasted space, required more than four vaults, and increased my costs. “Don’t worry,” the guy said as we rummaged through the vaults, “you paid for the whole hour.” Oh, right. Thanks for the reminder of the cost of our trial-and-error seek time.
This reminds me of the state of the storage industry. We have built smarter and smarter storage subsystems that make efficient use of finite raw space through clever RAID schemes, thin provisioning, compression, and deduplication. However, to actually make use of the ever-growing data store, we need increasingly complex management procedures. Which server, storage array, database, folder, or file contains the data you need to answer important business questions? How do you move the data between systems? How often do you need to move it? What happens when a system breaks? We try to automate much of this today: single-pane-of-glass management systems, advanced reporting, and orchestration tools all contribute to improving storage management. But at what volume of data do these tools stop being enough? When you need to transfer a gig or two, it’s no problem. But what about a terabyte or two, or a petabyte or two? Data growth isn’t slowing, and it sure isn’t stopping.
We need to value vendor solutions based on their ability to manage data and provide information. This is real business value. Solutions that make managing storage infrastructure easier are fine, but what if the storage container didn’t matter anymore, and your data was where you need it, when you need it, all the time? Think of my hot and sweaty afternoon in the warehouse opening all the boxes. What if we could see precisely which vault, and where in the vault, the file cabinet was located? If that were the case, I could have just called ahead and asked to have it out when I got there.
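The warehouse problem is really a missing index: the manifest listed what was packed but not where it ended up. A minimal sketch of that idea in Python (all item and vault names here are hypothetical, purely for illustration) shows how a directory turns a trial-and-error search through every vault into a single lookup:

```python
def build_directory(vaults):
    """Map each item name to the vault that contains it."""
    directory = {}
    for vault_id, contents in vaults.items():
        for item in contents:
            directory[item] = vault_id
    return directory

# Hypothetical packing manifest: items grouped by whatever fit,
# not in numerical order.
vaults = {
    "vault-1": ["sofa", "bookshelf"],
    "vault-2": ["file cabinet", "desk"],
    "vault-3": ["mattress"],
}

directory = build_directory(vaults)
print(directory["file cabinet"])  # prints "vault-2" -- no rummaging required
```

This is the same separation a global filesystem provides at scale: you ask for the data by name, and the metadata layer resolves which physical device actually holds it.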
Solutions like IBM Spectrum Scale provide a truly global filesystem across all of your data, regardless of the device it is stored on, so the data is always right where you need it. Performance is maintained as you grow, because the workload is dispersed across all devices with features like GPFS Native RAID (GNR). Archiving of data happens automatically, without heavy administrative overhead. More importantly, the type of container(s) you buy to store data becomes less important, as the functions of managing the data are decoupled from the physical devices. This is the best solution: a virtual filesystem.
If you are planning a new environment, or would like to discuss how a solution like IBM Spectrum Scale can help with your data growth, Mainline can help you determine the best approach to begin, as the right starting point depends on the size of your environment. By the way, as luck would have it, it wasn’t the last vault but the second-to-last vault that contained my elusive files. In hindsight, I probably should have taken notes and built a directory of the inventory so I wouldn’t have to ask, “OK, which one has my OTHER car title?!”