In the second part of our series on disaster preparedness, Henry Newman looks at replication in HSM systems and reveals the key to long-term archiving.
The use of hierarchical storage management (HSM) software has become more common, since it has some significant advantages over standard backup/restore software for large storage environments.
With HSM, your file system can look like it has a petabyte of disk storage while it might have only 10 terabytes of physical disk. The rest of the data is typically archived on tape, which supports compression and does not require power, and high-end tape can even have higher reliability than disk drives. Check out the bit error rates for Fibre Channel disks and enterprise tape at the Seagate and Imation Web sites if you don't believe me.
There are two important issues to consider if you are using HSM software:
- How do you get the data from the main site to the disaster recovery (DR) site?
- How do you handle migration of data to new media or systems?
As a good introduction to moving data, please see Preparing for a Disaster, which addresses remote mirroring issues for host-to-RAID and RAID-to-RAID mirroring methods.
Moving data using the RAID-to-RAID copy cannot work with HSM, since you do not have a host on the other side to control the file system on the RAID and tapes. HSM software depends on having a host running the HSM file system.
Almost every HSM has a method or methods for replication of data to another site. There are basically three potential candidates:
- Moving the data to another system and to RAID when the tape is written, and then moving the data from that system to its own tapes.
- Moving the data to another system and just writing the tapes (no disk transfer).
- Using a channel extender to write the tapes remotely.
Moving Data to RAID
As data is moved to the remote system and then to the RAID, it should look just like the HSM at the local site. The movement of the data is almost always over a TCP/IP network.
With most products, what generally happens is that once a file is available to be archived or has been archived to tape, a copy of the file is written to the remote system via a TCP/IP socket. At that point, the HSM on the remote system takes over.
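The socket copy described above can be sketched in a few lines. This is only an illustration, not how any particular HSM product does it: the framing (an 8-byte length, the file bytes, then a SHA-256 digest), the function names, and the 1 MiB chunk size are all my own assumptions.

```python
import hashlib
import os
import socket
import struct

CHUNK = 1 << 20  # 1 MiB per read/write; an assumed value, tune for your network


def send_file(sock, path):
    """Send one file as: 8-byte length, file bytes, 32-byte SHA-256 digest.

    Assumes the file is not modified while it is being sent.
    """
    remaining = os.path.getsize(path)
    digest = hashlib.sha256()
    sock.sendall(struct.pack("!Q", remaining))
    with open(path, "rb") as f:
        while remaining:
            chunk = f.read(min(CHUNK, remaining))
            if not chunk:
                raise IOError("file shrank during transfer")
            digest.update(chunk)
            sock.sendall(chunk)
            remaining -= len(chunk)
    sock.sendall(digest.digest())


def _recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        part = sock.recv(n - len(buf))
        if not part:
            raise ConnectionError("peer closed before transfer finished")
        buf.extend(part)
    return bytes(buf)


def receive_file(sock, dest_path):
    """Write the incoming file to dest_path; return True if the digest matches."""
    (remaining,) = struct.unpack("!Q", _recv_exact(sock, 8))
    digest = hashlib.sha256()
    with open(dest_path, "wb") as f:
        while remaining:
            chunk = sock.recv(min(CHUNK, remaining))
            if not chunk:
                raise ConnectionError("peer closed before transfer finished")
            f.write(chunk)
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.digest() == _recv_exact(sock, 32)
```

The sender would call `send_file` on a connected socket once the file is eligible for archiving, and the remote side would hand the received file to its own HSM only if `receive_file` returns True. A real product would also need retry, authentication, and metadata transfer, none of which is shown here.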
How the data is moved and the performance issues involved with buffering should be examined. Some products have tuned the performance of this data movement for high-speed networks, while others have not.
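One way to reason about the buffering question is the bandwidth-delay product: the amount of data that must be in flight to keep a link full. The sketch below is plain arithmetic, not taken from any HSM product, and the link speed, round-trip time, and window size in the usage note are hypothetical numbers chosen for illustration.

```python
def bandwidth_delay_product_bytes(link_bits_per_sec, round_trip_ms):
    """Bytes that must be in flight to keep the link busy for one round trip."""
    return link_bits_per_sec * round_trip_ms // 8000


def max_throughput_bits_per_sec(window_bytes, round_trip_ms):
    """Upper bound on throughput when at most window_bytes can be unacknowledged."""
    return window_bytes * 8 * 1000 // round_trip_ms


# Hypothetical WAN: 1 Gbit/sec link, 50 ms round trip.
needed = bandwidth_delay_product_bytes(1_000_000_000, 50)
# The same path limited to a 64 KiB window.
ceiling = max_throughput_bits_per_sec(64 * 1024, 50)
```

With these example numbers, `needed` is 6,250,000 bytes, while a 64 KiB window caps the transfer at 10,485,760 bits/sec, roughly one percent of the link. That gap is the kind of thing to look for when a product has not been tuned for high-speed networks.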
Also, if security is important, you might want to consider movement of the data over ssh2 or another encryption method. Does the HSM support data encryption? You could always use encryption in the WAN routers, but you might want to consider both. If you are doing host-based encryption with the HSM software or ssh2, you need to ensure that you have the CPU power to do it, and it should be tested with your type of hardware and network. It would not surprise me if some systems could not run the network at full rate while performing encryption because of a lack of CPU power.
Moving to Tape Remotely
Some HSM products move data on the remote system directly to tape or virtual tape. Next time we will cover in detail the issues surrounding movement of data directly to tape, so you will have to wait until then to fully understand them. For now, it is important to ensure that the tape drive runs at full rate, including compression.
Far better than direct tape movement are products that support movement to virtual tape. This is not much different from moving the data to another system that is functionally an embedded HSM that manages the tape cache. The concept for these products has been around a long time in the mainframe world, and it gives the tape drives a mechanism that can support a high-performance data stream, since large files exist in the cache. You should ensure that the virtual tape product uses highly reliable RAID hardware so that your files are protected before they get to tape and do not have to be retransmitted. Also, ensure that the software supports this retransmission in case of any type of failure.
Migrating Data As Technology Changes
Now that you have your data on the remote system, at some point you are going to need to move that data to new technology.
A few years ago, a customer asked us how to store data for 50 years. We investigated optical MO technology, optical glass platters, tape and all types of other media. What we determined seemed pretty obvious when we finally thought of it: It's not the media, stupid, it's the tape drive, hardware interface, HSM software, and even the application software.
A good example is seven-track tapes, which were supposed to last more than 30 years. The drives are more than 30 years old now and finding one is really hard. Add to that the hardware interface (I have no idea what interface they used), software drivers, and the application needed to read and process the data, and your seven-track data tapes may still be in fine shape, but they may be of little use to you.
Another example is MS Word 1.0. Remember that product from 1983? Do you think that MS Word XP or even MS Word 2000 can read and convert MS Word 1.0 files created in 1983? Not likely. The same is true today to a lesser extent. Adobe PDF files are supposedly good for 30 years, maybe longer (http://www.adobe.com/products/acrobat/pdfs/pdfarchiving.pdf). I have seen a number of companies that archived TIFF files instead of PDF, and I expect JPEG2000 to replace TIFF.
The extent of the problem is huge. Obsolescence of applications, file systems, HSM formats, computers, drivers, interfaces, tape drives, tapes, and data formats will require migration long before the life of the tape becomes an issue for enterprise-quality tapes.
One of the things we have done for customers is to develop a migration plan as part of the system architecture. This migration plan is based on the obsolescence of every single component of the system, both hardware and software. Without something like this, management usually has budgetary problems when it comes time to upgrade the system, so it is good to have a plan that documents the migration requirements before the system is installed.
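To make the idea concrete, here is a minimal sketch of how such a plan could be turned into a refresh schedule. It is not the plan format we used with customers; the `Component` fields, the function name, and the service lives in the usage note are hypothetical placeholders that you would replace with your own estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    kind: str                 # "hardware" or "software"
    installed_year: int
    expected_life_years: int  # your own obsolescence estimate, not vendor data


def migration_schedule(components, horizon_year):
    """Return (year, component name) for every refresh due up to horizon_year."""
    events = []
    for c in components:
        if c.expected_life_years <= 0:
            raise ValueError(f"{c.name}: expected life must be positive")
        year = c.installed_year + c.expected_life_years
        while year <= horizon_year:
            events.append((year, c.name))
            year += c.expected_life_years  # assume the replacement lasts as long
    return sorted(events)
```

For example, a tape drive installed in 2004 with an assumed five-year life and tape media with an assumed ten-year life yield refreshes in 2009 and 2014 for the drive and in 2014 for the media when planned out to 2015. Laying the events out by year is what lets you put the upgrade costs in front of management before the system is installed.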
Every piece will need to be upgraded to ensure that critical data is available. This can be very expensive for large HSM systems, and often much of the data will never be used again, so determining what is important and what is not and removing the unimportant data is critical. Over a 30-year period, users come and go, but in our current UNIX world, the data is almost always stored by UID and GID. It is a lot to think about.
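Since ownership is recorded as numeric UIDs and GIDs, one practical check is to look for files whose owner no longer maps to an account. The following is a rough sketch for a UNIX host using Python's standard `pwd` and `grp` modules; the function name and the optional `uid_known` and `gid_known` hooks are my own inventions, and walking a real HSM file system this way could trigger recalls from tape, so treat it as an illustration only.

```python
import grp
import os
import pwd
from collections import Counter


def _resolves(lookup, ident):
    """True if the numeric ID still maps to a name on this host."""
    try:
        lookup(ident)
        return True
    except KeyError:
        return False


def orphaned_owners(root, uid_known=None, gid_known=None):
    """Count files under root whose UID or GID has no matching account.

    uid_known and gid_known are optional predicates (hypothetical hooks) for
    checking against a directory service instead of the local databases.
    """
    uid_known = uid_known or (lambda uid: _resolves(pwd.getpwuid, uid))
    gid_known = gid_known or (lambda gid: _resolves(grp.getgrgid, gid))
    orphan_uids, orphan_gids = Counter(), Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))  # do not follow symlinks
            if not uid_known(st.st_uid):
                orphan_uids[st.st_uid] += 1
            if not gid_known(st.st_gid):
                orphan_gids[st.st_gid] += 1
    return orphan_uids, orphan_gids
```

The two counters give a starting list of data that nobody currently owns, which is a reasonable place to begin asking what is important and what is not.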
Getting the data to an off-site facility requires careful planning and good knowledge of your HSM software and the features and methods supported. Ensuring that the network has enough bandwidth is just part of the problem, given what John Mashey said about bandwidth and latency. You really need to know your applications and hardware to use the network efficiently.
Without a plan to migrate, your data is at risk. How important is the data? Is it the results of a drug study that the FDA requires you to keep for 50 years or the plans for a new airplane? It could also be the results of today's weather forecast calculation, which could be recalculated with the original input data and the code (maybe). What if that code was Fortran 90; could you compile that code in 2020? As you can see, a great deal of thought and planning needs to go into deciding what needs to be kept and what needs to be removed, and when that takes place. These are not easy questions, and they do not have easy answers.
Next time, we will cover a topic that is related to these last two articles: the quality of your data on tape. The focus will be tape wind quality and the issues that surround tape drive design.