Backing Up The Backup...Oops

June 19, 2003
By a Large Hosting and Internet Backup Service

Cosmic Rays? Or Just Cosmic? er, Comic?

  1. First, the Backup System failed (Tape Drive).
  2. Then, the Backup-Backup System failed (Disk Array).
  3. Now is a Good Time for a Hard Disk to Fail (which it did).

Discussion phrases:

  • Destroying a tape set
  • Reported an error
  • Integrate the recovered data
  • Process that can take many days
  • And have excellent faith

From the Company's Web Site:

"First the Update: It looks like our expected FULL recovery time is mid-day Thursday (6/19), though we may begin to restore some directories as early as tonight. We have a list compiled via our help desk and emails of directories to make priority in recovery. The plan will be to create new directories in your home login areas with the recovered data. We WILL NOT overwrite any changes made by you since Saturday without your prior written permission. This will allow you to integrate the recovered data as needed.

"Many of you have asked how this happened, and why we do not have backups, or backups of our backups. The answer is that we do, but the backup systems (two of them in fact) failed prior to the disk trouble on butternut. To make a very long story short, our tape backup system suffered a mechanical failure last week, destroying a tape set in the process. Our 'backup' backup system was then employed. It uses a large (1 terabyte) disk array which is one of a set we use to backup our colocation clients. It started backing up our hosting servers last Wednesday. The backup software reported an error on Friday night that required us to rebuild the catalog from source data. This is a very time-intensive process that can take many days. Of course, less than 24 hours later is when the drive in butternut failed. So, in the end two backup systems failed prior to the loss of one of butternut's drives."
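
The recovery plan quoted above (recovered data goes into new directories in each home area, and nothing newer is overwritten without permission) can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the company's actual procedure: the function name `restore_alongside`, the `recovered` directory label, and the paths are all hypothetical.

```python
import shutil
from pathlib import Path


def restore_alongside(recovered: Path, home: Path, label: str = "recovered") -> Path:
    """Copy recovered data into a NEW subdirectory of home, never over live files.

    Hypothetical helper: the names and layout here are assumptions for illustration.
    """
    target = home / label
    if target.exists():
        # Refuse rather than merge: anything already there may be newer user work.
        raise FileExistsError(f"{target} already exists; not overwriting")
    # copytree creates the target directory itself and copies the whole tree into it.
    shutil.copytree(recovered, target)
    return target
```

Because the restored copy lands beside the live files rather than on top of them, the owner can compare the two and integrate the recovered data as needed, which is the behavior the statement promises.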