Backup validation – Are you doing it right ?

by on May 12, 2013

Backup and Recovery is the most important task for any DBA.  As a DBA we have to ensure that all our backups are running successfully and reaching netbackup/TSM .

Below are some of the common scenarios we face :

  • We do not keep track of backup failures . Especially when we manage hundreds of servers it is hard to keep track of the scripts and return codes
  • We dont validate  backup (db2ckbkp) always
  • Archive logs are hard to manage . Do you keep track of thousands of archive logs across servers that go to netbackup?
  • Are you confident that all your backup files are reaching netbackup. How many times have you found that netbackup restore failed when you needed it ?
  • Are you doing  backup validation tests ? Are you doing regular recovery tests for critical databases ?



style="display:inline-block;width:250px;height:250px"
data-ad-client="ca-pub-3128777066323817"
data-ad-slot="8110217887">

Make sure that you validate all your critical databases so that you can have a good sleep. Having a good backup strategy and ensuring that the strategy is executed perfectly is the utmost important task for a DBA. Below is how I do it and would like to hear your feedback:

  • Review your backup strategy ; discuss various failure scenarios and come up with an optimal plan that suits your business needs. Are you meeting your defined RPO(Recovery Point Objective) and RTO(Recovery Time Objective) ?
  • Take backup –> Validate backup(db2ckbkp) –> Send to netbackup
  • Send archive logs from LOGARCHMETH1 to netbackup frequently for critical databases. Do not delete the local(disk) copy of these logs for some time
  • Do a weekly check against sysibmadm.DB_HISTORY to make sure backups are successful. I would run a weekly report with a query like :db2 “select START_TIME,END_TIME,SQLCODE,SQLSTATE,entry_status from sysibmadm.DB_HISTORY where operation=’B’ and date(timestamp(end_time)) > (CURRENT DATE – 7 DAYS)”. You can also setup ssh keys and run this against all servers and pull a good weekly backup summary(http://www.thegeekstuff.com/2008/11/3-steps-to-perform-ssh-login-without-password-using-ssh-keygen-ssh-copy-id/).
  • If you are using netbackup get access to run a command like “bplist” that will enable you to validate the netbackup activity. Write a script that parses the output of bplist command and validates against sysibmadm.db_history output discussed above. If you are using TSM you can use “db2adutl” to validate the backups
  • Perform regular recovery tests for your critical databases. Include rollfoward recovery tests to make sure that archive logs are being backed up properly

 

Leave a Comment

Previous post:

Next post: