Disaster Recovery & Business Continuity Blog
Shray Kapoor, published Nov 20, 2007
Information is the key asset of any business, and its software and hardware resources put business policies into practice. Every business can suffer natural or man-made disasters, ranging from floods and earthquakes to a malformed SQL query that corrupts the data of a business application. It is therefore important not only to protect IT resources but also to be able to recover them in an emergency. Business continuity planning, also termed disaster recovery planning (DRP), addresses this need by efficiently recovering the information and critical resources on which the business depends for its continuity. A DRP consists of a set of policies and procedures for reacting to and recovering from IT-disabling disasters, based on the criticality of resources and the probability of an incident occurring. [1]
Planning proceeds in steps, with a feedback loop to assess current strategies. The steps involved are:
1) Assessment
An assessment is an act of measuring and comparing. For the IT sector, assessment means exploring and defining risks. Risk assessment starts with defining resources, such as software, hardware, and communication nodes. This assessment is carried out through internal and external audits, which an auditing team performs on a regular cycle. After the initial assessment, resources are ranked quantitatively according to their importance and their likelihood of being compromised. Quantitative analysis focuses on the estimated loss a threat can cause. Any outage that can disrupt the normal functioning of the business qualifies as a threat. Threats to the IT sector are generally man-made, such as a security incident or a virus infection; natural threats (flooding, earthquakes) do have a major impact, but their low rate of occurrence ranks them lower in a quantitative analysis. Assessing security threats is known as vulnerability testing or penetration testing, and is generally done by third-party vendors using tools specifically designed to assess vulnerabilities in computer systems. [4]
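The quantitative ranking described above can be sketched in a few lines of Python. This is only an illustration, not a prescribed method: it assumes risk is scored as likelihood multiplied by estimated loss, and the `Resource` class, its field names, and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    annual_likelihood: float  # assumed: expected incidents per year
    loss_per_incident: float  # assumed: estimated loss per incident

    @property
    def expected_annual_loss(self) -> float:
        # Assumed scoring rule: likelihood multiplied by estimated loss.
        return self.annual_likelihood * self.loss_per_incident


def rank_by_risk(resources):
    """Return resources sorted from highest to lowest expected annual loss."""
    return sorted(resources, key=lambda r: r.expected_annual_loss, reverse=True)


# Illustrative figures only, not taken from any real assessment.
inventory = [
    Resource("customer database", annual_likelihood=0.5, loss_per_incident=200_000),
    Resource("mail server", annual_likelihood=2.0, loss_per_incident=10_000),
    Resource("data centre (flood)", annual_likelihood=0.01, loss_per_incident=1_000_000),
]

for resource in rank_by_risk(inventory):
    print(f"{resource.name}: {resource.expected_annual_loss:,.0f}")
```

With these made-up numbers the flood scenario lands at the bottom of the list despite its large single-incident loss, which is the effect of a low rate of occurrence on a quantitative ranking.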
Deliverables of the assessment phase are:
- Vulnerability assessment and resource definition document
- Business impact analysis report
- Detailed definition of requirements
2) Establishing policies and procedures
The purpose of this step is to plan policies and procedures that mitigate risk as far as possible. A policy establishes what is and what is not required in the context of business goals. A policy should be comprehensive yet compact, because too much information makes it unmanageable. Every department and every user involved in the business should be able to comply with every policy. In the context of the IT sector, policies should address:
- Authorization and authentication management
- Acceptable IT resources management
- Data restoration and backup policy
- Account management
- Log review
- Incident response policies
Procedures to implement the policies include building recovery teams from among the IT staff to handle each issue. Procedures define how to deal with the resources addressed by the policies, who is responsible, and how the recovery process unfolds. For example, a data recovery procedure should define how frequently backups are scheduled, what should be recovered first, and how the plan proceeds after an incident.
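One way to make such a data recovery procedure concrete is to write it down as structured data. The sketch below is a hypothetical example: the `RecoveryProcedure` class, its fields, the team names, and the intervals are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryProcedure:
    resource: str
    owner: str                  # who is responsible for the recovery
    backup_interval: timedelta  # how frequently backups are scheduled
    recovery_priority: int      # 1 = recover first


def recovery_order(procedures):
    """Return procedures in the order their resources should be recovered."""
    return sorted(procedures, key=lambda p: p.recovery_priority)


def backup_overdue(procedure, last_backup, now):
    """True if more time has passed since the last backup than the interval allows."""
    return now - last_backup > procedure.backup_interval


# Hypothetical procedures, for illustration only.
procedures = [
    RecoveryProcedure("file shares", "storage team", timedelta(days=7), 3),
    RecoveryProcedure("order database", "DBA team", timedelta(hours=24), 1),
    RecoveryProcedure("mail server", "messaging team", timedelta(hours=24), 2),
]

for step, proc in enumerate(recovery_order(procedures), start=1):
    print(f"{step}. {proc.resource} (owner: {proc.owner})")
```

Keeping the owner, the schedule, and the priority in one record answers the three questions a procedure has to answer: who is responsible, how often backups run, and what comes back first.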
3) Budgeting
Once the risks and policies have been worked out, the next step is to calculate the cost of implementing the plans so that they fit business objectives. During budgeting, the overall IT budget is assessed against the cost of implementing a recovery plan. This phase should forecast, as accurately as possible, the overall cost and return on investment (ROI) of implementing the plan. Both depend on the risk levels of critical resources and their impact on the overall business. The risk assessment matrix thus serves as a vital input for deciding which assets should be considered for recovery. Both IT and the business units must agree on which data and applications are most critical to the business and need to be recovered most quickly in a disaster. Ultimately, it is management that decides which threats are tolerable and to what extent. The cost estimate for the recovery plan is only one part of the budget. Staffing requirements, software subscriptions, third-party consultants, vulnerability testing, and training costs are some of the other factors that contribute to the overall budget. The final requirement of an effective budget is that it should not be elastic; one should stick to the budget throughout the recovery phase.
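As a rough illustration of the arithmetic in this step, the sketch below totals the budget items listed above and computes an ROI figure. The formula used here (avoided loss minus plan cost, divided by plan cost) is one common way to express ROI for a protective investment and is an assumption, as are the function names and every number.

```python
def total_budget(cost_items):
    """Sum all budget line items."""
    return sum(cost_items.values())


def return_on_investment(loss_without_plan, loss_with_plan, plan_cost):
    """Assumed ROI formula: (avoided loss - plan cost) / plan cost."""
    if plan_cost <= 0:
        raise ValueError("plan_cost must be positive")
    avoided_loss = loss_without_plan - loss_with_plan
    return (avoided_loss - plan_cost) / plan_cost


# Illustrative figures only.
costs = {
    "recovery plan implementation": 40_000,
    "staffing": 25_000,
    "software subscriptions": 8_000,
    "third-party consultants": 12_000,
    "vulnerability testing": 10_000,
    "training": 5_000,
}

budget = total_budget(costs)
roi = return_on_investment(
    loss_without_plan=180_000,  # assumed expected annual loss with no DRP
    loss_with_plan=30_000,      # assumed expected annual loss with the DRP
    plan_cost=budget,
)
print(f"budget: {budget:,}  ROI: {roi:.0%}")
```

The loss estimates would come from the risk assessment, which is why the budgeting step depends on it.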
4) Initial Plan Implementation and Testing
Once the budget and the respective plans are fixed, the next phase is to implement and test those plans. Testing strategies tailored to the environment are set up, and individual policies and plans are tested accordingly [3]. For example, database recovery plans can be tested by realistically exercising the backup procedures in a dedicated test environment with test data. Testing procedures should not interfere with the normal functioning of the systems involved. On the security side, implementing a security plan starts with procedures that reduce risk levels, mitigating high risks first and then moving on to lower-risk areas. Penetration testing is done in this phase to test the security plan. Implementation and testing also involve educating and training users and administrators so that they are aware of the new policies and meet security standards. Test results should be recorded so that the DRP can be updated to address any shortcomings. After the initial implementation and testing, policies are deployed in the real environment and monitored regularly.
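A backup test of the kind described above can be as simple as backing up test data, restoring it, and checking that the restored copy is identical. The sketch below assumes a plain file copy as a stand-in for the real backup and restore tooling; the function names and file names are hypothetical, and everything runs in a temporary directory so that no live system is touched.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def backup_and_restore_matches(source, workdir):
    """Back up a file, restore it to a new path, and compare checksums."""
    workdir = Path(workdir)
    backup = workdir / "backup.bin"
    restored = workdir / "restored.bin"
    shutil.copy2(source, backup)    # stand-in for the real backup procedure
    shutil.copy2(backup, restored)  # stand-in for the real restore procedure
    return sha256_of(source) == sha256_of(restored)


with tempfile.TemporaryDirectory() as tmp:
    test_file = Path(tmp) / "test_data.bin"
    test_file.write_bytes(b"test records only, no production data")
    restore_ok = backup_and_restore_matches(test_file, tmp)
    print("restore verified:", restore_ok)
```

Recording the result of each such run gives the evidence needed to update the DRP when a restore fails.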
5) Reporting
Reporting is a necessary and important part of any IT program. Reporting mainly addresses management: managers need to be made aware of how information and resources are being managed in the organization and which policies are in effect. Reports should include a project progress report, risk measurements, and ROI documents. The project progress report shows current progress against schedules, the minor and major issues involved, and expected deadlines. Risk is measured with security metrics such as the number of vulnerabilities detected, security incidents, man-made disasters that corrupted data resources, and blocked attacks. Reporting should address not only security aspects but also malfunctions of nodes operating within the network. Ultimately, the report is the only document by which management can assess the effectiveness of a DRP.
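The metrics mentioned above lend themselves to a simple tally. The following sketch counts events by type for a management report; the event format, the type names, and the node names are hypothetical and would depend on whatever logging the organization already has.

```python
from collections import Counter


def summarise_events(events):
    """Count recorded events by type for inclusion in a report."""
    return Counter(event["type"] for event in events)


# Hypothetical event log entries, for illustration only.
events = [
    {"type": "vulnerability_detected", "node": "web-01"},
    {"type": "blocked_attack", "node": "fw-01"},
    {"type": "blocked_attack", "node": "fw-01"},
    {"type": "security_incident", "node": "db-01"},
    {"type": "node_malfunction", "node": "sw-03"},
]

summary = summarise_events(events)
for event_type, count in sorted(summary.items()):
    print(f"{event_type}: {count}")
```

Including node malfunctions alongside security events keeps the report in line with the point that reporting should cover both.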
The final objective of a DRP is to respond effectively to disasters. DRP response guidelines should meet the following objectives:
- Limiting business loss and human injuries
- Containing and recovering from the disaster as far as possible
- Making an initial assessment of damage
References:
[1] Glen Kunene, How to Create a Disaster Recovery Plan. Available at www.devx.com
[2] ISO 17799, Sarbanes-Oxley, & HIPAA Compliant: Disaster Recovery Plan Template
[3] Computer Security Administration, University of Toronto: Disaster recovery planning
[4] Eric Maiwald and William Sieglein, Security Planning & Disaster Recovery
[5] The security risks analysis directory: An introduction to risk assessment