IT Service Continuity Management

Prepare for the unexpected.

An outage of a critical IT service or a full data center is not just an operations problem; it is a business problem. The cost of unplanned outages can reverberate far and wide, affecting revenue, reputation, and customer loyalty.

With the enterprise's increased dependency on IT and the increased interdependency of systems, cascading failures can cause significant impact, and outages of any type are unacceptable. In a world increasingly driven by digital business, unplanned outages can be fatal.

Vigilance and investment are the best stance when it comes to IT Service Continuity Management (ITSCM).

Figure: Overview of the IT Disaster Recovery services framework

“David-Kenneth Group’s ITSCM philosophy is based on years of working with various standard industry recovery and service continuity methodologies. We believe ITSCM is a custom fit that must support the continuity of an organization’s business operations in a way that is both affordable and sustainable. Our goal is to help you understand, optimize, and rightsize your IT landscape to respond to any kind of outage that would affect your organization.”

Sean McCarthy, M.B.C.P.
Senior Disaster Resiliency Architect
David-Kenneth Group

IT service continuity preparation: establish framework.

Whether your organization is looking to mature its ITSCM program or launch an inaugural program, it begins by establishing a framework so that a strategy and roadmap can be developed, implemented, and exercised toward maturity.

David-Kenneth Group’s ITSCM service is organized into five towers and designed to help organizations align their ITSCM plan with their day-to-day business operations. The five towers are:

Governance Framework
Requirements Analysis
Assessment and Strategy
Design and Build
Implement, Exercise, and Maintain

1. Governance Framework

Governance provides guidance and oversight on implementing, managing, and maintaining ITSCM activities that satisfy the business's recovery objectives and requirements.

2. Requirements Analysis

Determining the requirements through a Business Impact Analysis (BIA) helps an enterprise understand and prioritize the IT service continuity of mission-critical business processes. The analysis establishes the maximum acceptable outage time for each business function relative to the risks and the financial impact. The results provide the ITSCM requirements that are used to prioritize critical business functions and supporting applications into ITSCM tiers and to determine the ITSCM mechanisms.
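As a rough illustration of how BIA results might feed tiering, the sketch below assigns business functions to tiers by their maximum acceptable outage. The tier boundaries, the `BusinessFunction` type, and the sample functions are all hypothetical; real values would come from your own BIA, not from this example.

```python
from dataclasses import dataclass

# Hypothetical tier boundaries: (tier, maximum acceptable outage in hours).
# Illustrative values only; actual boundaries are an output of the BIA.
TIER_LIMITS_HOURS = [(1, 4.0), (2, 24.0), (3, 72.0)]
LOWEST_TIER = 4


@dataclass(frozen=True)
class BusinessFunction:
    name: str
    max_outage_hours: float  # maximum acceptable outage identified by the BIA


def assign_tier(function: BusinessFunction) -> int:
    """Return the first tier whose limit covers the function's tolerance."""
    for tier, limit in TIER_LIMITS_HOURS:
        if function.max_outage_hours <= limit:
            return tier
    return LOWEST_TIER


# Made-up sample data for illustration.
functions = [
    BusinessFunction("order processing", 2),
    BusinessFunction("payroll", 48),
    BusinessFunction("internal wiki", 168),
]
tiers = {f.name: assign_tier(f) for f in functions}
```

In practice the tier would also reflect risk and financial impact, as described above, rather than outage tolerance alone.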

3. Assessment and Strategy

An ITSCM assessment evaluates your current ITSCM capabilities using your current processes, procedures, and infrastructure deployment to identify any vulnerabilities that may impact recovery, fail-over, or fail-back to normal operations. Where gaps are identified, David-Kenneth Group recommends remediation processes and technical improvements to help meet the ITSCM requirements and mitigate vulnerabilities.

A strategy is created with actionable plans and technical procedures to “fill in the gaps” of the current ITSCM program or to create a new one. The strategy is incorporated into the ITSCM program and includes event identification, escalation, notification, continuity execution, and management of the event from identification through the return to normal operations.

4. Design and Build

ITSCM design identifies and evaluates design alternatives that support the ITSCM strategy and are suitable for your environment and business goals.

David-Kenneth Group provides customized drawings, technical process specifications, deployment pattern design, and an implementation project plan as viable design alternatives. We also provide estimated capital expenses and operational cost estimates for budgeting and planning purposes.

An ITSCM build offers assistance with modifying, building, and deploying the infrastructure and processes across the IT landscape.

5. Implement, Exercise, and Maintain

David-Kenneth Group implements the ITSCM program, provides training, and incorporates the ITSCM program into the Enterprise Services Continuity Management or Business Continuity Plan. We also develop a roadmap to manage change to infrastructure deployments and schedule awareness and exercise training.

When we perform exercises, whether tabletop or physical, we observe, document, and recommend remediation actions to improve the recovery and continuity posture. The results are delivered in a post-exercise summary report and can be incorporated into the IT Services Continuity Management or Business Continuity Plan.

David-Kenneth Group also establishes a maintenance schedule to review and update the ITSCM posture and continuity procedures annually or after a major change.

Assess Your Disaster Recovery Maturity

Take a 17-question assessment and receive instant feedback on your organization’s disaster recovery maturity.

Discovery Methodology for IT Disaster Recovery

This white paper outlines 7 steps necessary for a thorough discovery. Following them helps you avoid a faulty DR plan, minimize costly risks, and ensure business continuity.

Learning Center

Here are some common questions about IT Service Continuity Management (ITSCM):

How important is executive sponsorship for IT Disaster Recovery (IT DR)?

Executive leadership is crucial. Once the risk and potential impact of not having an effective IT DR plan are realized, senior leadership will make the required investments to protect the enterprise. Ultimately, the business case for a comprehensive plan is that it pays for itself when disaster strikes. The three critical success factors are

having a plan,
keeping the plan updated, and
regularly exercising the plan.

What role does virtualization play in IT DR?

Geographical distribution of IT systems has never been easier with tools such as VMware and Microsoft Hyper-V. Virtualization plays a significant role in the portability of applications and data to diverse locations in different weather or impact zones and on different electric grids and network connections. If designed properly, it provides for greater survivability.

What is the economic risk if core applications go down for a day, a week, or even longer?

Some applications are revenue generating, and some are used for internal support. Each application is important, but not every application is equally important. Some business processes or applications might not have the same impact on any given day, but are vital at certain times of the business cycle. Working with the business units will help to determine an application’s revenue impact by day, hour, and minute so you can prioritize applications according to financial and non-financial risks.
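The arithmetic behind that prioritization can be sketched as follows. This is a minimal illustration, not a costing model: the function name, the `peak_multiplier` parameter (standing in for business-cycle peaks such as month-end), and every figure are assumptions made up for the example, and non-financial risks are not modeled.

```python
def outage_cost(revenue_per_hour: float,
                outage_hours: float,
                peak_multiplier: float = 1.0) -> float:
    """Estimate lost revenue for an outage of the given length.

    peak_multiplier is a hypothetical factor for periods in the business
    cycle when an application matters more than usual.
    """
    return revenue_per_hour * outage_hours * peak_multiplier


# Made-up applications: (name, revenue per hour, peak multiplier).
applications = [
    ("storefront", 5000.0, 1.0),
    ("billing at month-end", 1000.0, 4.0),
    ("internal reporting", 50.0, 1.0),
]

# Estimated cost of a one-day (24-hour) outage for each application.
one_day_costs = {
    name: outage_cost(rate, 24, peak) for name, rate, peak in applications
}

# Highest estimated cost first.
ranking = sorted(one_day_costs, key=one_day_costs.get, reverse=True)
```

Repeating the calculation for a minute, an hour, and a week shows how quickly each application's exposure grows with outage length.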

Is it prudent to protect applications differently?

Perhaps the biggest shift from legacy disaster recovery solutions to today’s in-house or cloud-based DR solutions is the ability to protect specific applications at their best-fit protection level, rather than forcing a one-size-fits-all DR solution. Today’s solutions allow for more streamlined costs, ensuring that applications that need high-cost protection have it, while not burdening low-impact applications with high-cost DR solutions.

How often should we test?

A DR solution that doesn’t work when it’s called upon is a waste of money for the business and an unnecessary increase in IT operational risk. Testing twice a year is the recommended schedule for most applications. If your applications change significantly, then testing frequency should be increased to keep pace with the updates. There are various types of tests: tabletop/walk-through tests, application tests, application group tests, and business function tests. When new applications are deployed, DR testing should be a part of the rollout plan to ensure continuity capability once in production.