What Is RTO and RPO in Disaster Recovery?

Posted on Tuesday, December 15, 2020

Extended periods of downtime are simply not an option for many businesses that provide crucial services to their customers. But unforeseen incidents such as power outages, natural disasters and human error can strike at any time and take valuable services offline. This is why having a business continuity and disaster recovery plan in place is so important.

Two of the most important parameters of any disaster recovery plan are Recovery Point Objective (RPO) and Recovery Time Objective (RTO). With these, a business can plan an optimal recovery strategy tailored to the services it offers.

What do RTO and RPO mean?

Recovery Point Objective (RPO) is all about lost data. It represents the maximum amount of time that can pass since the last valid backup before the data lost in a disruption exceeds the tolerable threshold the business has set. This threshold will, of course, differ from business to business.

Recovery Time Objective (RTO) is all about downtime. In practical terms, it represents how quickly a service must be up and running again at a normal level of operation following any disruption. This is sometimes defined as the maximum downtime a business can tolerate.

What is the difference between RTO and RPO?

At first glance, these two parameters may seem very similar. But they are distinct from one another. You can think of the difference as being all about points in time – past, or future.

RPO is all about looking back. It represents the amount of time between a failure and the last valid backup, as a way of quantifying the variable amount of data that has been lost or will have to be re-entered because of the downtime.
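The "looking back" calculation can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the function name data_loss_window, the timestamps and the four-hour RPO are all hypothetical values chosen for the example.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, failure: datetime) -> timedelta:
    """Return the time between the last valid backup and the failure."""
    if failure < last_backup:
        raise ValueError("failure time is earlier than the last backup")
    return failure - last_backup


# Hypothetical values for illustration only.
rpo = timedelta(hours=4)
last_backup = datetime(2020, 12, 15, 9, 0)
failure = datetime(2020, 12, 15, 14, 30)

window = data_loss_window(last_backup, failure)
rpo_met = window <= rpo
print(f"Data loss window: {window}, RPO met: {rpo_met}")
```

In this made-up case, five and a half hours of work fall after the last backup, so a four-hour RPO would be missed.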

RTO is all about looking forward. It represents the amount of real time that can pass before disruption to normal operations begins to seriously affect and impede normal business. This should be measured from the point at which users are affected.
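The "looking forward" check can be sketched in the same spirit. The snippet below is only an illustration: the function name downtime, the timestamps and the two-hour RTO are hypothetical, and the clock starts at the point users are affected, as described above.

```python
from datetime import datetime, timedelta


def downtime(users_affected: datetime, service_restored: datetime) -> timedelta:
    """Return how long the service was unavailable to users."""
    if service_restored < users_affected:
        raise ValueError("restore time is earlier than the start of the outage")
    return service_restored - users_affected


# Hypothetical values for illustration only.
rto = timedelta(hours=2)
users_affected = datetime(2020, 12, 15, 14, 30)
service_restored = datetime(2020, 12, 15, 16, 0)

outage = downtime(users_affected, service_restored)
rto_met = outage <= rto
print(f"Downtime: {outage}, RTO met: {rto_met}")
```

Here the service is back after an hour and a half, which is inside the two-hour target.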

The difference can be visualised using an analogy. Imagine you’re writing a report on your PC when suddenly the power dies. RPO relates to how long ago you last saved the document: how much of your work (or data) can you lose before it seriously affects you? RTO is then the amount of time you can handle being offline. If you’ve got a tight deadline to hit, your RTO will be lower, as you need to be up and running sooner so you can recover your work, re-enter any lost data and continue writing.

How to define your RTO and RPO

There is no universal ‘correct’ answer to how much downtime and data loss your business can tolerate. This means that RTO and RPO will be different for each business. If you outsource your IT services, these will often be defined within your service level agreements.

RTO and RPO parameters will also differ between the applications and services within your business. It’s good practice to analyse these and categorise them based on which are critical to keeping your business operational and generating revenue. For example, you may want to use a three-tier model to prioritise asset recovery:

  • Tier 1: mission-critical applications. These are essential to the survival of the business and need to be offline for as little time as possible – no time at all if possible. For example, this could be the electricity supply to premises where servers are housed. Tier 1 recovery time is typically between 0 and 2 hours.
  • Tier 2: business-critical applications. These are essential to successful business operations. The longer these are offline, the more financial and reputational damage the company will suffer. For example, this could be an online payment processing system for an e-commerce site. Tier 2 recovery time is typically between 4 and 24 hours.
  • Tier 3: non-critical applications. These contribute to the normal operating level of the company, but you can survive without them temporarily. These should be the lowest priority for getting back online following a disruption. For example, this could be phone lines within your office. Tier 3 recovery time is typically over 24 hours.
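The three-tier model above could be captured in a simple asset inventory. The following Python sketch is one possible way to do it, not a standard tool: the Asset class, the recovery_order function and the inventory entries are hypothetical names built from the examples in the list.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    tier: int  # 1 = mission-critical, 2 = business-critical, 3 = non-critical


# Typical recovery-time ranges in hours, taken from the tier descriptions above.
# None means no upper bound is given (Tier 3 is "over 24 hours").
TIER_RECOVERY_HOURS = {
    1: (0, 2),
    2: (4, 24),
    3: (24, None),
}


def recovery_order(assets):
    """Return assets sorted so the most critical tier is recovered first."""
    return sorted(assets, key=lambda asset: asset.tier)


# Hypothetical inventory for illustration only.
inventory = [
    Asset("office phone lines", 3),
    Asset("online payment processing", 2),
    Asset("server room electricity supply", 1),
]

ordered = recovery_order(inventory)
for asset in ordered:
    low, high = TIER_RECOVERY_HOURS[asset.tier]
    target = f"{low}-{high} hours" if high is not None else f"over {low} hours"
    print(f"Tier {asset.tier}: {asset.name} (typical recovery: {target})")
```

Sorting by tier gives a recovery order that starts with the mission-critical assets and ends with the non-critical ones.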

The bottom line here is that any critical application is vital to the successful operation of your business. Getting these online quickly must be your priority. Having your assets inventoried and prioritised by their RTO and RPO can help you do this and minimise disruption.

Balancing what is ideal and what is realistic

In an ideal world, both RPO and RTO would be as close to zero as possible. If disruption occurred, there would be backup and recovery strategies already in place that mean next to no data would be lost, and operations would be up and running again in no time at all.

But in reality, this would be incredibly expensive and may not be necessary for smaller businesses. This is why disaster recovery cannot be a ‘one size fits all’ approach. Setting parameters such as RTO and RPO for your business and regularly testing whether they are realistic can help you stay prepared in case disaster strikes.
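One way to see why a near-zero RPO gets expensive is to work out how often backups would have to run. The sketch below makes simplifying assumptions that are not part of any standard: backups are evenly spaced, complete instantly and are always valid, so the worst-case data loss equals the gap between backups. The function name backups_per_day and the sample RPO values are hypothetical.

```python
from datetime import timedelta


def backups_per_day(rpo: timedelta) -> int:
    """Minimum number of evenly spaced backups per day so that the gap
    between backups never exceeds the RPO (under the assumptions above)."""
    if rpo <= timedelta(0):
        raise ValueError("RPO must be positive")
    day = timedelta(days=1)
    # Ceiling division: round up so the interval is never longer than the RPO.
    return -(-day // rpo)


# Hypothetical RPO targets for illustration only.
for rpo in (timedelta(hours=24), timedelta(hours=4), timedelta(minutes=15)):
    print(f"RPO {rpo}: at least {backups_per_day(rpo)} backup(s) per day")
```

Under these assumptions, moving from a 24-hour RPO to a 15-minute RPO raises the requirement from one backup a day to 96, which illustrates why tighter targets cost more to meet.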

About Mustard IT, your technology partner

Mustard IT is a trusted team, experienced with the latest technology and able to explain complex issues to you in a language you’ll understand. Contact us today to find out how we can help you.