
Recovery Strategies MindMap

Transcript

Introduction

Hey, I’m Rob Witcher from Destination Certification, and I’m here to help YOU pass the CISSP exam. We are going to go through a review of the major recovery strategy topics in Domain 7, to understand how they interrelate, and to guide your studies and help you pass the CISSP exam.

Image of recovery strategies table - Destination Certification

This is the fifth of six videos for domain 7. I have included links to the other MindMap videos in the description below. These MindMaps are one part of our complete CISSP MasterClass.

Recovery Strategies

The recovery strategies we are about to discuss are all about getting parts, systems and even whole data centers back online if there is a failure, or even building in redundancy so that there is no downtime at all in the event of a failure.

The closer we get to making systems fully redundant to minimize downtime, the more expensive the solution is going to be. Conversely, if we want to save costs, it typically means longer downtimes in the event of a failure. Ultimately, how quickly a system needs to be recovered, or how much redundancy is required, is a business decision. The owner of the system needs to determine what is cost justified based on their business needs.

Failure Modes

Before we get into talking about different recovery strategies, we first have to talk about 3 different failure modes - 3 different ways that we can design systems to fail.

Fail-Soft

Fail-soft means we design a system to fail to a LESS secure state. For example, a firewall designed for fail-soft might let ALL traffic through in the event that it fails. This is why fail-soft is often referred to as fail-open.

Fail-Secure

Fail-secure is the inverse. Fail-secure means we design a system to fail to a MORE secure state. For example, a firewall designed for fail-secure might block all traffic in the event that it fails. This is why fail-secure is often referred to as fail-closed.
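The firewall example for these two failure modes can be sketched in a few lines. This is only an illustration: the FailMode enum, the firewall_decision function and its parameters are hypothetical names invented for this sketch, not any real firewall's API.

```python
from enum import Enum


class FailMode(Enum):
    FAIL_OPEN = "fail-open"      # fail-soft, as described above: less secure on failure
    FAIL_CLOSED = "fail-closed"  # fail-secure: more secure on failure


def firewall_decision(allowed_by_rules, firewall_healthy, mode):
    """Return True if the packet is forwarded, False if it is dropped."""
    if firewall_healthy:
        # Normal operation: the rule set decides.
        return allowed_by_rules
    # Failure: the design choice decides, not the rules.
    return mode is FailMode.FAIL_OPEN


# A packet the rules would block still gets through a failed fail-open firewall.
print(firewall_decision(False, False, FailMode.FAIL_OPEN))   # True
# A packet the rules would allow is dropped by a failed fail-closed firewall.
print(firewall_decision(True, False, FailMode.FAIL_CLOSED))  # False
```

The point of the sketch is that once the firewall has failed, the rules no longer matter. Only the chosen failure mode does.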

Fail-Safe

Fail-safe is completely different from the first two. Fail-safe mechanisms are physical security mechanisms designed to prioritize the safety of people, above all else, in the event of a failure. Doors in a building unlocking when the fire alarm goes off is a great example of a fail-safe mechanism: it prioritizes the safety of people.

Backup Storage

Alright, let’s now start with backup strategies: various methods we can use to back up data so that it can be restored after a hardware failure.

Archive Bit

Image of archive bit - Destination Certification

But before we get into discussing the strategies, let’s talk about an important bit, known as the archive bit. Metadata is data about data, and the archive bit is an example of metadata. Every file on a computer has an archive bit associated with it. If the bit is set to zero, no backup is required. An operating system will automatically flip the archive bit to one whenever a file is created or modified, meaning the file needs to be backed up.

Types of Backups

Now we’ll talk about different backup strategies.

Mirror

Mirror backups, also known as stream backups, are an exact copy with no compression and no attempt to shrink the backup size, meaning mirror backups are very fast but use a lot of storage space.

Full

Full backups are where every file is backed up regardless of what the archive bit is set to. Full backups employ compression, so they are not as fast as mirror backups.

Incremental

Image of incremental type - Destination Certification

Incremental backups are where we back up every change since the last backup, whether that was a full backup or an incremental one. Every time we perform an incremental backup, the archive bit is reset to zero for every file that is backed up, which means you are only backing up files that have been created or modified since the previous backup. This minimizes the storage space required for backups, but can lead to lengthy recovery times, as the last full backup and then multiple incremental backup tapes may need to be pulled and run sequentially.

Differential

Image of differential type - Destination Certification

Differential backups are where we back up changes since the last full backup. The archive bit is left set to 1 for every file backed up, which means during every differential backup you are backing up all new and modified files since the last full backup. This uses more storage space but speeds up recovery times, as the maximum number of tapes you will ever need to pull is two: the most recent full backup and the most recent differential.
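The difference between incremental and differential backups comes down to what happens to the archive bit, which a small simulation can make concrete. This is a rough sketch under stated assumptions: the file names, the schedule of one full backup followed by three nightly backups, and all function names are hypothetical, and the full backup is assumed to clear every archive bit once it has copied the files.

```python
def full_backup(files):
    """Copy every file regardless of its archive bit, then clear all bits (assumed)."""
    backed_up = sorted(files)
    for name in files:
        files[name] = 0
    return backed_up


def incremental_backup(files):
    """Copy files whose archive bit is 1, then reset those bits to 0."""
    backed_up = sorted(name for name, bit in files.items() if bit == 1)
    for name in backed_up:
        files[name] = 0
    return backed_up


def differential_backup(files):
    """Copy files whose archive bit is 1, but leave the bits set to 1."""
    return sorted(name for name, bit in files.items() if bit == 1)


def simulate(nightly_backup):
    """One full backup, then three nights with one modified file each."""
    files = {"a.txt": 1, "b.txt": 1, "c.txt": 1}
    tapes = [full_backup(files)]
    for changed in ("a.txt", "b.txt", "c.txt"):
        files[changed] = 1  # the operating system flips the bit on modification
        tapes.append(nightly_backup(files))
    return tapes


incremental_tapes = simulate(incremental_backup)
differential_tapes = simulate(differential_backup)
print(incremental_tapes)   # small nightly tapes, but all four are needed to restore
print(differential_tapes)  # nightly tapes grow, but only the first and last are needed
```

In this run the incremental tapes after the full backup each hold a single file, while the last differential tape holds all three modified files. That is the storage versus recovery time trade-off described above.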

Table of backup strategy summary - Destination Certification

And here’s a summary of the different backup strategies.

Validation

It is important to validate that backups are occurring correctly.

Checksums / CRC

This can be done in numerous ways, including Cyclic Redundancy Checks (CRCs), checksums, bit-for-bit comparisons of the backup to the original data, or just spot checking select files. These verification checks can be done while the backup is being performed and also periodically on shelved tapes.
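As a minimal sketch of what such a check might look like, the example below compares a CRC-32 value and a SHA-256 digest of the original data against the backup copy, using Python's standard zlib and hashlib modules. The choice of SHA-256 and the sample data are assumptions made for illustration. Real backup software typically stores these values alongside the backup instead of re-reading the original.

```python
import hashlib
import zlib


def crc32_of(data: bytes) -> int:
    """CRC-32 of the data: cheap, good at catching accidental corruption."""
    return zlib.crc32(data)


def sha256_of(data: bytes) -> str:
    """SHA-256 digest of the data, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def backup_matches(original: bytes, backup: bytes) -> bool:
    """True only if both the CRC-32 and the SHA-256 values agree."""
    return (crc32_of(original) == crc32_of(backup)
            and sha256_of(original) == sha256_of(backup))


original = b"quarterly-report contents"
good_copy = bytes(original)
corrupt_copy = b"quarterly-report c0ntents"  # one byte changed

print(backup_matches(original, good_copy))     # True
print(backup_matches(original, corrupt_copy))  # False
```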

Data Storage

It is important to think about where the backed-up data is being stored, how long it is retained, and how to make the backup process more efficient.

Offsite

Backups should be stored offsite, ideally in a location that is geographically remote from the primary system or data center. It’s a wee bit pointless having great backups if they were located right beside the primary system that just burned to the ground or floated away in a flood.

Tape Rotation

Tape rotation schemes are different methods of keeping backup tapes for a period of time and then re-using the tapes, overwriting the old data with new data. The exact rotation scheme that an organization chooses needs to be driven by their retention policy, which is in turn driven by regulatory and contractual requirements, restoration needs, and costs.

RPO

The Recovery Point Objective is the maximum tolerable data loss an organization is willing to accept, expressed as a measurement of time: 5 seconds worth of data, or 5 minutes, or 5 hours, or 5 days. You get the point. I raise the RPO here as it is a major driver of the cost of a backup solution: the shorter the RPO, the less data an organization is willing to lose, and therefore, the more expensive the backup solution is going to be. So if an owner wants to reduce costs associated with backups, they may need to look at relaxing their RPO requirement, that is, accepting a longer RPO.
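One way to see why a shorter RPO costs more is to compare it against how often backups run. The sketch below assumes, as a simplification, that the worst-case data loss equals the interval between backups (a failure just before the next backup runs). The function names and the nightly schedule are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """Simplifying assumption: at worst, everything since the last backup is lost."""
    return backup_interval


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """True if the backup schedule keeps worst-case data loss within the RPO."""
    return worst_case_data_loss(backup_interval) <= rpo


nightly = timedelta(hours=24)
print(meets_rpo(nightly, rpo=timedelta(days=5)))     # True
print(meets_rpo(nightly, rpo=timedelta(minutes=5)))  # False
```

Under this assumption, a 5 minute RPO cannot be met by nightly backups. It needs backups or replication at least every 5 minutes, which is where the extra cost comes from.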

Spare Parts

Now let’s switch topics slightly and talk about spare parts: spare power supplies, spare RAM, spare hard drives, etc. Any type of part you might put in a system.

Cold

A cold spare is simply one of these spare parts on a shelf somewhere. With cold spares, if the primary power supply fails, the system is going to be down for minutes, hours, or even longer, depending on how long it takes to get the spare part off the shelf and installed in the system so it can be brought back online.

Warm

A warm spare is a spare part that is installed in a system but is not powered on and ready to go. With warm spares, if the primary part fails, the system is still going to go down, but recovery time will be much shorter, as someone just needs to manually flip a switch to switch over to the spare part and get the system back up and running.

Hot

Hot spares are spare parts installed in the system AND powered on and ready to go. So, if the primary part fails, there will be an automatic switch over to the spare part and the system will remain up and running.

RAID (Redundant Array of Independent Disks)

Now let’s talk about how we can use multiple hard drives, simultaneously, to achieve greater speed, greater redundancy, or both.

RAID 0 (Striping)

Image of RAID 0 - Destination Certification

RAID 0, also known as striping, uses two or more hard drives. When a file is sent to the RAID controller, the file is split into two pieces: the first half is written to the first hard drive, and the second half of the file is written to the second hard drive. RAID 0, therefore, is all about speed, because we have essentially doubled our read and write speed, but at the expense of redundancy. RAID 0 at least doubles the chance of data loss, because if one of these drives fails you’ve lost half your file, which is essentially all of your file. So RAID 0 = speed.

RAID 1 (Mirroring)

Image of Raid 1 - Destination Certification

RAID 1, also known as mirroring, uses two or more hard drives. When a file is sent to the RAID controller, the file is copied: the first copy is written to the first hard drive, and the second copy of the file is written to the second hard drive. RAID 1, therefore, is all about redundancy, because if we lose a hard drive, we still have a complete copy of the file on the other hard drive. So RAID 1 = redundancy.

RAID 10 is not listed here because you now already know what it is. RAID 10, or RAID 1 plus 0, is RAID 1 and RAID 0 together. RAID 10 therefore requires a minimum of 4 hard drives. A file is mirrored and then striped, creating four fragments of data which are written to the 4 hard drives.
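The striping, mirroring and combined layouts described above can be sketched with plain byte strings standing in for drives. This is a toy model that follows the simplified two-halves description. Real controllers interleave fixed-size stripes across the drives, and all function names here are hypothetical.

```python
def raid0_write(data: bytes):
    """Striping (simplified): first half to drive 0, second half to drive 1."""
    mid = (len(data) + 1) // 2
    return [data[:mid], data[mid:]]


def raid1_write(data: bytes):
    """Mirroring: a complete copy on each of two drives."""
    return [data, data]


def raid10_write(data: bytes):
    """Mirror, then stripe each copy: four fragments for four drives."""
    return [piece for copy in raid1_write(data) for piece in raid0_write(copy)]


def raid1_read(drives):
    """Return the data from the first surviving drive (None marks a failed drive)."""
    for contents in drives:
        if contents is not None:
            return contents
    raise IOError("all mirrored drives have failed")


data = b"ABCDEFGH"
print(raid0_write(data))   # [b'ABCD', b'EFGH']
print(raid10_write(data))  # [b'ABCD', b'EFGH', b'ABCD', b'EFGH']

# Losing one mirrored drive still leaves a complete copy.
print(raid1_read([None, data]))  # b'ABCDEFGH'
```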

RAID 5 (Parity)

Image of Raid 5 - Destination Certification

RAID 5 is meant to be the happy medium: you get nearly the speed of RAID 0, you get the redundancy of RAID 1, and you don’t need as many hard drives as RAID 10. RAID 5 requires a minimum of 3 hard drives. When a file comes into the RAID controller it is split in half like RAID 0, and then the magic happens: some parity data is calculated using exclusive OR (XOR) math. This magical parity data allows you to reconstruct either piece of the original file from the remaining piece and the parity data. These three chunks of data, the two pieces of the file and the parity data, are written to the 3 hard drives.
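The "magic" of XOR parity is easy to demonstrate. The sketch below splits some data in half, computes the parity as the XOR of the two halves, and then rebuilds each half from the other half plus the parity. It is a toy model of the three-drive case only: the function names are hypothetical, odd-length data is padded with a zero byte as a simplifying assumption, and real RAID 5 distributes the parity blocks across all of the drives.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-by-byte exclusive OR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def raid5_write(data: bytes):
    """Split the data in half and compute the XOR parity of the two halves."""
    if len(data) % 2:
        data += b"\x00"  # pad so both halves are the same length (assumption)
    mid = len(data) // 2
    first, second = data[:mid], data[mid:]
    parity = xor_bytes(first, second)
    return first, second, parity


first, second, parity = raid5_write(b"ABCDEFGH")

# If the drive holding the second half fails, rebuild it from first XOR parity.
rebuilt_second = xor_bytes(first, parity)
# If the drive holding the first half fails, rebuild it from second XOR parity.
rebuilt_first = xor_bytes(second, parity)

print(rebuilt_first + rebuilt_second)  # b'ABCDEFGH'
```

This works because XOR is its own inverse: (first XOR second) XOR first gives back second.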

RAID 6 (Double Parity)

The last flavour of RAID that I will briefly mention here is RAID 6. RAID 6 is very similar to RAID 5 in that it provides speed and redundancy, but RAID 6 adds additional parity data such that TWO hard drives can fail and the data can still be recovered. RAID 6 essentially provides double redundancy.

Image of RAID summary table - Destination Certification

And here is a handy dandy summary of the different types of RAID. Remember the minimum number of hard drives required for each type of RAID.

High Availability System

High availability means we want a system that doesn’t go down in the event of a failure. We want redundancy at the system level. We can achieve high availability through clustering and redundancy.

Clustering

Clustering means we have multiple systems working together simultaneously to support a workload. Think a cluster of web servers behind a load balancer. If one of the members of the cluster goes down, the cluster is still running but at reduced capacity.

Redundancy

Redundancy means there are multiple systems, a primary and one or more secondary systems. These systems are not working together. Rather, the primary is doing all the work, and if it fails, the secondary system will take over to fully support the workload.
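The contrast between the two approaches can be sketched as follows. This is a hypothetical model, not a real load balancer or failover product: capacities are arbitrary units, and the function names are invented for the example.

```python
def cluster_capacity(node_capacities, healthy):
    """Clustering: every healthy node shares the workload."""
    return sum(cap for cap, ok in zip(node_capacities, healthy) if ok)


def active_system(names, healthy):
    """Redundancy: the first healthy system, in priority order, does all the work."""
    for name, ok in zip(names, healthy):
        if ok:
            return name
    return None  # nothing left to fail over to


# A three-node cluster that loses one node keeps running at reduced capacity.
print(cluster_capacity([100, 100, 100], [True, False, True]))  # 200

# A failed primary hands the whole workload to the secondary.
print(active_system(["primary", "secondary"], [False, True]))  # secondary
```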

Recovery Sites

Okay, now let’s talk about how we can recover not just a part, or a system, but whole data centers.

Types of Sites

We are going to talk through the different types of recovery sites starting from the cheapest option which requires the most time to recover, and building up to a redundant site which costs a ton of money but can potentially have zero downtime if the primary site goes down.

Cold

A cold site is just the shell of a building. No cabling has been run, no server racks are in place, no expensive equipment like servers, no data and no people. So cheap but it can take weeks to get a cold site up and running.

Warm

A warm site is the shell of a building plus the cheap equipment, like cabling and racks, but no expensive equipment like servers, no data and no people. A little more expensive, and recovery time is down to days.

Hot

A hot site is the building, the cheap equipment and the expensive equipment all set up and ready to go. All that is missing is the data and people to operate the site. Put another way, a hot site is a fully equipped data center. Hot sites are much more expensive, but now our recovery time is down to hours, maybe even seconds, as I’ll talk about in a moment.

Mobile

There are actually a couple of subtypes of hot sites. The first is a mobile site, which is simply a hot site on wheels. Typically, a shipping container crammed with equipment. Mobile sites can be moved to where they are needed and all that is required to get them up and running is data and people. So, recovery times for a mobile site are hours or possibly days if you have to transport the mobile site across the country first.

Mirror / Redundant

And the second sub-type of hot site is a mirror or redundant site. A mirror site is a fully operational data center, staffed and running 24/7, operating in parallel with the primary site. So huge cost, but recovery times for a mirror site can be seconds and possibly even instantaneous depending on how it has been architected.

Image of recovery site strategies summary table - Destination Certification

The RTO, the recovery time objective, is what is going to drive an owner to select between these different recovery solutions.
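To illustrate how the RTO drives the choice, the sketch below picks the cheapest site type whose typical recovery time fits within a given RTO. The ordering (cold, warm, hot, mirror) and the rough orders of magnitude (weeks, days, hours, seconds) come from the descriptions above, but the exact durations and the function name are illustrative assumptions only.

```python
from datetime import timedelta

# Cheapest first. The specific durations are assumed placeholders for
# "weeks", "days", "hours" and "seconds".
SITES = [
    ("cold", timedelta(weeks=2)),
    ("warm", timedelta(days=3)),
    ("hot", timedelta(hours=4)),
    ("mirror", timedelta(seconds=30)),
]


def cheapest_site_for(rto: timedelta):
    """Return the cheapest site type that can recover within the RTO, if any."""
    for name, typical_recovery in SITES:
        if typical_recovery <= rto:
            return name
    return None  # no listed option is fast enough


print(cheapest_site_for(timedelta(days=30)))    # cold
print(cheapest_site_for(timedelta(days=7)))     # warm
print(cheapest_site_for(timedelta(hours=8)))    # hot
print(cheapest_site_for(timedelta(minutes=1)))  # mirror
```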

Here is a summary of the different recovery site strategies.

Geographically remote

image of geographically remote map - Destination Certification

Any of these recovery sites should be built in a location that is geographically remote from the primary site. Geographically remote does not imply any exact distance, but rather far enough away from the primary site that whatever disaster has befallen it (earthquake, hurricane, flood, wildfire, massive power outage, etc.) will not also affect the recovery site.

Image of recovery strategies table - Destination Certification

And that is an overview of Recovery Strategies within Domain 7, covering the most critical concepts you need to know for the exam.

If you are looking for an in-depth CISSP study guide, you should check out our book, Destination CISSP: A Concise Guide.

We wrote the book to be as concise as possible and engaging to read with lots of tables, diagrams, summaries, etc. We cover all of these recovery strategies in detail along with all the other major topics you need to know for the exam.

You can find more details on our guidebook here: https://destcert.com/cisspguide/

Image of next mindmap - Destination Certification

If you found this video helpful you can hit the thumbs up button and if you want to be notified when we release additional videos in this MindMap series, then please subscribe and hit the bell icon to get notifications.

I will provide links to the other MindMap videos in the description below.

Thanks very much for watching! And all the best in your studies!
