CCSP Domain 2 - Cloud Data Storage MindMap

Transcript

Introduction

Hey, I’m Rob Witcher from Destination Certification, and I’m here to help you pass the CCSP exam. We are going to go through a review of the major topics related to cloud data storage in Domain 2, to understand how they interrelate, and to guide your studies.

Image of Cloud Data Storage table - Destination Certification

This is the second of five videos for Domain 2. I have included links to the other MindMap videos in the description below. These MindMaps are a small part of our complete CCSP MasterClass.

Cloud Data Storage

We’re going to start with a bunch of fundamental concepts related to storing data in the cloud including types of storage, controllers and clusters. Then we will discuss these in relation to the major service models. We’ll end the MindMap by going through the major threats to storage that you need to know about.

Types

So, let’s dive in by talking about the major types of storage that we have access to in the cloud.

Virtual constructs

Starting with the two major types of virtual storage that we have access to. What I mean by virtual here, is that as a customer we are not getting direct access to any sort of physical storage device like a hard drive. Put another way, we are not being assigned any dedicated hardware. Instead we are accessing types of virtual storage.

Volume

A volume is an emulation of a hard drive. A volume is a virtual hard drive. Volume storage is often referred to as block storage, and it works just like a traditional physical hard drive. The operating system manages the filesystem, and a volume is attached as storage for cloud instances (like virtual machines). So, think of a volume as a virtual hard drive for your virtual machine.

Object

Object storage is very different–there is no emulation of hardware here. Instead object storage is essentially a simplification of storage. Object storage stores data as objects in a flat namespace. Each object includes the data itself, metadata, and a unique identifier. Object storage is typically accessed via APIs rather than being attached like a virtual hard drive to a VM. Object storage is designed to handle massive amounts of unstructured data. Think S3 buckets, or Azure blob storage.
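
To make the flat namespace idea concrete, here's a minimal Python sketch of an in-memory object store. The class and method names (`FlatObjectStore`, `put`, `get`, `list_keys`) are made up for this illustration and are not any provider's API. The point is simply that each object is data plus metadata plus a unique identifier, and that "folders" are nothing more than key prefixes.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: the data itself, its metadata, and a unique identifier."""
    data: bytes
    metadata: dict = field(default_factory=dict)
    object_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class FlatObjectStore:
    """Toy object store (hypothetical): one flat namespace, no real directories."""

    def __init__(self):
        self._objects = {}  # key -> StoredObject

    def put(self, key, data, **metadata):
        obj = StoredObject(data=data, metadata=metadata)
        self._objects[key] = obj
        return obj.object_id

    def get(self, key):
        return self._objects[key]

    def list_keys(self, prefix=""):
        # "Folders" are only a naming convention: keys are filtered by prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


store = FlatObjectStore()
store.put("images/logo.png", b"\x89PNG...", content_type="image/png")
store.put("images/banner.png", b"\x89PNG...", content_type="image/png")
store.put("backups/db.dump", b"...", content_type="application/octet-stream")

print(store.list_keys("images/"))
```

Notice there is no filesystem and nothing to attach to a VM here: everything goes through `put` and `get` calls, which mirrors the API-driven access described above.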

CDN

A content delivery network–a CDN–is a distributed network of servers that delivers web content to users based on their geographic location, ensuring faster load times and reducing server load by caching content close to the user's location. CDNs are closely linked to object storage in the cloud. Typically, object storage is the primary repository for large amounts of unstructured data. The CDN accelerates the delivery of this data by caching it at edge locations closer to users, improving performance and reducing latency. Think AWS CloudFront or Akamai as examples of CDNs.
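
Here's a rough Python sketch of the relationship between object storage and a CDN. `Origin` stands in for the object store and `EdgeCache` for a single edge location; both are hypothetical classes invented for this example, and real CDNs add expiry, invalidation, and routing that are left out here.

```python
class Origin:
    """Stands in for object storage: the primary repository."""

    def __init__(self, objects):
        self.objects = objects
        self.requests = 0  # how many times the origin had to serve content

    def fetch(self, key):
        self.requests += 1
        return self.objects[key]


class EdgeCache:
    """Stands in for one CDN edge location close to the user."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, key):
        if key not in self.cache:      # cache miss: go back to the origin
            self.cache[key] = self.origin.fetch(key)
        return self.cache[key]         # cache hit: served from the edge


origin = Origin({"video.mp4": b"...video bytes..."})
edge = EdgeCache(origin)
first = edge.get("video.mp4")   # fetched from the origin
second = edge.get("video.mp4")  # served from the edge cache

print(origin.requests)
```

Only the first request reaches the origin. Every later request for the same object is answered at the edge, which is where the lower latency and reduced server load come from.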

Long-term

Long-term storage is designed for storing data that needs to be retained over–you guessed it–a long period of time. Examples include backups, archives, and compliance data. Long-term storage solutions are optimized for durability and cost-efficiency, rather than speed.

Ephemeral

Ephemeral storage refers to the temporary storage that exists only for the duration of a session or instance. When the instance is terminated or stopped, the data stored in ephemeral storage is lost. It disappears into the ether. Poof– gone!

Raw-disk

Raw-disk storage refers to storage volumes directly attached to a virtual machine or physical server. It gives low-level access to the disk, allowing full control for custom configurations and greater performance. Raw-disk is essentially a dedicated physical drive for a specific customer. This is going to be a lot more expensive. However, it can be useful for applications that require direct disk access for performance reasons, such as database systems, or where users need to manage partitions and file systems themselves.

Storage method

Okay, let’s now talk about a cool way that data can be stored in the cloud–through fragmentation and dispersion. Also commonly referred to as bit splitting.

Fragmentation & dispersion

Fragmentation refers to the process of breaking down large files into smaller pieces, or fragments, which are stored separately across a storage system. This is commonly done to optimize space or speed up access. Dispersion refers to the process of spreading or distributing data across different storage locations or nodes. Dispersion can provide redundancy, security, and improve performance.

The basic idea here is that when you save a file to the cloud, the cloud provider's systems will fragment the file into a number of fragments and then disperse these fragments across multiple storage nodes.
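
As a minimal sketch of that idea in Python: the function names below (`fragment`, `disperse`, `reassemble`) and the round-robin placement are assumptions made for illustration. A real provider's placement logic is far more involved and would add redundancy and encryption on top.

```python
def fragment(data: bytes, size: int) -> list:
    """Break data into fixed-size fragments (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def disperse(fragments, node_count: int) -> dict:
    """Spread fragments across nodes round-robin, remembering their order."""
    nodes = {n: [] for n in range(node_count)}
    for index, frag in enumerate(fragments):
        nodes[index % node_count].append((index, frag))
    return nodes


def reassemble(nodes: dict) -> bytes:
    """Collect every fragment from every node and restore the original order."""
    pieces = [item for stored in nodes.values() for item in stored]
    return b"".join(frag for _, frag in sorted(pieces))


original = b"Cloud data storage mind map"
nodes = disperse(fragment(original, 4), node_count=3)

for node, stored in nodes.items():
    print(node, stored)
```

No single node holds the whole file, and the file can only be put back together by collecting the fragments from all of the nodes in the right order.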

SSMS

There are two algorithms you need to know about that can perform this fragmentation and dispersion:

The first is Secret Sharing Made Short. The SSMS algorithm first encrypts the data, then divides the encrypted file into multiple fragments or "shares", and also splits the encryption key into shares using a secret sharing scheme. These shares are then distributed across different cloud servers or storage systems. A predefined number (a threshold) of shares is required to reconstruct the original data. If fewer than the threshold number of shares are accessed, no useful information can be gleaned from the individual fragments. By distributing these shares across different servers or geographical locations, SSMS protects against data breaches, as an attacker would need to access multiple cloud servers simultaneously to retrieve the full data.

SSMS is particularly useful for key management or storing highly sensitive data, ensuring that no individual piece of data is useful by itself. It also minimizes overhead, making it more efficient for cloud storage.
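
The threshold property is easier to see in code. The sketch below uses Shamir's secret sharing, a classic threshold scheme, to split a key into five shares where any three can rebuild it. This illustrates the threshold idea only and is not a full SSMS implementation. The function names, the choice of prime, and the 3-of-5 parameters are assumptions made for this example.

```python
import secrets

PRIME = 2**521 - 1  # a Mersenne prime, comfortably larger than a 256-bit key


def split_secret(secret: int, threshold: int, share_count: int):
    """Create share_count points on a random polynomial of degree threshold - 1.

    The secret is the polynomial's constant term, so it equals f(0).
    """
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, share_count + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule for evaluating f(x)
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Rebuild f(0) from the given points using Lagrange interpolation."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+)
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


key = secrets.randbits(256)            # stand-in for an encryption key
shares = split_secret(key, threshold=3, share_count=5)

print(recover_secret(shares[:3]) == key)   # any three shares are enough
print(recover_secret(shares[:2]) == key)   # two shares do not rebuild the key
```

Each share would live on a different server or in a different location, so compromising one or two of them gives an attacker nothing useful.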

AONT-RS

AONT-RS (All-Or-Nothing Transform with Reed-Solomon) combines two key technologies to both secure and make cloud data storage more resilient.

All-Or-Nothing Transform is a cryptographic transform that ensures that data is broken down into multiple fragments, where each piece is essential. Without all of the pieces, no useful information can be retrieved, making data harder to attack.

Reed-Solomon coding adds error correction to the data fragments. If some fragments are lost or corrupted (due to server outages or hardware failures), RS coding allows the original data to be reconstructed using the remaining fragments.

After the data is fragmented and protected, these pieces are distributed across multiple cloud locations. This makes it virtually impossible for a hacker to retrieve the data without access to enough of the pieces, while also ensuring that data can be recovered even if some fragments are missing.
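
To show the recovery side of this, here's a deliberately simplified Python sketch. It does not implement Reed-Solomon or the All-Or-Nothing Transform. Instead it swaps in a single XOR parity fragment, which can rebuild exactly one lost fragment, whereas Reed-Solomon can tolerate several. The function names and the four-fragment layout are assumptions for illustration only.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_with_parity(data: bytes, k: int):
    """Split data into k equal fragments (zero-padded) plus one XOR parity fragment."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    return fragments + [reduce(xor_bytes, fragments)]


def rebuild_missing(fragments):
    """Rebuild the single fragment marked None by XOR-ing all the others."""
    missing = fragments.index(None)
    survivors = [f for f in fragments if f is not None]
    repaired = list(fragments)
    repaired[missing] = reduce(xor_bytes, survivors)
    return repaired


data = b"sensitive customer record"
stored = encode_with_parity(data, k=4)   # 4 data fragments + 1 parity fragment
stored[2] = None                         # simulate a failed storage node

repaired = rebuild_missing(stored)
recovered = b"".join(repaired[:4])[:len(data)]  # drop the zero padding

print(recovered == data)
```

The extra fragment costs some storage, but it means one node can fail without losing the data. Reed-Solomon generalizes the same trade-off to multiple lost fragments.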

Cool algorithms!

Storage controllers

Moving on to storage controllers, which are hardware devices or software components that manage data flow between a computer system and storage devices such as hard drives, SSDs, or storage arrays. Storage controllers act as an interface between the server and the physical storage media.

Storage controllers perform tasks such as data caching, RAID management, and handling read and write operations.

Let’s now talk about three protocols that can be used by storage controllers to move data around.

iSCSI

iSCSI–the Internet Small Computer Systems Interface–is a protocol that enables the transmission of SCSI commands over a TCP/IP network, allowing you to access storage devices over a standard Ethernet connection. iSCSI does not natively support encryption, so other protocols (e.g. IPsec) must be used to encrypt iSCSI traffic.

FC

Fibre Channel is a high-speed networking technology used to connect systems to storage devices in a SAN–a Storage Area Network. FC offers dedicated, lossless, high-bandwidth communication.

As the name, Fibre Channel, suggests, fiber-optic connections are typically used, although Fibre Channel can also run over copper cabling. So this is an expensive and high-speed connection.

FCoE

FCoE–Fibre Channel over Ethernet–encapsulates Fibre Channel traffic within Ethernet frames, allowing Fibre Channel traffic to run over standard Ethernet networks, enabling network convergence.

FCoE is commonly used to consolidate storage and networking infrastructure, reducing the need for separate cabling and switches for Fibre Channel and Ethernet traffic.

Storage clusters

The next topic is storage clusters. A storage cluster is a group of networked storage devices or servers that work together to provide a unified storage solution. Clusters enhance data availability, scalability, and fault tolerance by distributing data across multiple nodes.

Tightly coupled

In a tightly coupled cluster, nodes (servers or storage devices) are closely connected and share memory or a high-speed interconnect. They often operate as a single system. Tightly coupled systems typically have low latency and high-speed communication between nodes.

Tightly coupled clusters are going to be more expensive, but they provide very high performance and low-latency storage.

Loosely coupled

Loosely coupled clusters are essentially the opposite. They consist of nodes that are relatively independent and communicate over a network with less frequent interaction. They do not share memory and have higher latency in communication. If performance and low latency aren’t critical, you can save some money, and loosely coupled clusters have the additional benefit of being easier to scale.

Image of a diagram that depicts the difference between tightly and loosely coupled clusters - Destination Certification

Here’s a diagram that depicts the difference between tightly and loosely coupled clusters.

Tightly coupled clusters offer high performance, with a higher cost.
Loosely coupled clusters have lower performance, at lower cost, and are easier to scale.

By service model

Let's now talk through which of these different storage options we have access to by service model.

SaaS

We’ll start with software as a service. It’s really simple–in SaaS, we have access to none of these storage options. SaaS means you are simply renting access to someone else’s application–you have zero control over the architecture of the software and what storage solution it’s using on the back-end.

Web interface

The major way that we have access to storage in a SaaS application is simply through a web interface. We can perhaps upload some files through the web-interface, or enter some data through forms, and we can download files or view data, but we have zero visibility or control over how that data is being stored.

PaaS

Moving on to platform as a service, we now have some options. When you are building your own custom application, you can define how that application stores data.

Structured & Unstructured

Depending on what sort of data you are storing, you could choose to store your data in a structured format: nice and neat columns and rows, like a table in a database. Or in an unstructured format, such as text files, images, videos.

Databases, NoSQL, Big data

If you are building an application, you can choose to connect it to a database. PaaS environments will offer managed relational databases where structured data is stored in tables with predefined schemas, often accessed using SQL. NoSQL storage solutions are another option, designed to handle large volumes of unstructured or semi-structured data. There are also big data storage solutions that are designed to manage and process massive datasets, often involving parallel processing and distributed storage.

Object

The final PaaS option we’ll discuss here is the aforementioned object storage. Your application can store its data as objects. This solution is ideal for storing large amounts of unstructured data such as images, videos, backups, and archives.

IaaS

Moving on again, we arrive at infrastructure as a service. This is where we have the most storage options, by far!

Volume

You have access to volume storage. Remember, a volume is essentially a virtual hard drive and you can attach one or more volumes to a virtual machine.

Object

You also have access to object storage.

Raw storage

Another option is raw-disk storage. Raw-disk is going to be more expensive, but it may be cost justified depending on the requirements.

Ephemeral

Ephemeral storage is temporary storage that exists only for the duration of a session or instance.

Databases, NoSQL, Big data

Finally, you also have access to various types of databases. To sum it up, you have access to all the options in IaaS because you have so much more control over the virtual environment.
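
If it helps your studying, the service model breakdown from this MindMap can be written down as a simple lookup table. This Python sketch just restates the options listed above. The dictionary name and the helper function are made up for this example.

```python
# Storage options by service model, as listed in this MindMap.
STORAGE_BY_SERVICE_MODEL = {
    "SaaS": {"web interface"},
    "PaaS": {"structured", "unstructured", "databases", "NoSQL", "big data", "object"},
    "IaaS": {"volume", "object", "raw-disk", "ephemeral", "databases", "NoSQL", "big data"},
}


def models_offering(option):
    """Return the service models (sorted by name) that expose a storage option."""
    return sorted(
        model for model, options in STORAGE_BY_SERVICE_MODEL.items()
        if option in options
    )


print(models_offering("object"))
print(models_offering("volume"))
```

Looking up "volume" returns only IaaS, while "object" shows up under both PaaS and IaaS, which matches the pattern of gaining more storage choices as you gain more control.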

Threats to Storage

Alrighty, let's move on to discuss some common threats to storage. None of these threats are unique to the cloud environment, but some of these threats are exacerbated in the cloud. So, you definitely need to be thinking about them.

Unauthorized usage

Unauthorized usage is where cloud resources, such as storage, are used without permission. It could result in over-consumption of resources, leading to unexpected costs or reduced performance for legitimate users.

Unauthorized access

Unauthorized access is where cloud storage is accessed without permission. This can result in the exposure of sensitive data, data integrity issues, or even deletion of important data. You want to carefully control who has access to data in the cloud. As you’ve likely seen by the almost daily data breaches in the cloud, it’s relatively easy to misconfigure settings and expose your data to the world.
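
Misconfiguration is worth checking for automatically. Here's a minimal Python sketch that flags storage buckets granting access to everyone. The inventory format (a `grants` list with `grantee` and `permission` fields, and `"*"` meaning anyone) is a hypothetical schema invented for this example, not any cloud provider's real policy format.

```python
# Hypothetical bucket inventory; the schema is invented for this example.
buckets = [
    {"name": "marketing-assets",
     "grants": [{"grantee": "*", "permission": "read"}]},
    {"name": "payroll-exports",
     "grants": [{"grantee": "finance-team", "permission": "read"}]},
    {"name": "customer-backups",
     "grants": [{"grantee": "*", "permission": "write"},
                {"grantee": "ops-team", "permission": "read"}]},
]


def find_public_buckets(buckets):
    """Return the names of buckets where any grant applies to everyone ("*")."""
    return [
        bucket["name"] for bucket in buckets
        if any(grant["grantee"] == "*" for grant in bucket["grants"])
    ]


exposed = find_public_buckets(buckets)
print(exposed)
```

A public grant is sometimes intentional, as it might be for marketing assets, so a check like this is a prompt for review rather than proof of a breach.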

Regulatory non-compliance

Regulatory non-compliance is failing to comply with regulations, such as GDPR or HIPAA. The consequences can be legal penalties such as fines. Regulatory bodies require specific controls for data security, retention, and privacy, which must be adhered to when using cloud storage.

Denial of Service (DoS)

Denial of service (DoS) is the threat where an attacker seeks to make a system–or in this case, some data–unavailable to its intended users by temporarily or indefinitely disrupting services. This can prevent access to data or applications and affect service continuity.

Theft or accidental loss

Theft or accidental loss is the threat where data stored in the cloud can be accidentally deleted or lost due to human error or infrastructure failures. Additionally, data can be stolen if proper security measures are not in place.

Malware

Malware is malicious software, such as ransomware or viruses, which can infect cloud storage, encrypting or corrupting data. This can happen through insecure APIs, user devices, or compromised credentials. And the results can be data corruption, loss, and ransom demands, leading to financial losses and downtime.

No sanitization

Lastly, but certainly not least–this is important! No sanitization, or lack of proper sanitization, is where sensitive data is not properly removed or destroyed to prevent it from being recovered. For example, if a hard drive fails and is not securely disposed of, then someone could recover sensitive data from the failed drive. If you are storing sensitive data in the cloud, it’s important to understand your cloud provider’s data sanitization processes and ensure they meet your requirements.

Image of Cloud Data Storage table - Destination Certification

That’s it for our MindMap on Cloud Data Storage in Domain 2, covering many of the essential topics you need to know for the exam.

If you found this video helpful you can hit the thumbs up button and if you want to be notified when we release additional videos in this MindMap series, then please subscribe and hit the bell icon to get notifications.

I will provide links to the other MindMap videos in the description below.

Thanks very much for watching! And all the best in your studies!
