How the data lifecycle and cloud services cause new problems
In the cybersecurity world, data is everything. Organisations are chock-full of data flowing in and out of their networks, the lifeblood of business operations, and cloud application and platform services only add new complications to the cybersecurity conundrum.
Understanding and managing the data lifecycle is a significant step towards understanding, handling, prioritising and properly protecting important data as it journeys through, out of and back into our organisation.
So, the question is, what is the data lifecycle and how do we map the data lifecycle? Well, first we need to talk about states and state models.
State models
One of the first things we need to do is map the data lifecycle with a state model, just as we develop models to describe software architectures and security architectures.
State models are very common in computing and are used for a variety of things:
- Regular expressions use a simple state model.
- Abstract data types can use state models. Programmers implement the models with either data structures or state tables.
- Lifecycles are commonly based on state models. Since they illustrate cycles, we typically draw them as a circle.
To build a state model, we identify the different states the system can reach, choosing states that relate to properties we consider valuable, like correct behaviour and security properties. We represent each state with a box.
For example, a port on a server could have two security states: closed and open. If we only care about whether the port is open or closed, we create a two-state model and identify which actions, processes and events cause the state to change, switching the port between open and closed.
We can also show events that don't necessarily, or at least immediately, cause a state change, like port knocking.
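The two-state port model above can be sketched as a small transition table. This is a minimal illustration with hypothetical event names, not a real port implementation:

```python
class PortStateModel:
    """Two-state model for a server port: 'closed' <-> 'open'."""

    # Transition table: (current_state, event) -> next_state.
    # Events absent from the table (e.g. a single knock in a
    # port-knocking sequence) leave the state unchanged.
    TRANSITIONS = {
        ("closed", "service_start"): "open",
        ("open", "service_stop"): "closed",
        ("closed", "knock_sequence_complete"): "open",
    }

    def __init__(self):
        self.state = "closed"

    def handle(self, event: str) -> str:
        # Look up the transition; unknown (state, event) pairs are no-ops,
        # which is how we model events that don't change the state.
        self.state = self.TRANSITIONS.get((self.state, event), self.state)
        return self.state


model = PortStateModel()
model.handle("single_knock")             # no transition: still "closed"
model.handle("knock_sequence_complete")  # transition: now "open"
```

The transition table makes correct and incorrect behaviour explicit: any event sequence that reaches "open" without a `service_start` or a completed knock sequence would signal a modelling or implementation error.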
More detailed models have more states and events. A good model reflects the system's structure and organises its essential design features and details, enabling developers and testers to identify correct and incorrect behaviour and to compare secure and insecure conditions.
The data lifecycle
Every system which processes data follows a data lifecycle, although the details vary depending on what the system’s focus is. Organisations create a lot of data through their accounting systems, other internal operations and by collecting it from users.
At its simplest, the data lifecycle can be broken down into these basic categories:
- We first create a data item.
- We store and then use it.
- We share it.
- We archive it and then destroy it when we're finished with it.
- After destruction, we can reuse the storage space for a new data item and the cycle repeats.
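The stages above form a cycle that we can express as a small state model. This is an illustrative sketch; the stage names and allowed transitions are our own simplification:

```python
# Lifecycle stages as simple string constants.
CREATE, STORE, USE, SHARE, ARCHIVE, DESTROY = (
    "create", "store", "use", "share", "archive", "destroy",
)

# Allowed transitions between stages. Data moves back and forth
# between storage and use, and destruction frees the space so the
# cycle repeats with a new item.
LIFECYCLE = {
    CREATE: {STORE},
    STORE: {USE, ARCHIVE},
    USE: {STORE, SHARE},
    SHARE: {STORE},
    ARCHIVE: {DESTROY},
    DESTROY: {CREATE},  # reuse the storage space: the cycle repeats
}

def is_valid_transition(current: str, nxt: str) -> bool:
    """Return True if the lifecycle permits moving from current to nxt."""
    return nxt in LIFECYCLE.get(current, set())
```

A validator like this lets testers flag impossible journeys, for example a record jumping straight from storage to destruction without passing through the archive stage.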
The data lifecycle mainly applies to persistent data that a company collects and maintains, normally in a database. The data we utilise in most programs and processes is really just a temporary copy of data stored in databases.
Data states
After creating a data item, we can use the data lifecycle to track how it is used in the system and what risks it is exposed to on its journey. Often, data moves back and forth between storage, where we can retrieve it, and processing. We will often also transmit data between systems and storage devices.
When a data item is no longer needed, we should delete the data and the cycle repeats. Data deletion is taken very seriously these days due to privacy regulations and compliance.
All of these elements involve event transitions. After the creation of a data record, we move the data through the storage and transmission states and back again as we go through the cycle.
The states of live data can be broken down into three widely known states:
- Processing is data in use.
- Storage is data at rest.
- Transmission is data in motion.
To keep track of security status, we often associate states with locations relative to the organisation's trust boundary. Generally, transmission moves data outside of our trust boundary.
Using state models and data lifecycles, we can get a more precise expression of system goals that developers and testers can more effectively test against. This helps with implementing more reliable techniques to protect data in motion and data at rest.
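The three states and their position relative to the trust boundary can be tagged in a small model. This is a minimal sketch with names of our own choosing, not a standard taxonomy:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataState:
    name: str
    inside_trust_boundary: bool
    typical_protection: str  # the usual control for this state

# The three live-data states from the list above.
STATES = [
    DataState("in_use", True, "process isolation and access control"),
    DataState("at_rest", True, "storage encryption (e.g. AES)"),
    DataState("in_motion", False, "transport encryption (TLS)"),
]

# Data in motion generally leaves our trust boundary, so its
# protection has to travel with the data itself.
untrusted = [s.name for s in STATES if not s.inside_trust_boundary]
```

Making the boundary explicit in the model is what lets testers check that every state crossing the boundary carries its own protection.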
Data states and cybersecurity
If we haven't applied cryptography to our data in each state, then the data is plaintext.
This is not very good for cybersecurity, especially considering that if we use a cloud service, our processing will take place at a remote site with its own trust boundary.
Packet Sniffing
One of the biggest problems with data in motion is that if it remains unencrypted as it travels from point A to point B over the internet, it can be intercepted and copied by malicious actors performing traffic sniffing with a software or hardware tool such as Wireshark, breaching the confidentiality of an organisation.
To prevent packet sniffing with standard transport security, we implement the TLS protocol (the successor to SSL) to encrypt traffic. This turns the data in motion into ciphertext, preventing a sniffing attack from easily reading our communications.
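Wrapping a connection in TLS can be sketched with Python's standard library. The host name below is a placeholder, and the function is only defined here, not called, since it would open a real network connection:

```python
import socket
import ssl

def fetch_over_tls(host: str, port: int = 443) -> bytes:
    """Open a TLS-protected connection so that data in motion travels
    as ciphertext; a packet sniffer sees only the encrypted stream."""
    # create_default_context() verifies the server's certificate chain
    # and host name by default.
    context = ssl.create_default_context()
    with socket.create_connection((host, port)) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls:
            tls.sendall(b"GET / HTTP/1.1\r\nHost: " + host.encode() + b"\r\n\r\n")
            return tls.recv(4096)
```

Everything written to the wrapped socket is encrypted before it reaches the wire, which is exactly the property that defeats the sniffing attack described above.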
Internal sniffing on a cloud server
Cloud providers maintain their own trust boundary, and are in charge of supplying the processing, resources, and standard mechanisms for that boundary. This can cause problems when you take into account that it is your company data that they are hosting.
It’s important to remember that you aren’t the only customer of the cloud provider you do business with. Cloud providers have numerous customers, likely including competitors and threat actors that buy access for the express purpose of facilitating cyber attacks.
Essentially, if you store unprotected plaintext data on a cloud provider’s server, you’re still open to things like sniffing attacks.
Therefore, a company should:
- Establish a separate trust boundary around enterprise resources, taking advantage of access restrictions implemented by the cloud provider.
- Protect plaintext in the cloud with the company’s own software running in the cloud process, alongside the provider’s security software, so that both the company and the provider are protecting the data.
However, this still has disadvantages, such as:
- Losing some of the protection when processes stop running.
- Losing protection if the Cloud provider is successfully hacked.
- Losing the remaining protection if physical security fails, such as if a storage drive is removed from the provider's physical environment.
To reduce our attack surface, companies should avoid plaintext on the cloud server except when actively processing data. Any data stored for later use in a database on the cloud provider’s service should be encrypted.
We use different crypto techniques for data at rest and for data in motion. A major difference manifests in key management: we use different cryptographic keys for each. While the Advanced Encryption Standard (AES) is used underneath in both cases, we use standard public-key techniques to share the cryptographic keys that safeguard data in motion.
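Encrypting a record before it leaves for cloud storage can be sketched with AES-GCM. This is a minimal illustration assuming the third-party `cryptography` package is installed; real deployments would manage the key in a key-management service rather than in the process:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Data-at-rest key; in practice this stays inside the company's own
# trust boundary and never travels to the cloud alongside the data.
key = AESGCM.generate_key(bit_length=256)
aead = AESGCM(key)

def encrypt_record(plaintext: bytes) -> bytes:
    """Return nonce + ciphertext; this blob is what the cloud stores."""
    nonce = os.urandom(12)  # must be unique for every encryption
    return nonce + aead.encrypt(nonce, plaintext, None)

def decrypt_record(blob: bytes) -> bytes:
    """Split off the nonce and recover the plaintext."""
    nonce, ciphertext = blob[:12], blob[12:]
    return aead.decrypt(nonce, ciphertext, None)

stored = encrypt_record(b"customer record")  # only ciphertext reaches the cloud
```

Because only the ciphertext is stored remotely, a sniffing attack inside the provider's boundary, or a removed storage drive, yields nothing readable without the key.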
Securiwiser
Securiwiser is a cybersecurity threat-detection and monitoring tool that evaluates your company’s cybersecurity posture and flags vulnerabilities and active exploits in real time. It checks factors such as the security of your network and cloud, suspicious port activity and scanning, and data exposure, presenting everything in an easy-to-read dashboard.