Before you read this post, it is advisable to read the GDPR Summary.
The GDPR is light on specifics about how you need to host and manage your software. This is deliberate, given how quickly technology changes. There are, however, some core principles:
- Systems must be secure by design and by default (Article 25). This very open-ended statement in the GDPR puts the onus on you to keep up with best practice.
- You must ensure that your processing is done in a manner that ensures "appropriate security" of the data, using both technological and organisational measures, and that these measures are periodically tested. See Articles 5.1.f, 24 and 32, and Recitals 74 and 78.
- Report data breaches within 72 hours of becoming aware of them. Recital 78 also talks about the need to establish technological and organisational measures to ensure that breaches are identified quickly.
- Encryption is not required by the GDPR, but is specifically referenced and recommended.
- Losing the data (for example, data being deleted when you have no backup) is also considered a data breach under the GDPR, meaning that an inadequate backup strategy will put you in breach of the GDPR.
From an Ops point of view, it is important to understand that the GDPR puts just as much emphasis on having appropriate processes in place as it does on technological protection: you need to limit which of your own people can access which data and track what they do with it, not just focus on foiling external attackers. For many smaller organisations this is something that has not been considered before, and it will require both effort and, often, a change of culture.
At the same time, the GDPR is clear that you need to assess the risk: the level of security you have to apply depends on your risk assessment; there is no fixed set of requirements.
The first step, as always, is to ensure your application developers have done what they can to make sure the software itself is as solid as can be.
Thereafter you need to consider adding additional security layers, for example:
- A Web Application Firewall for websites: there are many providers and variations, from Azure's WAF to a range of products from Cloudflare. These products can detect a range of typical attack patterns and stop them before they even hit your website.
- VNets and ports: lock down your network, ensure your back-end servers are only accessible from within a VNet, remove RDP ports from servers, and so on. If you are hosting in Azure, for example, some, but definitely not all, of these are done by default; the VNet configuration in particular is not.
- Use end-to-end encryption, including between your web server and database server, even when you are also using a VNet.
Once you have taken steps to protect your application from attacks through the front door, you face the much thornier issue of preventing data from being leaked by insiders, whether maliciously or through social engineering or indirect attacks. Similarly, you need to plan for the situation where attackers do get through the front door, so that you limit how much data they can access.
The first step is to implement appropriate internal processes for restricting access to data. The principles in ISO27001 are a good starting point and are a lot less daunting than you might imagine. At a minimum, implement the following policies and processes:
- Create a policy that no-one has access to any data by default unless they are specifically granted access to it for a reason. This includes things like database access, access to disk drives, log files, backups etc.
- Create a list of your systems and the data they hold (an Information Asset Register). A simple Excel spreadsheet will do. On that list, include:
- A rating for how sensitive the data in the system is.
- An assessment of how bad it would be for your organisation if that data were leaked.
- A nominated Information Asset Owner who is responsible for controlling that data.
- A list of everyone who currently has access to that data (when you first compile it, this will likely scare you).
- Create a very simple, but written-down, process whereby anyone who wants access to any data must first seek the approval of the Information Asset Owner, who is responsible for granting or denying it and for updating the Information Asset Register. This is a simple but powerful way to ensure there is accountability for data access.
You don't have to make any of this difficult, in fact you can make it run very easily. The key is that you ensure that giving someone access to data is a deliberate choice.
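To make the idea concrete, here is a minimal sketch of an Information Asset Register entry with a deny-by-default approval workflow. All the names here are hypothetical, and a spreadsheet plus a short written procedure works just as well; the point is only that access becomes a deliberate, recorded decision by the owner.

```python
from dataclasses import dataclass, field

@dataclass
class InformationAsset:
    name: str
    owner: str                    # the nominated Information Asset Owner
    sensitivity: str              # e.g. "low", "medium", "high"
    impact_if_leaked: str         # how bad a leak would be for the organisation
    access_list: set = field(default_factory=set)  # starts empty: deny by default

    def grant_access(self, requester: str, approver: str) -> bool:
        # Only the nominated owner may approve access; anyone else is denied.
        if approver != self.owner:
            return False
        self.access_list.add(requester)
        return True

# Usage: no one has access until the owner explicitly grants it.
crm = InformationAsset("CRM database", owner="alice",
                       sensitivity="high",
                       impact_if_leaked="severe: contains customer PII")
assert not crm.grant_access("bob", approver="bob")   # self-approval is denied
assert crm.grant_access("bob", approver="alice")     # owner approves: recorded
```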
Note: none of this advice about processes comes from the GDPR; the GDPR contains no specific recommendations about how you do this, so use it as you see fit.
One interesting point in the GDPR is that you may not need to report a data breach if the data was encrypted. The reporting obligation is based on risk to the fundamental rights and freedoms of individuals; if a data breach, or indeed an outage or a loss of data, causes risk to individuals, then you need to report it. Conversely, if you are confident there is no risk because the data is unreadable by the attacker then you (probably) do not need to report it.
You need to make sure all personal data is encrypted at rest; that is your minimum baseline. If you use Azure, that is generally quite easy: both SQL Azure and CosmosDB support it via a tick-box, as does Blob storage, as long as the storage account was created relatively recently. If you have an older Blob storage account, you will have to create a new one and move all the files.
However, encryption at rest really only helps if someone steals the hard drive. It does not prevent someone with legitimate access to your database from downloading unencrypted data, or a hacker who has got hold of valid database credentials from doing the same.
You therefore need to consider further measures to encrypt data or restrict it from legitimate users. This usually means application-level encryption, which can be complicated to implement.
If you are working with SQL Server or Azure, there are a few technologies you may want to consider:
- SQL Always Encrypted is application-level encryption implemented at the database-driver level, so it can be implemented with less disruption to the application. It even has some very clever, if partial, support for searching the encrypted data.
- SQL Dynamic Data Masking on Azure allows you to obscure data from certain database users.
- Azure Key Vault allows you to store secrets securely and even has some encryption/decryption features you can use to secure data.
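As a rough illustration of what masking gives you, here is an application-side sketch in Python. This is not how SQL Dynamic Data Masking is configured (the real feature is set per column on the server, with no application code); the functions below are only meant to show the effect, loosely modelled on its email and partial masking styles.

```python
def mask_email(email: str) -> str:
    """Keep the first character and mask the rest, similar in spirit
    to the email masking style used by SQL Dynamic Data Masking."""
    local, _, _ = email.partition("@")
    return f"{local[:1]}XXX@XXXX.com"

def mask_partial(value: str, prefix: int = 0, suffix: int = 4, pad: str = "XXXX") -> str:
    """Keep a prefix and suffix and mask everything in between,
    similar in spirit to a partial masking rule (e.g. card numbers)."""
    return value[:prefix] + pad + (value[-suffix:] if suffix else "")

print(mask_email("jane.doe@example.com"))   # jXXX@XXXX.com
print(mask_partial("4111111111111111"))     # XXXX1111
```

The value of doing this server-side, as Dynamic Data Masking does, is that low-privileged database users never see the unmasked data at all, with no application changes.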
You need to review all the different pieces of data, databases and so on that you have, and restrict who can access what. If you are on-premises, that is about as much as we can tell you, except, perhaps, to have a long, hard think about how you restrict access to indirect vectors such as disk drives, backup tapes and so on.
If you are on Azure, there are a number of technologies you should look at;
- Ensure that all user access to the Azure Portal and to Azure SQL uses an Azure AD account. You can do this even if you don't use Office 365; you just need to set up an Azure AD. Do not allow access from Live accounts.
- Require two-factor authentication (2FA) for all users with access to any data asset. This even gives you 2FA on SQL logins, which is very powerful.
- As far as possible use Service Principals for applications accessing other Azure data assets, such as SQL Azure. With Azure Managed Identity, you can get a Service Principal injected into your application without having a password stored anywhere. This, in practice, means it is impossible for anyone to log in as the application, thus simplifying your audit requirements.
- Use Azure Key Vault to store all other connection strings, secrets and certificates and use the Managed Principal to retrieve those secrets at runtime.
SQL Vulnerability Assessment is specific to SQL Server, whether on-premises or on Azure, but it is such a powerful tool that it has been included here. In short, when you run it against your SQL database, it will assess how secure your SQL configuration is. It will even look for column names that appear to contain sensitive data, so that you may choose to encrypt or mask those columns.
The Vulnerability Assessment also allows you to save a "baseline", meaning you can come back later and compare what has changed. This kind of security configuration management can be a useful tool in protecting your data and will help evidence what you have done to guard against data leaks.
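The baseline idea is simple enough to sketch: save a snapshot of your security-relevant settings, then diff later scans against it and investigate anything that has drifted. The dictionary format below is made up for illustration; it is not the Vulnerability Assessment's actual output format.

```python
def diff_against_baseline(baseline: dict, current: dict) -> list:
    """Return a sorted list of settings that have drifted from the saved baseline."""
    changes = []
    for key in baseline.keys() | current.keys():
        old, new = baseline.get(key), current.get(key)
        if old != new:
            changes.append(f"{key}: {old!r} -> {new!r}")
    return sorted(changes)

# Hypothetical snapshot of security settings, taken when you accepted the baseline.
baseline = {"tde_enabled": True, "public_access": False, "auditing": True}
current  = {"tde_enabled": True, "public_access": True,  "auditing": True}
print(diff_against_baseline(baseline, current))  # ['public_access: False -> True']
```

Keeping the diff history also doubles as evidence, under Article 5's accountability principle, of the steps you have taken.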
At this point you have reduced your attack surface as much as practicable; everything is locked up tight and no one can get in. Except there is always a way, and the GDPR requires you to establish "technological and organisational measures to ensure that breaches are identified quickly".
In order to do that you need to turn to auditing, monitoring and alerting. There are many great tools for this, but if you are using or thinking about using Azure, have a look at Log Analytics; it has some very powerful features. Azure is moving all its logging into Log Analytics, and you will soon be able to centralise all logs, including all Azure infrastructure logs as well as your own trace logs, in one place and set up alerts on them.
Audit data access
Make sure that you audit all database access. Consider setting up alerts if anyone or anything other than the application accesses the database, or if queries are run that return more than a certain number of records.
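The two alerting rules above can be sketched as a simple pass over audit log entries. The field names and the application principal name below are hypothetical; a real Azure SQL audit log has its own schema, and you would run this kind of rule in your log tooling rather than in application code.

```python
APP_PRINCIPAL = "app-service-identity"   # hypothetical name of the application's login
ROW_THRESHOLD = 1000                     # tune to your own workload

def audit_alerts(audit_log):
    """Yield an alert for any access by a non-application principal,
    and for any query returning an unusually large number of rows."""
    for entry in audit_log:
        if entry["principal"] != APP_PRINCIPAL:
            yield f"non-application access by {entry['principal']}"
        if entry["rows_returned"] > ROW_THRESHOLD:
            yield f"{entry['principal']} read {entry['rows_returned']} rows"

log = [
    {"principal": "app-service-identity", "rows_returned": 12},   # normal traffic
    {"principal": "dba-jane", "rows_returned": 50_000},           # fires both rules
]
for alert in audit_alerts(log):
    print(alert)
```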
Monitor for attack patterns
There are many kinds of "soft" attacks that may not directly trigger alerts from external tools because they are application-specific, but an increase in them may indicate that an attack is underway. It is a good idea to set up alerts that notify you when their incidence over a time period is higher than usual. The exact things to look for will be application-specific, but here are a few ideas to get you started:
- Failed logins: a sudden increase in the number of failed logins to your application per time period could indicate that someone is trying to brute-force the system.
- A sudden increase in general traffic could be a warning that something is wrong.
- An increase in 403 and/or 404 responses could indicate a few different potential attacks:
- Someone is probing your system to find unsecured URLs.
- A legitimate user may be accessing data they shouldn't see, by enumerating the ID in a URL such as https://mystem.com/accounts/123
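The "higher than usual" alerting idea above can be sketched as a simple rolling-baseline check: compare this period's count of some event (failed logins, 403s, 404s) against the average of recent periods. The thresholds below are illustrative assumptions; real monitoring tools offer far more sophisticated anomaly detection.

```python
from collections import deque

class RateAnomalyDetector:
    """Flag a period when the count of some event (failed logins, 404s, ...)
    is well above the average of the recent periods before it."""
    def __init__(self, history_periods: int = 24, factor: float = 3.0, min_count: int = 10):
        self.history = deque(maxlen=history_periods)  # rolling window of past counts
        self.factor = factor                          # how far above baseline is "unusual"
        self.min_count = min_count                    # ignore noise at very low volumes

    def observe(self, count: int) -> bool:
        """Record this period's count; return True if it looks anomalous."""
        baseline = sum(self.history) / len(self.history) if self.history else 0.0
        anomalous = count >= self.min_count and count > baseline * self.factor
        self.history.append(count)
        return anomalous

detector = RateAnomalyDetector()
quiet = [detector.observe(c) for c in [3, 5, 4, 6, 2]]  # normal hourly counts
spike = detector.observe(60)                            # sudden burst of failed logins
print(quiet, spike)  # [False, False, False, False, False] True
```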
If you use SQL Azure, you should switch on Threat Detection: it's a machine-learning tool that will detect both known attack patterns (such as SQL injection or brute-force attacks) and unusual activity that deviates from the baseline, such as logging in from an unusual location.
Erasure and deletion
The GDPR has clear rules on the need to delete data you no longer need and on individuals' rights to tell you to delete their data. There is no exemption from those rules. However, there are times when this is not technically practicable, including but not limited to:
- Temporal tables
- Log files
- Audit files
Unfortunately, there is no good answer to this at the time of writing. Hopefully the ICO will eventually publish some sensible guidelines on this. In the meantime, do the best you can and document what you cannot do, so that you can show you made every effort to comply.