One Problem at the Center of it all – Data Cannot Protect Itself


Cybersecurity is one of the top concerns for individuals, companies, governments, and organizations today because – despite all of the tools, products and services available for cybersecurity – data is still being lost at an ever-increasing rate.  The crisis of data loss is simply non-stop. Every day, we see headlines about another data breach, identity theft, or cryptocurrency loss.

Something is missing.  Something has to change, because continuing to do the same thing and getting the same wrong results is not sustainable.  We have to accept that what we are doing is not working, and we need to get to the core of the problem.  Why can we not stop the loss of data in our lives?

There is a simple reason: the entire cybersecurity market of tools, products and services is built on a single premise, that data cannot protect itself, and therein lies the problem. We are trying to solve a problem from the inside out.  Our base premise needs to change.

One single hole in the wall – and it was all for nothing

Our data is vulnerable to anyone who can get to it. Once data is out of our realm of control, we have to assume that it is gone and that we are never getting it back.  We cannot retrieve it, control it, or even see what is happening to it. Data is the one common denominator across all IT systems and solutions, yet it is the weakest link.

So herein lies the problem:  data cannot protect itself and this has been an unsolvable problem.  Until now.

In today’s cybersecurity market, there are really two categories of technology for the protection of data, and they apply – in simple terms – to a) the protection of data itself, and b) the forensics and threat intelligence we use to understand where data goes and what happens to it when it is accessed.

For the first half – the protection of data itself – we try to rely on a multi-faceted defense-in-depth approach surrounding our data, which may include, but is not limited to, many of the following:

  • For the network: Data Loss Prevention – which is meant to keep data in place and not let it leak out of your systems if possible. This solution requires a myriad of other solutions to properly work as well (we’ll drill down on complexities of DLP in our next blog post).
  • For the devices: End-Device Protection/Management – which is meant to check endpoints to ensure they are safe for data to reside on and try to ensure that bad actors cannot get to data on individual devices.
  • For the users: Identity Management – to ensure that only the right people get access to the right resources – as long as those identities can be kept safe and secure themselves.
  • For the data: Data Encryption – to ensure that only the right people can see data – but once access has been authorized and the data has been shared, simple one-time encryption no longer protects it.  Key management (another future blog post) has also been incredibly problematic, with no easy solutions to date.
  • For the servers and compute resources: Cloud, datacenter and networking protections – to ensure that data stays where we want it to stay and is accessed only how we want it accessed.
  • And more….and more….

For forensics and threat intelligence, it is also a multi-faceted defense-in-depth approach:

  • For the servers and compute resources: Firewall logs – to understand the ingress and egress of data from our known physical datacenter
  • For the devices: Endpoint control system logs – to understand how data is brought under our controls, accessed, and moved off mobile devices, desktops, laptops and servers
  • For the multiple cloud-based services: CASB logs – to understand how data flows in and out of cloud-based resources
  • For intrusion and insider threat prevention: IDS/IPS – to try to detect intruder behavior and understand the flow of attacks – unfortunately in a reactive state, since the behavior has to happen before we can catch it
  • And to pull it ALL together: SIEM – to bring all of the solutions above into one common platform for analysis of what happened to data after the loss has already occurred – in the hopes of stopping future behavior or analyzing an attack that has already happened

We need all these solutions to try to keep data protected, all because data cannot protect itself.  The costs, complexities, intricacies, synchronization, and maintenance of all these defense-in-depth models are overwhelming – and the models are still not as effective as we need them to be.  Data continues to be lost at an ever-increasing rate.  The result is that we are forever in “pray and react” mode when protecting our data.  We are hoping for the best, depending on multiple technologies being tied together and unified at a level that has yet to be achieved in most implementations.

Praying for the protection of data – and reacting to the results

Praying for Protection: with all the tech and solutions listed above required to make even the most basic cybersecurity operations work, even when we get it all right, too many variables can make it go wrong:

  1. Identity must be pristine and incorporated seamlessly across all layers of the architecture. Identity hygiene and maintenance remain a critical exposure point for any organization, even if they get it right initially.
  2. All points of control must be synced for optimal protection.  The endpoint devices need to receive all updates and protections at a moment’s notice.  The servers, desktops, and mobile devices all need to be up to date – even one aberration is enough for a compromise to happen.
  3. All network ingress/egress points need to be protected and monitored across the data centers, virtualization, and cloud resources.  Any gaps in the armor for the network will be exploited.

In other words, we are praying we got everything right, praying that we didn’t miss anything, and doing our very best through people and processes to ensure all of this is working to protect data – all because data cannot protect itself.

Reacting to situations – after the fact: All threat management and intelligence systems need the feeds from everything above to be complete and accurate – and even when this is the case and everything is working optimally, it is still a reactive control that will only catch aberrations or threats after the fact.

The irony of it all – data cannot – and should not – be contained

The entire model here is for data to be kept in place, or contained in a certain area, or not accessed outside of our control, or kept in our visible landscape – in other words, everything we should not expect data to do.  Data is meant to be “out there”, working for us, collaborating for us, driving new interactions, creating new opportunities.  To try to contain data is anathema to the nature of data usage today.  Yet we still must protect it, because data can contain everything a bad actor needs to take over our lives, our identities, our finances, our health, our businesses.  We need to protect data, but it cannot protect itself, so we need data to evolve. We need it to understand what our intentions are, no matter where it goes, how many copies are made, or where it resides or travels to.  Data needs to become smart.

Bringing it all together – data is the common denominator

Instead of trying to build a better “data mousetrap”, it’s time to reverse the threat model.  We need a fundamental change at the one level common to all of technology and IT – the data itself.  We need a fundamental change in our thinking and capabilities.

We need data to be able to:

  1. Know where it is, where it is allowed to be, where it is not allowed to be, and be capable of protecting itself if it’s in the wrong place
  2. Know who it is with and what that person is allowed to do under the current circumstances, depending on where they are and on what device
  3. Know what level of access it should provide to this person under these conditions
  4. Know when it is allowed to do anything, based on date, time, and context
  5. Know how to tell its owner EVERYTHING that is happening to it AS it happens – not “after the fact” as we do today – but live intelligence as it’s happening so we can see everything happening to the data
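
To make the five capabilities above concrete, here is a minimal Python sketch of a record that carries its own policy and reports every access attempt as it happens. It is an illustration only, not a description of any particular product: the names SmartRecord, DataPolicy and AccessRequest, the "read"/"deny" access levels, and the report callback are all hypothetical, and a real implementation would also need encryption and a trusted way to verify location, identity and device, which this sketch simply assumes.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Callable, Dict, Optional, Set


@dataclass
class AccessRequest:
    """Hypothetical description of one attempt to open the data."""
    user: str
    device: str
    location: str
    when: datetime


@dataclass
class DataPolicy:
    """Hypothetical policy that travels with the data."""
    allowed_locations: Set[str]   # 1. where the data is allowed to be
    allowed_devices: Set[str]     # 2. which devices it may be opened on
    user_access: Dict[str, str]   # 3. user -> access level, e.g. "read"
    window_start: time            # 4. earliest time of day access is allowed
    window_end: time              #    latest time of day access is allowed


class SmartRecord:
    """Data bundled with its policy and a live reporting channel (assumed)."""

    def __init__(self, payload: str, policy: DataPolicy,
                 report: Callable[[str], None]) -> None:
        self.payload = payload
        self.policy = policy
        self.report = report      # 5. called on every attempt, as it happens

    def decide(self, req: AccessRequest) -> str:
        """Return the access level for this request, or "deny"."""
        p = self.policy
        if req.location not in p.allowed_locations:
            return "deny"
        if req.device not in p.allowed_devices:
            return "deny"
        if not (p.window_start <= req.when.time() <= p.window_end):
            return "deny"
        # Unknown users fall through to "deny".
        return p.user_access.get(req.user, "deny")

    def access(self, req: AccessRequest) -> Optional[str]:
        """Report the attempt, then release the payload only if allowed."""
        decision = self.decide(req)
        self.report(
            f"{req.when.isoformat()} user={req.user} device={req.device} "
            f"location={req.location} decision={decision}"
        )
        return self.payload if decision != "deny" else None
```

In this sketch the report callback fires for denied attempts as well as allowed ones, which is the point of item 5: the owner sees what is happening to the data while it happens, rather than reconstructing it from logs afterwards.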

This is the panacea the IT industry has dreamed of for many, many years – for data to be intelligent, self-protecting, self-aware and self-reporting.

The good news – this is not a dream to be achieved – this is a reality today.  Data can be self-protecting, self-aware, intelligent, and self-reporting in near-real time.  It can control who accesses it, where they can access it, at what times, on which devices.  It can be completely visible to us no matter where it goes, on any device, no matter how far or how many times it is copied.  This kind of a fundamental shift is going to take new thinking, new models, new ways of architecting our technology to work with smart data.

I am going to be writing an ongoing series of blog posts on all of the points above – going into how intelligent data can enhance, augment, and possibly replace the pieces of the puzzle above.  By consolidating our security into the one common denominator across all of IT – data itself – we can change the way we do everything.
