In today’s digital-first landscape, data has become the new carbon. In its raw form, data is meaningless. Like carbon, raw data needs to undergo a change, to crystallize, before becoming a diamond.

For organizations, raw data becomes meaningful only after analysis. For threat actors, however, the difference between raw data and information does not matter: they can leverage any byte of data for a wide range of malicious purposes.

What Is DLP?

Data loss prevention (DLP) is a set of tools and practices that protect organizations from data loss, modification, or theft. It starts with classifying and taking inventory of data and then creating and implementing data protection policies. These policies are often based on established compliance regulations, including HIPAA, PCI-DSS, and GDPR. 

Implementing data loss prevention policies ensures security teams are aware of vulnerabilities and violations. It also enables protective and remediation actions to be taken in a timely manner.

Most DLP strategies are focused on the following data types: 

  • Personal information — personally identifiable information (PII), payment card information (PCI), protected health information (PHI), and other data falling under compliance regulations.
  • Intellectual property — research and development data, white papers, and source code.
  • Corporate data — business intelligence, financial data, employee information, and information related to mergers or acquisitions.
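As a rough illustration of the classification step, the sketch below labels a piece of text with the three data types listed above. The patterns, keyword lists, and label names are hypothetical placeholders chosen for this example; commercial DLP tools rely on much richer detection rules.

```python
import re

# Hypothetical detection rules, for illustration only.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")       # U.S. Social Security number layout
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")     # 13 to 16 digits, optional separators
IP_KEYWORDS = ("confidential", "proprietary", "patent pending")
CORPORATE_KEYWORDS = ("merger", "acquisition", "quarterly forecast")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum used by payment cards."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def classify(text: str) -> set:
    """Return the set of data-type labels detected in `text`."""
    labels = set()
    has_card = any(luhn_valid(m.group()) for m in CARD_PATTERN.finditer(text))
    if SSN_PATTERN.search(text) or has_card:
        labels.add("personal_information")
    lowered = text.lower()
    if any(keyword in lowered for keyword in IP_KEYWORDS):
        labels.add("intellectual_property")
    if any(keyword in lowered for keyword in CORPORATE_KEYWORDS):
        labels.add("corporate_data")
    return labels


if __name__ == "__main__":
    print(classify("Customer card 4111 1111 1111 1111 on file"))
    print(classify("Proprietary source code for the merger tool"))
```

The Luhn check is there to reduce false positives: a random run of 16 digits is unlikely to pass it, while well-formed card numbers do.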

Why Is DLP Important? 

DLP practices secure data and can help organizations prevent losses, maintain or build their reputation, and avoid regulatory fines. Below are some statistics that highlight the importance of DLP strategies.

  • According to a report by Varonis, 22% of folders are unprotected. This may be due to misconfiguration or oversight. Either way, the end result is that data is freely available to attackers, particularly when folders are stored on cloud resources. 
  • Backblaze reported that 1.89% of hard drives failed in 2019. While this may seem like a small number, hard drives can contain terabytes of data — without backups or replication in place, all of this data can be permanently lost.
  • A report from Verizon revealed that 28% of data breaches involve malware. It is incredibly common for users to accidentally install malware on a device. Often, this happens when malware is included in a seemingly legitimate file or email. Once installed, these programs can grant attackers full access to your systems and data.
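To see why the 1.89% drive failure figure is less reassuring than it looks, it helps to work out the odds for a whole fleet of drives. The sketch below assumes, as a simplification, that 1.89% is each drive's chance of failing within a year and that drives fail independently of one another. The function name and fleet sizes are made up for this example.

```python
ANNUAL_FAILURE_RATE = 0.0189  # the 2019 figure quoted above, treated as a per-drive yearly probability


def probability_of_any_failure(drive_count: int, failure_rate: float = ANNUAL_FAILURE_RATE) -> float:
    """Chance that at least one of `drive_count` drives fails, assuming independent failures."""
    if drive_count < 0:
        raise ValueError("drive_count must be non-negative")
    # P(at least one failure) = 1 - P(no drive fails)
    return 1 - (1 - failure_rate) ** drive_count


if __name__ == "__main__":
    for drives in (1, 10, 100):
        chance = probability_of_any_failure(drives)
        print(f"{drives:>3} drives: {chance:.1%} chance of at least one failure")
```

Under these assumptions, the chance of losing at least one drive in a year is roughly 17% for 10 drives and about 85% for 100 drives, which is why backups and replication matter even when individual failures are rare.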

The Origin of Data Breaches

Data breaches have been occurring since computers first started appearing in workplaces and began to markedly increase in the 1980s. For example, in 1984, the company now known as Experian had 90 million records stolen in an attack. In 1986, there was another incident resulting in 16 million records being stolen from Revenue Canada.

Into the ’90s and early 2000s, the number of breaches continued to increase, but so did public awareness. Cybertheft became a relatively common media headline, and people and organizations began pushing for stronger data protection policies.

During this time, lawmakers also began to take notice. For example, in 2003, California passed the first law protecting the privacy of consumers’ personal information. This law prompted many organizations that previously had lax policies, or none at all, to begin focusing on protective measures.

Despite this sudden growth of attention, as of 2012, only 46 U.S. states had laws regulating how to handle data breaches. It wasn’t until 2018 that all 50 states included protections for personal, private information.

The Evolution of Data Protection

Cybercriminals continuously alter their attack methods because both the type of data being stored and the way it is stored keep changing. As a result, data protection methods have evolved as well, adapting to new conditions and threats.

Pretty Good Privacy

Pretty Good Privacy (PGP) was the first real attempt at standardizing data protection strategies. It involved the use of encryption to secure the privacy of data and prevent the exposure of sensitive information. This worked well when a user was the only one accessing the information, but it made intentionally sharing data with others difficult or impossible.

Once encryption keys are shared, the original user loses control over the data, and responsibility for security is shared by all parties involved. Because of this, PGP was unable to truly secure data and was never designed to prevent loss.

Information Rights Management

Information rights management (IRM) was another legacy attempt at data protection. However, this attempt was limited to a small set of applications, namely those from Microsoft. IRM prevents users from performing actions in documents and files that could lead to data loss, for example by restricting copy/paste actions, editing or saving capabilities, and printing.


Data Loss Prevention

Currently, DLP is the default strategy for protecting data. This strategy relies on an organization's ability to classify data and apply protections appropriate to that classification. As DLP grew in popularity, vendors began making solutions to help organizations manage these tasks.
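One way to picture "protections appropriate to that classification" is a lookup from classification label to handling rules. The sketch below is only an illustration: the labels, the `Protection` fields, and the retention values are assumptions made up for this example, not taken from any standard or product.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Protection:
    encrypt_at_rest: bool
    allow_external_sharing: bool
    retention_days: int


# Hypothetical labels, ordered from least to most sensitive.
SENSITIVITY_ORDER = ["public", "internal", "confidential", "restricted"]

# Hypothetical handling rules for each label.
POLICIES = {
    "public": Protection(encrypt_at_rest=False, allow_external_sharing=True, retention_days=365),
    "internal": Protection(encrypt_at_rest=True, allow_external_sharing=False, retention_days=365),
    "confidential": Protection(encrypt_at_rest=True, allow_external_sharing=False, retention_days=180),
    "restricted": Protection(encrypt_at_rest=True, allow_external_sharing=False, retention_days=90),
}


def protection_for(labels: Iterable[str]) -> Protection:
    """Return the rules of the most sensitive known label.

    Data with no recognized label gets the strictest rules (fail closed).
    """
    known = [label for label in labels if label in POLICIES]
    if not known:
        return POLICIES["restricted"]
    strictest = max(known, key=SENSITIVITY_ORDER.index)
    return POLICIES[strictest]


if __name__ == "__main__":
    print(protection_for(["internal", "confidential"]))
    print(protection_for([]))
```

Treating unlabeled data as the most sensitive category is a design choice in this sketch. It trades convenience for safety when classification is incomplete.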

Generally, DLP solutions are based on one of two approaches.

  1. Traditional — This approach provides coverage for data across components, including in the cloud, at endpoints, on network gateways, and in storage. 
  2. Agent — This approach uses kernel-level agents on endpoints to monitor network traffic. These agents detect policy breaches, report suspicious activity through notifications/alerts, and enforce policy restrictions. 
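The detect, report, and enforce cycle of the agent approach can be sketched as a small decision function. This models only the policy logic. It does not intercept real network traffic or run at kernel level, and every name and threshold in it (`TransferEvent`, `TRUSTED_HOSTS`, the 50 MB alert size) is a hypothetical choice for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Action(Enum):
    ALLOW = "allow"
    ALERT = "alert"
    BLOCK = "block"


@dataclass(frozen=True)
class TransferEvent:
    user: str
    destination_host: str
    data_label: str
    size_bytes: int


# Hypothetical policy values, for illustration only.
TRUSTED_HOSTS = {"files.example.internal"}
BLOCKED_LABELS = {"restricted"}
ALERT_SIZE_BYTES = 50 * 1024 * 1024


def evaluate(event: TransferEvent) -> Action:
    """Detect: decide how an outbound transfer relates to policy."""
    if event.destination_host in TRUSTED_HOSTS:
        return Action.ALLOW
    if event.data_label in BLOCKED_LABELS:
        return Action.BLOCK
    if event.size_bytes >= ALERT_SIZE_BYTES:
        return Action.ALERT
    return Action.ALLOW


def handle(event: TransferEvent, alerts: List[str]) -> bool:
    """Report and enforce: log anything suspicious, return True if the transfer may proceed."""
    action = evaluate(event)
    if action is not Action.ALLOW:
        alerts.append(
            f"{action.value}: {event.user} -> {event.destination_host} "
            f"({event.data_label}, {event.size_bytes} bytes)"
        )
    return action is not Action.BLOCK


if __name__ == "__main__":
    log: List[str] = []
    event = TransferEvent("alice", "upload.example.com", "restricted", 2048)
    print(handle(event, log), log)
```

Separating `evaluate` from `handle` keeps the policy decision testable on its own, apart from the alerting and blocking side effects.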

Modern DLP solutions also vary significantly in the focus of their protections. Many industries have specialized restrictions or data types that need protection, and the solutions must account for these differences. For example, as DLP solutions have evolved, strategies have segmented to account for the distinct needs of health care, intellectual property, and financial data.


Conclusion

Data is a resource, and resources need to be protected from accidental loss, deliberate exfiltration, corporate espionage, ransomware attacks, and hardware failure. Data can be damaged, corrupted, stolen, and deleted by insider or outsider threats.

It’s imperative for organizations to protect their data by implementing appropriate security measures. In the past, organizations used PGP standards to encrypt and protect data. However, as technologies advanced and attacks became more sophisticated, PGP was no longer enough.

Today, organizations implement DLP strategies and tools that allow them to scale and adapt quickly. DLP can protect data across components and environments, and it can deploy kernel-level agents on endpoints to monitor network traffic. But the evolution isn’t over. As organizations adopt more dynamic infrastructures and the amount of stored data continues to grow, classification-based strategies alone are not enough. Distributed data use, real-time information streams, and changing regulations are presenting new DLP challenges that must be overcome. Incorporating machine learning into DLP solutions is just one example of how these challenges are being addressed.