How dark data puts your business in jeopardy

Eric Popiel, Cybersecurity Evangelist, CyberRes

Long considered a business asset that should be kept no matter what, data has become a liability for many companies.

Late last year, German authorities levied a fine equivalent to $17 million against real estate company Deutsche Wohnen SE for retaining data on tenants—such as pay slips, references, and other credit information—when the data was no longer a business necessity, and for not having a proper data-retention schedule. The fine followed another penalty of almost $300,000 levied by Danish privacy authorities against Taxa 4x35 for keeping data—former customer phone numbers—past a reasonable retention limit.

The incidents underscore that regulators have become more confident in fining companies for over-retaining data and not having sufficient data management protocols in place. Companies should expect the trend to continue—enforcement of the General Data Protection Regulation has ramped up in Europe, and US companies are facing a similar impact from the California Consumer Privacy Act (CCPA) and legislation being considered by other states.

According to Privacy Affairs, more than 410 fines have been levied since GDPR came into force, totaling $208 million. While data lacking appropriate security accounted for nearly 40% of GDPR fines, unprotected and over-retained data accounted for more than a quarter of the overall fines levied under the regulation, according to research.

The trends show that data minimization—often considered a "should do" chore—is now a necessity. Dark data has grown, and companies are putting their businesses at risk if they do not tame data sprawl soon.

Remote work has made the problem harder. Companies without proper controls will likely find that users have saved data on systems that are prone to leakage. And the pandemic has led more and more people to take shortcuts around compliance and security measures.

Here's why your team should embark on a data minimization effort as soon as possible, or risk being made an example of by privacy regulators.

More regulation is coming

Companies should be ready for more regulations, especially as states adopt new citizen-focused privacy regimes. Already, Nevada has updated its privacy law, and Virginia, Florida, New Hampshire, Washington, and Illinois have all proposed legislation that will strengthen consumer privacy rights. Massachusetts, New York, Hawaii, Maryland, and North Dakota have all passed legislation that mimics California's template to varying extents.

One major change is that personally identifiable information (PII) will be defined probabilistically by many states. A seminal paper published by Latanya Sweeney of Carnegie Mellon University in 2000 found that 87% of US citizens could be identified by three pieces of common information: gender, ZIP code, and date of birth. Companies with customers in states adopting probabilistic definitions of PII should take more significant steps to analyze any retained data.
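To make the idea of probabilistic identification concrete, here is a minimal Python sketch that measures how many records in a dataset are unique on a set of quasi-identifiers. The sample records, the field names, and the `unique_fraction` helper are illustrative assumptions, not part of Sweeney's study or of any statute.

```python
from collections import Counter

# Hypothetical sample records; field names are illustrative assumptions.
records = [
    {"gender": "F", "zip": "02139", "dob": "1980-04-12"},
    {"gender": "F", "zip": "02139", "dob": "1980-04-12"},
    {"gender": "M", "zip": "02139", "dob": "1975-11-02"},
    {"gender": "F", "zip": "94105", "dob": "1991-07-30"},
]

QUASI_IDENTIFIERS = ("gender", "zip", "dob")


def unique_fraction(rows, keys=QUASI_IDENTIFIERS):
    """Return the share of rows whose combination of `keys` appears exactly once."""
    if not rows:
        return 0.0
    # Count how often each combination of quasi-identifier values occurs.
    combos = Counter(tuple(row[k] for k in keys) for row in rows)
    # A row is "unique" if no other row shares its combination.
    unique_rows = sum(
        1 for row in rows if combos[tuple(row[k] for k in keys)] == 1
    )
    return unique_rows / len(rows)


fraction = unique_fraction(records)
print(f"{fraction:.0%} of records are unique on {QUASI_IDENTIFIERS}")
```

A high fraction suggests that the retained fields could single out individuals even though no single field is an identifier on its own.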

Only keep what you need

Companies can no longer hold to the tenet that an abundance of caution means keeping everything, on the theory that any type of data might someday become a business advantage.

Now an abundance of caution means deleting anything that is not immediately useful. Even if a particular piece of data is not considered PII by itself, a few data elements taken together could identify an individual and so create risk for the company. For that reason, companies should classify data by its importance to the business and only retain data that is significant. Historically, data retention has been a cost and complexity issue, but in the modern business world, it is a cost, complexity, and risk issue.

To begin the data minimization process, companies should first classify data retained on devices and servers. Categorizing data into broad classes—such as intellectual property, financial data, sensitive personal data, and health-related information—can help determine what is necessary for the business and what can be deleted over time. Seeking out data that has been retained for long periods and then questioning its existence can help companies focus their efforts.
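The classification and review step could be sketched roughly as follows in Python. The category names mirror the broad classes mentioned above, but the retention limits, the `DataAsset` type, the `review_candidates` function, and the sample inventory are all hypothetical; real limits depend on the regulations and business needs that apply to each company.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention limits per data class, in days (assumed values).
RETENTION_DAYS = {
    "intellectual_property": 3650,
    "financial": 2555,
    "sensitive_personal": 365,
    "health": 365,
}


@dataclass
class DataAsset:
    name: str
    category: str
    last_used: date


def review_candidates(assets, today):
    """Return assets that are unclassified or unused beyond their class limit."""
    flagged = []
    for asset in assets:
        limit = RETENTION_DAYS.get(asset.category)
        # Data with no known class is flagged too: it cannot be justified yet.
        if limit is None or today - asset.last_used > timedelta(days=limit):
            flagged.append(asset)
    return flagged


# Illustrative inventory; file names and dates are made up.
inventory = [
    DataAsset("tenant_pay_slips.csv", "sensitive_personal", date(2017, 3, 1)),
    DataAsset("q3_ledger.xlsx", "financial", date(2020, 9, 30)),
    DataAsset("old_export.bin", "unknown", date(2019, 1, 15)),
]
flagged = review_candidates(inventory, today=date(2021, 1, 1))
for asset in flagged:
    print(f"Review or delete: {asset.name} ({asset.category})")
```

Flagging an asset here only queues it for a human decision; whether to delete it still depends on any legal retention obligations.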

Regulations require some industries to preserve data far past its direct value to the company. Such data must be protected along with that required for the business, and a variety of methods exist to achieve this. At a minimum, companies should encrypt the data at rest to limit information leakage in a breach. However, format-preserving data protection techniques such as tokenization and Format-Preserving Encryption can provide deeper protection.

These technologies replace data with values of the same type and length, allowing analytics and other operations to be performed on the data in its protected form—that is, the protection is preserved while the data is being processed, rather than the data being automatically decrypted before every use. Minimizing instances of unprotected data in the processing flow increases protection and minimizes risk. Similarly, emerging technologies such as homomorphic and privacy-preserving encryption can also allow the original data to remain protected during processing.
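As a rough illustration of format preservation, the Python sketch below implements a toy vault-based tokenizer that swaps each digit for a random digit and leaves separators in place. It is an assumption-laden teaching example: the `TokenVault` class is hypothetical, it keeps its mapping in memory, and it is not Format-Preserving Encryption or any vendor's product.

```python
import secrets

DIGITS = "0123456789"


class TokenVault:
    """Toy in-memory vault mapping values to same-format random tokens."""

    def __init__(self):
        self._to_token = {}
        self._to_value = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so equal values stay equal when protected.
        if value in self._to_token:
            return self._to_token[value]
        if not any(ch in DIGITS for ch in value):
            raise ValueError("this sketch only tokenizes values containing digits")
        while True:
            # Replace each digit with a random digit; keep separators as-is.
            token = "".join(
                secrets.choice(DIGITS) if ch in DIGITS else ch for ch in value
            )
            # Retry on the rare collision with the input or an issued token.
            if token != value and token not in self._to_value:
                break
        self._to_token[value] = token
        self._to_value[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # Only a holder of the vault can recover the original value.
        return self._to_value[token]


vault = TokenVault()
original = "123-45-6789"  # made-up identifier in a familiar layout
token = vault.tokenize(original)
print(token)  # same length and separators, different digits
```

Because the same input always maps to the same token, joins and counts can run on the protected column without access to the vault.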

Focus on the core of your business

Most of all, data retention should be defined by whether the existence of the information helps the core business or poses an existential risk to the company.

Clothing retailer H&M forgot to protect its business. In 2019, a breach of the company revealed that it had collected sensitive data on its employees, stemming from a practice where managers logged any details about employees that came up during conversations, including information on health, religious faith, and family issues. After the revelations surfaced, German privacy authorities fined the company $41 million.

Companies also have to understand data retention regulations across all jurisdictions in which they do business. Many US companies have European customers, which makes them subject to GDPR in addition to US laws.

Once, businesses ran few risks in keeping data around for long periods of time. Now, however, the mere existence of data can pose a regulatory risk. It is time for companies to stop putting off the chore of reviewing data and put infrastructure in place to minimize the risk. At a minimum, companies should classify their data assets, delete unnecessary data that would pose a financial or reputational risk in a breach, and use encryption and tokenization to protect the information at rest and during use.

Moderation is the new watchword for data, and wise companies will heed it.
