Could Test Data Be a Liability Under the New GDPR Framework? [Guest Blog]

The GDPR (General Data Protection Regulation) is a recently ratified legal framework that introduces wide-ranging reforms to the use of personal data about EU citizens. Finalized on April 14, 2016, the reforms aim to give individuals control over the use of their personal data. The GDPR replaces the EU's 1995 Data Protection Directive, which also underpinned the International Safe Harbor framework that many US companies operated under, and compliance is required by May 25, 2018. The GDPR affects any organization, regardless of its location, that collects or has collected personal data about EU citizens. Some examples of its reforms include:

  • Consent: Specific, limited consent by the individual is required – users cannot be opted in by default;
  • Limited use: Data collected can only be for “legitimate interests” that must be disclosed; data can be kept only as long as needed, and used only by those that need it;
  • Data portability: EU citizens can request a copy of their personal data in a form usable by them and transmissible to another system;
  • The right to be forgotten: Consent must be easy to withdraw, whereupon the data should be erased;
  • Breach notification: Breaches must be reported within 72 hours to authorities and “without undue delay” to affected individuals;
  • Harsh penalties for non-compliance: Up to €20 million or 4 percent of the entity’s global annual turnover – whichever is greater.

If your organization does business in the EU, the GDPR has implications for many of your current IT functions. In addition to the data breach notification requirement, which could significantly impact your IT and security teams, another area of concern is test data management. A common practice is to copy production data to test environments as input for application testing. This practice must be revisited.

Test data copied from production is:

  • Subject to GDPR compliance: The use of personal data for application testing must be disclosed to users as a “legitimate interest,” consent must be obtained, and the data must be deleted when testing is finished.
  • Of limited value for testing: Production data only supports “happy path” testing with values representing those inputs an application was able to successfully process in the past (generally only 20 to 30 percent of possible values). Production data does not help at all with negative testing.
  • A liability: To reduce risk associated with breaches, pseudonymization of data is highly recommended. This process replaces identifiable information with artificially derived values. Identifiable information extends well beyond unique identifiers like Social Security and NHS numbers to any combination of values that can identify a person. For example, a name, birthdate, and postal code can uniquely identify a person, so those values must be replaced with synthetic ones.
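
As a rough illustration of the pseudonymization described in the last bullet, the following Python sketch replaces a name, birthdate, and postal code with artificially derived tokens. Keyed hashing (HMAC) is only one possible technique and is used here as an assumption; the field names, the `SECRET_KEY` value, and the helper functions are hypothetical and not taken from any particular product.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be stored separately from the test data.
SECRET_KEY = b"replace-with-a-real-secret"

# Assumed set of fields that, in combination, could identify a person.
IDENTIFYING_FIELDS = ("name", "birthdate", "postal_code")


def pseudonym(value: str, field: str) -> str:
    """Derive a stable, artificial token for one field value using keyed hashing."""
    message = f"{field}:{value}".encode("utf-8")
    digest = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return f"{field}_{digest[:12]}"


def pseudonymize(record: dict) -> dict:
    """Return a copy of the record with identifying fields replaced by tokens."""
    return {
        key: pseudonym(str(value), key) if key in IDENTIFYING_FIELDS else value
        for key, value in record.items()
    }


record = {
    "name": "Jane Doe",
    "birthdate": "1980-02-29",
    "postal_code": "SW1A 1AA",
    "plan": "premium",
}
masked = pseudonymize(record)
```

Because the same input always yields the same token, records that refer to the same person still line up across tables, while non-identifying fields such as `plan` pass through unchanged.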

Production data’s value has become suspect for application testing. As test values must be synthetically created anyway, why not just go all the way with a completely synthetic approach to test data?

Synthetically created test data has many merits. Synthetic or anonymized data is not subject to the GDPR, requires no access to production databases, can cover 100 percent of possible input values, and, most importantly for the sake of customers and your business, is of no value if it is stolen.
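
To make the idea concrete, here is a minimal Python sketch of synthetic test data that mixes randomly generated valid records with deliberately invalid ones for negative testing. The field rules (`NAME_MAX_LENGTH`, `AGE_RANGE`, a five-digit postal code) describe an imaginary sign-up form and are assumptions for illustration only.

```python
import random
import string

# Hypothetical field rules for an imaginary sign-up form; not taken from any real system.
NAME_MAX_LENGTH = 50
AGE_RANGE = (18, 120)


def synthetic_customer(rng: random.Random) -> dict:
    """Build one valid-looking record from random values, not from real people."""
    name_length = rng.randint(1, NAME_MAX_LENGTH)
    return {
        "name": "".join(rng.choice(string.ascii_letters) for _ in range(name_length)),
        "age": rng.randint(*AGE_RANGE),
        "postal_code": "".join(rng.choice(string.digits) for _ in range(5)),
    }


def negative_cases() -> list:
    """Hand-picked invalid inputs that production data would rarely contain."""
    return [
        {"name": "", "age": 30, "postal_code": "12345"},  # empty name
        {"name": "x" * (NAME_MAX_LENGTH + 1), "age": 30, "postal_code": "12345"},  # name too long
        {"name": "Test", "age": AGE_RANGE[0] - 1, "postal_code": "12345"},  # below minimum age
        {"name": "Test", "age": AGE_RANGE[1] + 1, "postal_code": "12345"},  # above maximum age
        {"name": "Test", "age": 30, "postal_code": "12A4"},  # malformed postal code
    ]


rng = random.Random(42)  # fixed seed so test runs are reproducible
test_data = [synthetic_customer(rng) for _ in range(100)] + negative_cases()
```

None of these values originate from a production database, and the negative cases exercise inputs that an application would have rejected and therefore never stored.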

This is just one example of the far-reaching implications of data privacy reforms introduced by the GDPR. If your organization collects personal data, you should give some thought to what the GDPR will mean for the way you manage that data. In some cases, synthetically created data can help you avoid any compliance issues.

Guest blogger Jody Hunt has worked in the software industry for more than 20 years and is currently a Solution Strategist for Continuous Delivery at CA Technologies. He has helped customers in their DevOps journey since 2010. Learn more about GDPR and request a free e-book.