Risk of data breach: Tips for mitigating risk now that the threat has shifted from external to internal
Posted by: Jenny Laurello
Data privacy, Data security, HIPAA, HITECH, PHI, Risk assessment
Guest Post: Jay Hill, Director of Product Management and Marketing, Informatica
Your computer network is protected by a firewall including endpoint protection against malware and other types of attacks. Antivirus software is in place to prevent and detect viruses, worms and Trojan horses. Your production databases are encrypted and all applications have role-based security to limit access to protected health information (PHI). You have greatly minimized the threat of outside forces and protected your organization from malicious attacks.
Now that your organization has gone paperless (or is about to) and has switched over to an electronic health records system, it seems like you have all the protection required in place. Job well done.
Time to relax? Don’t be so certain. There is a hidden exposure that is often not thought about: real data used in nonproduction environments or extracted for secondary use in other applications.
All applications need to be tested before being rolled out to production. The health care industry is no different, and testing is perhaps even more important here. There is little room for error when dealing with someone’s health, so the applications used to support clinical decisions and workflows had better be 100% sound.
The best way to test critical applications is to simulate what they will face in the real world. In health care, this requires real patient data that mimics patient encounters, care delivery, and administrative processes. But the safeguards you have in place in production (or plan to put in place) will not be there in development and testing environments. Contractors, developers and offshore staff will have access to many (if not all) functions within the application. These development and testing systems are ripe for attack or misuse, whether malicious or accidental.
Production systems are one of the busiest sources of custom extracts, and these extracts are often not tracked or protected like other data. A quick glance through http://datalossdb.org shows that health care organizations are most affected by the loss or theft of media, drives and laptops, which accounts for roughly 65% of affected lives. The most telling conclusion from this data is that health care organizations’ biggest risk of a HITECH privacy breach comes from within, not from afar.
The nature of these nonproduction data environments forces you to think differently about how to combat these threats. As you might have done in your production environments, the first step I recommend is to designate a “champion” for managing the creation of, and access to, test data. Without this important first step, no one group may step up and take responsibility, leaving sensitive data scattered around the organization just waiting for disaster.
Another tip is to get a handle on the scope by defining a list of sensitive fields. For health care, the list of 18 identifiers designated as PHI under HIPAA is well documented and is referenced again by the HITECH Act. This is the best place to start. Each of these sensitive fields needs to be dealt with in the test and development systems. You could simply substitute constants or nullify the values, but that removes most of the business value and increases the chance that application flaws slip through testing into production. The best approach is to “mask” the sensitive data using sophisticated and easy-to-use data masking software to keep the data as realistic as possible while still drastically reducing the risk of a HITECH privacy breach. For each sensitive field, such as first name, last name or phone number, you will want to create a standard data masking policy so that you don’t have to create the routine or script over again for each database or application.
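To make the idea of a reusable, per-field masking policy concrete, here is a minimal Python sketch. It is an illustration only and not how any particular masking product works: the key, the substitution lists, the field names and the helper functions are all assumptions made up for this example. It uses a keyed hash so that the same input always masks to the same output, which keeps masked data consistent across databases.

```python
import hashlib
import hmac

# Hypothetical secret; a real deployment would keep it out of source control.
MASKING_KEY = b"example-key-not-for-real-use"

# Small illustrative substitution lists (assumptions, not a real data set).
FIRST_NAMES = ["Alex", "Jordan", "Morgan", "Riley", "Taylor"]
LAST_NAMES = ["Baker", "Carter", "Hayes", "Nolan", "Reed"]


def _digest(field, value):
    """Keyed hash: the same field and value always give the same bytes."""
    message = f"{field}:{value}".encode()
    return hmac.new(MASKING_KEY, message, hashlib.sha256).digest()


def mask_from_list(field, value, choices):
    """Swap the value for a realistic entry picked deterministically."""
    index = int.from_bytes(_digest(field, value)[:4], "big") % len(choices)
    return choices[index]


def mask_phone(value):
    """Replace every digit but keep punctuation so the format is preserved."""
    digest = _digest("phone", value)
    masked = []
    position = 0
    for char in value:
        if char.isdigit():
            masked.append(str(digest[position % len(digest)] % 10))
            position += 1
        else:
            masked.append(char)
    return "".join(masked)


# One standard policy per sensitive field, defined once and reused everywhere.
POLICIES = {
    "first_name": lambda v: mask_from_list("first_name", v, FIRST_NAMES),
    "last_name": lambda v: mask_from_list("last_name", v, LAST_NAMES),
    "phone": mask_phone,
}


def mask_record(record):
    """Apply the matching policy to each sensitive field; pass others through."""
    return {
        key: POLICIES[key](value) if key in POLICIES and value is not None else value
        for key, value in record.items()
    }
```

Calling mask_record on a row such as {"first_name": "Maria", "phone": "(555) 123-4567", "diagnosis_code": "E11.9"} would replace the name and the phone digits while leaving the diagnosis code and the phone formatting untouched, so the test data stays realistic.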
You now have the inventory you need to mask your test and development environments. You’ve made some progress, but there could be many databases with various structures that you need to address. Locating each sensitive column could be very tedious and time-consuming work. Luckily, there is “data discovery” or “data explorer” software you can use to automatically locate sensitive data patterns or column names throughout your IT infrastructure, helping you decrease your overall risk.
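As a rough illustration of what discovery software does under the hood, the following Python sketch flags columns whose names or sampled values look sensitive. The name hints, the value patterns, the threshold and the function name are all assumptions chosen for this example; a commercial tool would cover far more fields and formats.

```python
import re

# Hypothetical column-name hints; a real inventory would cover every PHI field.
COLUMN_NAME_HINTS = re.compile(
    r"(ssn|social|first_?name|last_?name|phone|dob|birth|email|mrn)",
    re.IGNORECASE,
)

# Hypothetical value patterns for a few common US-style formats.
VALUE_PATTERNS = {
    "ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "phone": re.compile(r"^\(?\d{3}\)?[-. ]?\d{3}[-.]\d{4}$"),
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
}


def find_sensitive_columns(table_name, columns, sample_rows, threshold=0.8):
    """Return (table, column, reasons) for each column that looks sensitive.

    columns is a list of column names and sample_rows is a list of tuples
    in the same order, for example a small sample fetched from the table.
    """
    findings = []
    for index, column in enumerate(columns):
        reasons = []
        if COLUMN_NAME_HINTS.search(column):
            reasons.append("column name")
        values = [str(row[index]) for row in sample_rows if row[index] is not None]
        for label, pattern in VALUE_PATTERNS.items():
            if not values:
                continue
            share = sum(bool(pattern.match(v)) for v in values) / len(values)
            if share >= threshold:
                reasons.append(f"values look like {label}")
        if reasons:
            findings.append((table_name, column, reasons))
    return findings
```

Checking values as well as names matters because sensitive data often hides in generically named columns, such as a phone number stored in a field called "contact".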
Once you have located the sensitive data and decided which data masking policy applies to each field, put a process in place to ensure sensitive data is masked either as it is copied from production or immediately afterward, before the data is released for wider use. Deploying these policies for a wide variety of purposes, such as clinical research, compliance reporting, and application implementation and modernization, helps drive value across the organization.
The final tip is to run an audit after the data masking is complete, using automated data validation software. This validation step ought to be a standard part of the sign-off before the data is handed over to end users.
Sensitive data in nonproduction systems, and access to it, are often overlooked. Putting processes and enterprise-grade data privacy software in place will reduce the risk that a HITECH breach comes knocking on your door.
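A validation step of this kind can be very simple in principle. The Python sketch below is one possible check, under the assumption that you can read a sample of production rows and the corresponding masked rows as dictionaries; the function name and row layout are hypothetical. It confirms that no rows were lost and that no production value survives in a sensitive field.

```python
def validate_masking(original_rows, masked_rows, sensitive_fields):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []

    # Masking should change values, not add or drop rows.
    if len(original_rows) != len(masked_rows):
        problems.append(
            f"row count changed: {len(original_rows)} -> {len(masked_rows)}"
        )

    for field in sensitive_fields:
        # Every non-empty production value for this field.
        originals = {row[field] for row in original_rows if row.get(field)}
        # Masked rows that still carry one of those production values.
        leaked = sum(1 for row in masked_rows if row.get(field) in originals)
        if leaked:
            problems.append(
                f"{field}: {leaked} masked row(s) still contain a production value"
            )

    return problems
```

A check this strict can raise false alarms when masking substitutes realistic values, since a masked last name may coincide with a real one elsewhere in the table. In practice you would review the flagged rows before sign-off rather than treat every hit as a leak.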
This post was written by Jay Hill, Director of Product Management and Marketing at Informatica, who is responsible for driving, messaging and fulfillment of test data management focused products such as Data Subset and Data Masking. Please visit www.informatica.com for more information.