Data protection: patient confidentiality in the age of AI

Data Protection Alert


Thinking inside the box: the intelligent diagnostic device

In our previous article, we considered a typical collaboration between a healthcare institution and a developer of a digital diagnostic system using machine learning or some other form of artificial intelligence. The currency of that commercial relationship is data.

The healthcare institution has a vast store of patient data, from first presentation to diagnosis, treatment and eventual outcomes. The AI solution developer needs that data to first train and then test its machine. The ultimate goal is that, after ingesting all of that data, the machine will be equal to or better than human doctors at diagnosing conditions or recommending a course of treatment.

However high-minded the aim of the two organizations may be, any arrangement that sees a healthcare institution share any part of its patients' data with an outside developer needs to take full account of relevant data and healthcare records regulations, or the consequences can be severe.

Not just a European problem: global privacy regulation

Depending upon where in the world the collaboration takes place, the data sharing at the heart of any collaboration between healthcare providers and AI system builders might require consideration of EU data protection laws (including the impending GDPR), compliance with the Health Insurance Portability and Accountability Act (HIPAA) in the US and a myriad of local equivalents in other jurisdictions.

Within the EU, a healthcare provider's capacity to share patient data that constitutes "personal data" (ie, data that can be used to identify a living individual) with an AI system builder for any purpose not connected with the "direct care" of a patient is severely constrained. Most medical data will fall within the definition of "personal data" for these purposes. On top of that, any personal data which concerns a person's physical or mental health is considered "sensitive personal data" (or a "special category of personal data"), and processing can only take place in more limited circumstances.

I'd know that kidney anywhere: diagnostic images as personal data

Even if the most basic identifying information (such as name or patient number) is removed, there will almost always be enough detail in patient records to tie that data back to an individual. X-rays, scans and other medical images are essentially no different from photographs from a data protection perspective. An expert may recognize an individual quite readily from scan data, in the same way that a photograph immediately identifies an individual to people who know them. Nor is scan data useful for identification only to experts: even if most people would not be able to recognize an individual based on a CAT-scan, that data can be used to build a virtual 3D model from which it may be much easier for even the untrained eye to identify the scan as being of a particular person.

Sharing past patient data (records, scans, outcomes) would have to be justified on one or more of the grounds provided for in relevant law. While the current EU regime based on the 1995 Directive already sets out controls (eg, from a UK perspective, processing of "sensitive personal data" must comply with the requirements in Schedule 3 to the Data Protection Act 1998), from May 2018 the General Data Protection Regulation (GDPR) will update and harmonize the approach across the EU. Articles 9(2)(h) and 9(2)(i) of the GDPR do allow for processing of sensitive personal data in connection with diagnosis, treatment and where "[n]ecessary for reasons of public interest in the area of public health, such as protecting against serious cross-border threats to health or ensuring high standards of healthcare and of medicinal products or medical devices."

AI system builders and healthcare providers need to tread carefully when relying on Articles 9(2)(h) and 9(2)(i) to justify data sharing. These provisions are likely to be interpreted narrowly by the relevant authorities.

On a related question, the Information Commissioner's Office (ICO) in the UK has just ruled on the data sharing arrangements between the Royal Free Hospital in London and Google DeepMind. The ICO determined that the Royal Free had failed to comply with the Data Protection Act in that case. The arrangement involved the transfer of 1.6 million patient records for the purposes of developing and testing an AI application to provide earlier detection of kidney disease. While the purpose of the transfer may be laudable, the ICO found several shortcomings in how patient data was handled and determined that patients were not adequately informed that their data would be used as part of the test. This decision underscores the importance of getting it right when it comes to collecting and sharing data as part of these collaboration projects.

Next steps

Healthcare institutions already hold veritable treasure troves of data, but very little of it will have been collected with an eye to any consents that might be required for sharing that data with outside AI systems developers. The Royal Free / DeepMind ruling underscores that such data cannot simply be reused without further thought.

In some instances, pseudonymization or differential privacy techniques that allow statistically valid conclusions to be drawn from datasets without any individual within that dataset being identifiable might be used to avoid the issue. However, in many instances, the real value to AI solutions developers comes from tracking individual cases from presentation through diagnosis and treatment to outcomes.
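By way of illustration only, the differential privacy idea mentioned above can be sketched in a few lines of Python. This is a minimal sketch of the standard Laplace mechanism applied to a single counting query, not a description of any system used in the collaborations discussed in this article. The function names (laplace_noise, private_count), the toy patient records and the chosen epsilon values are all hypothetical, and a real deployment would also need to budget privacy loss across repeated queries and take legal advice on whether the output still counts as personal data.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy `predicate`.

    Adding or removing one patient changes a count by at most 1, so the
    query has sensitivity 1. Laplace noise with scale 1/epsilon then gives
    epsilon-differential privacy for this single query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical toy dataset: no real patient data.
patients = [
    {"age": 71, "kidney_disease": True},
    {"age": 45, "kidney_disease": False},
    {"age": 63, "kidney_disease": True},
    {"age": 58, "kidney_disease": False},
    {"age": 80, "kidney_disease": True},
]

# Smaller epsilon means more noise and stronger privacy.
noisy_total = private_count(
    patients, lambda p: p["kidney_disease"], epsilon=0.5, rng=random.Random(0)
)
print(f"Noisy count of kidney disease cases: {noisy_total:.2f}")
```

The published figure is close to the true count on average, but no single patient's presence or absence can be confidently inferred from it. The trade-off is the one the surrounding text describes: aggregate statistics survive, while the ability to follow an individual case from presentation to outcome does not.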

Healthcare providers need to start thinking about this issue now, as they engage with existing and new patients. Seeking appropriate consents to share data for currently envisaged or likely future collaborations will start to unlock the social, scientific and economic value of these datasets. If done properly, these collaborations will pay dividends for future generations, who can benefit from the new techniques and treatments developed.
