A lot of health IT experts are taking a fresh look at the field’s (abysmal) record in protecting patient data, following the shocking Cambridge Analytica revelations that cast a new and disturbing light on privacy practices in the computer field. Both Facebook and the many companies that would love to emulate its financial success are trying to draw general lessons that go beyond the oddities of the Cambridge Analytica mess. (Among other things, the mess involved a loose Facebook sharing policy that was tightened up a couple of years ago, and a purported “academic researcher” who apparently violated Facebook’s terms of service.)
I will devote this article to four lessons from the Facebook scandal that apply especially to health care data–or more correctly, four ways in which Cambridge Analytica reinforces principles that privacy advocates have known for years. Everybody recognizes that the risks modern data sharing practices pose to public life are hard, even intractable, and I will have to content myself with helping to define the issues, not presenting solutions. The lessons are:
There is no such thing as health data.
Consent is a meaningless concept.
The risks of disclosure go beyond individuals to affect the whole population.
Discrimination doesn’t have to be explicit or conscious.
The article will now lay out each concept, how the Facebook events reinforce it, and what it means for health care.
There is no such thing as health data
To be more precise, I should say that there is no hard-and-fast distinction between health data, financial data, voting data, consumer data, or any other category you choose to define. Health care providers are enjoined by HIPAA and other laws to fiercely protect information about diagnoses, medications, and other aspects of their patients’ lives. But a Facebook posting or a receipt from the supermarket can disclose that a person has a certain condition. The compute-intensive analytics that data brokers, marketers, and insurers apply with ever-growing sophistication are aimed at revealing these things. If the greatest impact on your life is that a pop-up ad for some product appears on your browser, count yourself lucky. You don’t know what else someone is doing with the information.
I feel a bit of sympathy for Facebook’s management, because few people anticipated that routine postings could identify ripe targets for fake news and inflammatory political messaging (except for the brilliant operatives who did that messaging). On the other hand, neither Facebook nor the US government acted fast enough to shut down the behavior and tell the public about it, once it was discovered.
HIPAA itself is notoriously limited. If someone can escape being classified as a health care provider or a provider’s business associate, they can collect data with abandon and do whatever they like (except in places such as the European Union, where laws require them to use the data only for the purpose they cited while collecting it). App developers consciously strive to define their products in such a way that they sidestep the dreaded HIPAA coverage. (I won’t even go into the weaknesses of HIPAA and subsequent laws, which fail to take modern data analysis into account.)
Consent is a meaningless concept
Even the European Union’s new regulation (the much-publicized General Data Protection Regulation, or GDPR) allows data collection to proceed after user consent. Of course, data must be collected for many purposes, such as payment and shipping at retail web sites. And the GDPR–following a long-established principle of consumer rights–requires further consent if the site collecting the data wants to use it beyond its original purpose. But it’s hard to predict what uses data will be put to, especially a couple of years in the future.
Privacy advocates have known, since the beginning of the ubiquitous “terms of service,” that few people read them before they press the Accept button. And this is a rational ignorance. Even if you read the tiresome and legalistic terms of service (I always do), you are unlikely to understand their implications. So the problem lies deeper than tedious verbiage: even the most sophisticated user cannot predict what’s going to happen to the data she consented to share.
The health care field has advanced farther than most by installing legal and regulatory barriers to sharing. We could do even better by storing all health data in a Personal Health Record (PHR) for each individual, instead of at the various doctors, pharmacies, and other institutions where it can be used for dubious purposes. But all use requires consent, and consent always rests on shaky ground. There is also a risk (although I think it is exaggerated) that patients can be re-identified from de-identified data. In any case, both data sharing and the uses of data must be more strictly regulated.
The risks of disclosure go beyond individuals to affect the whole population
The illusion that an individual can offer informed consent is matched by an even more dangerous illusion that the harm caused by a breach is limited to the individual affected, or even to his family. In fact, data collected legally and pervasively is used daily to make decisions about demographic groups, as I explained back in 1998. Democracy itself took a bullet when Russian political agents used data to influence the British EU referendum and the US presidential election.
Thus, privacy is not the concern of individuals making supposedly rational decisions about how much to protect their own data. It is a social issue, requiring a coordinated regulatory response.
Discrimination doesn’t have to be explicit or conscious
We have seen that data can be used to draw virtual red lines around entire groups of people. Data analytics, unless strictly monitored, reproduce society’s prejudices in software. This has a particular meaning in health care.
Discrimination against many demographic groups (African-Americans, immigrants, LGBTQ people) has been repeatedly documented. Very few doctors would consciously aver that they wish people harm in these groups, or even that they dismiss their concerns. Yet it happens over and over. The same unconscious or systemic discrimination will affect analytics and the application of its findings in health care.
A final dilemma
Much has been made of Facebook’s policy of collecting data about “friends of friends,” which draws a wide circle around the person giving consent and infringes on the privacy of people who never consented. Facebook did end the practice that allowed Global Science Research to collect data on an estimated 87 million people. But the dilemma behind the “friends of friends” policy is that it inextricably embodies the premise of social media.
Lots of people like to condemn today’s web sites (not just social media, but news sites and many others–even health sites) for collecting data for marketing purposes. But as I understand it, the “friends of friends” phenomenon lies deeper. Finding connections and building weak networks out of extended relationships is the underpinning of social networking. It’s not just how networks such as Facebook suggest the names of people they think you should connect with. It underlies everything about bringing you in contact with information about people you care about, or might care about. Take away “friends of friends” and you take away social networking, which has been the most powerful force for connecting people around mutual interests the world has ever developed.
The health care field is currently struggling with a similar demonic trade-off. We desperately hope to cut costs and tame chronic illness through data collection. The more data we scoop up and the more zealously we subject it to analysis, the more we can draw useful conclusions that create better care. But bad actors can use the same techniques to deny insurance, withhold needed care, or exploit trusting patients and sell them bogus treatments. The ethics of data analysis and data sharing in health care require an open, and open-eyed, debate before we go further.