The Bigger the Better: Getting Data for Health Care Analytics at Wolters Kluwer

News comes over the transom on a daily basis about Johns Hopkins or the Mayo Clinic or some other health care institution using modern analytics and machine learning. But a basic tenet of machine learning holds that you need lots and lots of data to derive valid and trustworthy insights. Health care institutions rarely have data at the scale enjoyed by Google, major social media sites, or even large retailers. And this is where Wolters Kluwer Health has a leg up on other institutions: after decades in the health care field, with numerous acquisitions and partnerships, they’ve accumulated a set of patient data that they boast is even bigger than the anonymized sets released by Medicare (as well as more diverse, because it is not limited by age or nationality).

To be fair, even with small amounts of data, some providers cite impressive results. This past August, as just one example, I reported on a small chain that found ways to save money by analyzing their own patient base–although I noticed that the chain was using the data to confirm what they already knew through conventional observation. The data available to a single chain has limited uses, and–as pointed out by John T. Langton, Director of Data Science at Wolters Kluwer–it isn’t generalizable to other institutions. Each institution has to start over, deriving what insights it can from its unique set of patients.

This article explores the relatively new analytics team (just two years old) at Wolters Kluwer Health, and how it draws on the copious resources of the larger company.

A gangly information firm

With roots going back to the 1830s in the Netherlands, Wolters Kluwer made its mark in publishing and evolved with the times to enter numerous information fields, notably health. The Health division of the company has 3,000 employees across many business units.

The analytics team, like many other information-heavy health services, draws a careful line to support clinicians by providing useful insights without offering actual diagnoses, which would require complex FDA approval. One recent project led by applied data scientist Kang Liu improved the prediction of hospital-acquired infections–a task where a couple of extra hours of advance notice could save a life. Equally important, such early detection could catch the infection before it spreads to many other patients. Through an acquisition, the company now offers a service called Sentri7 that is used by hundreds of hospitals.

On the surface, Wolters Kluwer Health is just doing what dozens of other expectant and optimistic companies are doing in health care: hiring a bunch of smart people to do data science and tackling actionable issues in the hope of popping up above the crowd. But Wolters Kluwer Health possesses two powerful advantages: its enormous data set, and its close collaboration between AI experts and a global community of medical professionals who have authored the most widely used clinical decision support system. In this regard, it’s unlike many start-ups, which focus on the tech side without the clinical domain expertise.

In a partnership with Oklahoma State University’s Center for Health Systems Innovation, Wolters Kluwer has access to the center’s Health Facts database, notable both for its size–data on more than 158 million patients–and its variety, covering not only clinical and billing records but pharmacy and lab records as well.

Wolters Kluwer also pays attention to medical experts, a necessity ignored by many software app firms to their detriment. More than 50 doctors are in the building working closely with the analytics team to explain to them what’s needed, where the challenges lie, and where they shouldn’t waste their time. The team can also draw on partnerships with research institutions and universities, as well as authors and peer reviewers that other Wolters Kluwer businesses draw on.

Langton and his colleagues told me that, as a large and well-established company, Wolters Kluwer has a conservative corporate culture. A healthy tension between cutting-edge tech staff and cautious doctors leads to new discoveries while avoiding poor choices and preventable failures.

Related initiatives

The vast Wolters Kluwer conglomerate also uses analytics under the hood in several of its other offerings:

  • Its UpToDate clinical decision support resource offers fairly typical access to the research literature on medical conditions and treatments, but couples it with evidence-based recommendations authored by experts from around the world. UpToDate is enhanced by machine learning to get closer and closer to anticipating what question a clinician is asking based on the activity of its user base, which now consists of more than 1.7 million clinicians globally who view over half a billion clinical topics a year.
  • Wolters Kluwer just launched clinical natural language processing (cNLP) to normalize data in free-text patient notes, through Health Language, a company it acquired in 2013.
  • Health Language’s tools are also used to structure and normalize data across different systems in hospitals, such as electronic health records and lab systems. Normalization can carry out such tasks as converting ICD-9 codes to roughly equivalent ICD-10 codes, and similar work with standards that change over time or, like LOINC, vary from one clinic to another. As mergers and acquisitions among health care institutions become more and more common, this kind of normalization of records becomes critical.
  • Wolters Kluwer’s medical research platform, Ovid, is developing AI to more rapidly and accurately identify newly published research articles for pharmacovigilance and other use cases. The technology assigns an impact or risk score to each article to aid in prioritization, which makes its search more accurate than the industry’s standard “hedges”: lists of matching criteria that can run several pages long.
  • Wolters Kluwer improves their online medical education with Firecracker, which uses AI to learn how each person memorizes information and presents that person with personalized flash cards and drills.
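
To make the code normalization described above concrete, here is a minimal sketch in Python of how a crosswalk from ICD-9 to ICD-10 might be applied to incoming records. It is not Health Language's implementation. The names ICD9_TO_ICD10, to_icd10, and normalize_records are hypothetical, and the two-entry table is a hand-picked illustration that would need to be replaced by, and checked against, an official mapping source. Because the mapping is only roughly equivalent, each ICD-9 code maps to a list of candidates, and anything unmapped is set aside for human review.

```python
# Illustrative crosswalk only (assumption): a real system would load a full,
# officially maintained ICD-9 to ICD-10 mapping rather than this tiny table.
ICD9_TO_ICD10 = {
    "250.00": ["E11.9"],  # type 2 diabetes without complications
    "401.9": ["I10"],     # essential hypertension
}


def to_icd10(icd9_code):
    """Return candidate ICD-10 codes for an ICD-9 code, or [] if unmapped."""
    cleaned = icd9_code.strip().upper()  # tolerate stray spaces and case
    return list(ICD9_TO_ICD10.get(cleaned, []))


def normalize_records(records):
    """Split records into (normalized, unmapped) using the crosswalk.

    Each record is a dict with a "code" key holding an ICD-9 code.
    Normalized records gain an "icd10_candidates" key; unmapped records
    are returned unchanged so that a person can review them.
    """
    normalized, unmapped = [], []
    for record in records:
        candidates = to_icd10(record["code"])
        if candidates:
            normalized.append({**record, "icd10_candidates": candidates})
        else:
            unmapped.append(record)
    return normalized, unmapped


if __name__ == "__main__":
    sample = [
        {"patient": "A", "code": " 250.00 "},
        {"patient": "B", "code": "999.99"},  # not in the illustrative table
    ]
    done, review = normalize_records(sample)
    print(done)
    print(review)
```

Keeping unmapped codes in a separate list, rather than guessing, reflects the point that these mappings are approximate: after a merger, the records that fail to map are the ones that need attention.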

Wolters Kluwer thinks that AI analytics in health care have a bright future as payers move toward value-based reimbursement. Their hope is to cut time to diagnosis, drawing on the many resources the company has built up over decades. The lesson for companies that want to apply analytics to health care seems to be: go big.

About the author

Andy Oram

Andy Oram writes and edits documents about many aspects of computing, ranging in size from blog postings to full-length books. Topics cover a wide range of computer technologies: data science and machine learning, programming languages, Web performance, Internet of Things, databases, free and open source software, and more. His editorial output at O'Reilly Media included the first books ever published commercially in the United States on Linux, the 2001 title Peer-to-Peer (frequently cited in connection with those technologies), and the 2007 title Beautiful Code. He is a regular correspondent on health IT and health policy. He also contributes to other publications about policy issues related to the Internet and about trends affecting technical innovation and its effects on society. Print publications where his work has appeared include The Economist, Communications of the ACM, Copyright World, the Journal of Information Technology & Politics, Vanguardia Dossier, and Internet Law and Business.