Can Machine Learning Tame Healthcare’s Big Data?

Big data is both a blessing and a curse. The blessing is that if we use it well, it will tell us important things we don’t know about patient care processes, clinical improvement, outcomes and more. The curse is that if we don’t use it, we’ve got a very expensive and labor-hungry boondoggle on our hands.

But there may be hope for progress. One article I read today suggests that another technology may hold the key to unlocking these blessings: machine learning may be the tool that lets us harvest the big data fields. The piece, whose writer, oddly enough, was cited only as “Mauricio,” lead cloud expert at, argues that machine learning is “the most effective way to excavate buried patterns in the chunks of unstructured data.” While I am an HIT observer rather than a techie, what limited tech knowledge I possess suggests that machine learning is going to play an important role in taming big data in healthcare.

In the piece, Mauricio notes that big data is characterized by five traits: high volume, including both structured and unstructured data; high velocity, with data flowing into databases every second; variety, ranging from texts and email to audio to financial transactions; complexity, since the data comes from multiple incompatible sources; and variability in data flow rates.

Though his is a general analysis, I’m sure we can agree that healthcare big data specifically matches his description. I don’t know whether those of you reading this include wild cards like social media content or video in your big data repositories, but even if you don’t, you may well in the future.

Anyway, for the purposes of this discussion, let’s summarize by saying that in this context, big data isn’t just giant repositories of relatively normalized data. It’s a whirlwind of structured and unstructured data in a huge number of formats, pouring into databases in spurts, trickles and floods around the clock.

To Mauricio, an obvious choice for extracting value from this chaos is machine learning, which he defines as a data analysis method that automates extrapolated model-building algorithms. In machine learning models, systems adapt independently without any human interaction, automatically applying customized algorithms and mathematical calculations to big data. “Machine learning offers a deeper insight into collected data and allows the computers to find hidden patterns which human analysts are bound to miss,” he writes.

According to the author, there are already machine learning models in place which help predict the appearance of genetically-influenced diseases such as diabetes and heart disease. Other possibilities for machine learning in healthcare – which he doesn’t mention but which are referenced elsewhere – include getting a handle on population health. After all, an iterative learning technology could be a great choice for making predictions about population trends. You can probably think of several other possibilities.
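For readers who want to see what "learning from data" looks like in practice, here is a toy sketch of a disease-risk model. It is not the approach from Mauricio's article or any production system: the patients are synthetic, the two features (age and BMI) and the "true" risk formula are assumptions made up for illustration, and the function names (`make_synthetic_patients`, `fit_logistic`, `predict_risk`) are hypothetical. The point is only that the weights are fitted from examples rather than written by hand.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_patients(n, seed=0):
    """Generate made-up (age, bmi, label) rows. This is not real patient data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.uniform(20, 80)
        bmi = rng.uniform(18, 40)
        # Assumed ground truth for the demo: risk rises with age and BMI.
        p = sigmoid(0.06 * (age - 50) + 0.25 * (bmi - 29))
        rows.append((age, bmi, 1 if rng.random() < p else 0))
    return rows


def fit_logistic(rows, epochs=300, lr=0.5):
    """Fit a logistic regression by batch gradient descent on standardized features."""
    n = len(rows)
    mean_age = sum(r[0] for r in rows) / n
    mean_bmi = sum(r[1] for r in rows) / n
    sd_age = math.sqrt(sum((r[0] - mean_age) ** 2 for r in rows) / n)
    sd_bmi = math.sqrt(sum((r[1] - mean_bmi) ** 2 for r in rows) / n)
    w = [0.0, 0.0, 0.0]  # bias, age weight, BMI weight
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for age, bmi, y in rows:
            x = (1.0, (age - mean_age) / sd_age, (bmi - mean_bmi) / sd_bmi)
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j in range(3):
                grad[j] += err * x[j]
        # Step against the average gradient of the log loss.
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w, (mean_age, sd_age, mean_bmi, sd_bmi)


def predict_risk(model, age, bmi):
    """Return the fitted probability for one (age, bmi) pair."""
    w, (mean_age, sd_age, mean_bmi, sd_bmi) = model
    z = w[0] + w[1] * (age - mean_age) / sd_age + w[2] * (bmi - mean_bmi) / sd_bmi
    return sigmoid(z)


model = fit_logistic(make_synthetic_patients(500))
print("older, higher-BMI patient:", round(predict_risk(model, 75, 38), 2))
print("younger, lower-BMI patient:", round(predict_risk(model, 25, 20), 2))
```

Nobody told the program that age or BMI matters; it inferred that from the labeled examples. Real clinical models involve far more features, validation and governance than this sketch suggests.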

Now, like many other industries, healthcare suffers from a data silo problem, and we’ll have to address that issue before we create the kind of multi-source, multi-format data pool that Mauricio envisions. Leveraging big data effectively will also require people to cooperate across departmental and even organizational boundaries, as John Lynn noted in a post from last year.

Even so, it’s good to identify tools and models that can help get the technical work done, and machine learning seems promising. Have any of you experimented with it?

About the author

Anne Zieger

Anne Zieger is a healthcare journalist who has written about the industry for 30 years. Her work has appeared in all of the leading healthcare industry publications, and she's served as editor in chief of several healthcare B2B sites.


  • Machine learning offers great promise for clinical decision making. Right now, artificial intelligence is largely handicapped by the requirement that the path to a clinical decision be understood by a human being. The problem is, much of the forefront of machine learning and AI is with neural networks and other types of systems whose conclusions are accurate yet difficult or impossible to “reverse engineer” to determine their decision-making path. These techniques may not be fully integrated into the clinical decision process, but population-based analysis using them will prove extremely valuable, and this will drive adoption.

    One company to watch in this space is Apervita. They are a company trying to create a marketplace for analytic algorithms that an HCO can source and apply to their EDW. They’ve witnessed pharmas paying for provider data and leveraging the Apervita marketplace to better understand outcomes data. The solution could also be viable from a risk stratification perspective, since profiles can be built around different analytics, and algorithms can be evaluated on how well they predict outcomes.

    Apervita was profiled in a post last year on EMR & EHR:

  • Further, the promise of machine learning is that it facilitates predictive analytics for population health management and risk analysis, but doesn’t require deployment of the extremely expensive and complex systems that are usually required to do so. For instance, one could perform ETL from the HCO EDW and apply machine-learning based models. Those models could then be used against the EDW to make predictions and be exposed via a web service so the HCO can integrate into their existing reporting/analytics systems.

  • Great insights Justin. I’d have never thought that Apervita would be in the machine learning and neural networks space, but it does make sense. That will be interesting to watch. I’ve loved their work on sharing analytics. Interesting to think even bigger picture with them.

  • Vendors are certainly investing in ML. Orion Health is another example. Very timely recent announcement:

    “Machine learning is becoming increasingly important in the delivery of software that enables the practice of precision medicine or personalized healthcare. Orion Health launched its precision medicine platform last year, and we have already achieved significant sales around the world, most notably six deployments in North America,” McCrae said.

  • Thanks for sharing that Justin. I hadn’t seen their machine learning efforts. I find it interesting to note that they measure success by sales rather than by actual impact on healthcare. At least that quote gives that impression.
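The workflow raised in the comments above (extract data from the HCO's EDW, fit a model, then expose predictions to existing reporting systems) can be sketched roughly as follows. Everything here is an assumption for illustration: an in-memory SQLite table stands in for the warehouse, the `encounters` table, its columns and its rows are invented, and the "model" is a deliberately trivial smoothed readmission rate per bucket rather than a real machine learning model. `score_request` only mimics what a web-service handler would do with a JSON body; it does not start a server.

```python
import json
import sqlite3
from collections import defaultdict


def build_demo_warehouse():
    """Create an in-memory SQLite table standing in for an EDW (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE encounters ("
        "patient_id INTEGER, prior_admits INTEGER, readmitted INTEGER)"
    )
    rows = [
        (1, 0, 0), (2, 0, 0), (3, 0, 1), (4, 0, 0),
        (5, 1, 0), (6, 1, 1), (7, 2, 1), (8, 1, 0),
        (9, 3, 1), (10, 4, 1), (11, 3, 0), (12, 5, 1),
    ]
    conn.executemany("INSERT INTO encounters VALUES (?, ?, ?)", rows)
    return conn


def bucket(prior_admits):
    # Transform step: collapse raw admission counts into three coarse buckets.
    if prior_admits == 0:
        return "none"
    if prior_admits <= 2:
        return "some"
    return "many"


def extract_and_fit(conn):
    """Extract rows, transform them, and learn a readmission rate per bucket."""
    counts = defaultdict(lambda: [0, 0])  # bucket -> [readmitted, total]
    query = "SELECT prior_admits, readmitted FROM encounters"
    for prior_admits, readmitted in conn.execute(query):
        key = bucket(prior_admits)
        counts[key][0] += readmitted
        counts[key][1] += 1
    # Laplace smoothing (+1 / +2) keeps small buckets away from exactly 0 or 1.
    return {k: (pos + 1) / (total + 2) for k, (pos, total) in counts.items()}


def score_request(model, body):
    """Handle one JSON request body the way a web-service endpoint might."""
    payload = json.loads(body)
    risk = model.get(bucket(payload["prior_admits"]))
    return json.dumps(
        {"patient_id": payload["patient_id"], "readmission_risk": risk}
    )


model = extract_and_fit(build_demo_warehouse())
print(score_request(model, '{"patient_id": 42, "prior_admits": 4}'))
```

In a real deployment the scoring function would sit behind an authenticated web service and the model would be retrained as the warehouse is refreshed. The sketch only shows how the three stages hand off to one another.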
