“Non Structured Data Is More Valuable to Practitioners Than Discrete Research Oriented Data”

In my post on the EHR Bubble, Don B offered this strong statement:

“Recognizing the non-structured information is more valuable to the practitioner than discrete researcher oriented data.”

I love people who make strong statements, and this is no exception. This comment will no doubt hit people the wrong way when you consider how much programs like meaningful use have focused on discrete data. I can already hear the chorus of doctors asking why meaningful use wants all this discrete data if the non-structured data is where the value is for practitioners.

There are a lot of nuances at work that are worth discussing. I agree with Don B that at this point in time the non-structured information is more valuable to a physician than the discrete data. I’d also extend that comment to say that non-structured information will likely always have value to a practitioner. There are just certain parts of physician documentation that can’t be discrete or at least cost far too much to make them discrete. I’m sure the EHR narrative crowd out there will love this paragraph.

Still, even proponents of the EHR narrative realize the value of discrete data elements. That’s why companies like Nuance and MModal are investing so much money, time and effort into their various NLP (Natural Language Processing) and CLU (Clinical Language Understanding) offerings. The key question for these companies has never been whether there is value in discrete healthcare data, but how you capture that discrete healthcare data.

When thinking about discrete healthcare data I hearken back to a post I did in 2009 that asserts the Body of Medical Knowledge Too Complex for the Human Mind. This concept still resonates with me today. The core question is how a physician can take in all the patient data, device data, lab data, medical data, research data, etc. and provide the patient the best care possible. Technology will never replace the physician (I don’t think), but I expect the tools will become so powerful that a physician won’t be able to practice medicine without them.

Much of the power required for computers to assist physicians in this way is going to come through discrete data.

Over the next 2-3 years we’re going to start seeing inklings of how healthcare will improve thanks to discrete data (often captured through and collected by an EHR). Then, in the next 5-10 years we’re going to see how healthcare couldn’t survive without all the detailed healthcare data.

About the author

John Lynn

John Lynn is the Founder of HealthcareScene.com, a network of leading Healthcare IT resources. The flagship blog, Healthcare IT Today, contains over 13,000 articles, with over half of them written by John. These EMR and Healthcare IT related articles have been viewed over 20 million times.

John manages Healthcare IT Central, the leading career Health IT job board. He also organizes the first of its kind conference and community focused on healthcare marketing, Healthcare and IT Marketing Conference, and a healthcare IT conference, EXPO.health, focused on practical healthcare IT innovation. John is an advisor to multiple healthcare IT companies. John is highly involved in social media, and in addition to his blogs can be found on Twitter: @techguy.


  • Hi John,

    “There are a lot of nuances at work that are worth discussing.”


    A couple of weeks ago I attended and livetweeted the 2012 NAACL Human Language Technology Conference. I focused on clinical and biomedical natural language processing. Later I embedded my tweets into a blog post, surrounding them with tutorial and editorial material. My goal was to summarize state-of-the-art clinical NLP research for health IT folks who are not computational linguists or NLP engineers. It was a stretch and I learned a lot. The more I dug the more I found, and the more I found the more I dug. It’s a long post, but it may be worth skimming for someone interested in this topic. It has lots of links to further reading.

    I always enjoy how well you write (though, about that pun…), not only what you write about.




    From my conclusion:

    “Computational linguistics and natural language processing (the former the theory and the latter the engineering) are about to transform healthcare. At least some people think so. There’s certainly a lot of buzz in health IT traditional and social media about medical speech recognition and clinical language understanding.

    Coverage can be pretty superficial. Watson will, or won’t, replace clinicians. Siri will, or won’t, replace traditional EHR user interfaces. It comes with the territory. CL and NLP are full of dauntingly abstract concepts and complicated statistical mathematics. However, there is an idea, among philosophers, that science is really just common sense formalized. If so, maybe the science of CL/NLP can be “re-common-sense-ized”, at least for the purpose of looking under the hood of what makes these clever language machines possible.

    Looking further ahead, where I’d really like to see clinical NLP go is toward conversational EHRs. A bit like Siri, or at least the way Siri is portrayed in ads, only a lot more so. To get there EHRs will need to become intelligent systems, not just converting compressions and rarefactions of air molecules into transcribed tokens to be passed on to pipelines and become ICD-9 or -10 codes. They will need to “understand” the ebb and flow of medical workflow and, like the hyper-competent operating room nurse, do the right thing at the right time with the right person for the right reason, without having to be explicitly triggered to do so.”

  • No pun intended there. I wrote that line well before I decided to write about NLP and CLU.

    It is quite interesting to follow the NLP and CLU world. It’s a lot more complex than most people realize. My biggest question is how well we can really crack the NLP and CLU algorithm. I still have my doubts about it achieving the clinical nirvana that we hope it will achieve.

  • I’m with Chuck…not surprisingly! The technology is already offering insights into narrative content that was previously locked away in the dungeons of clinical data repositories. The technology offers the Rosetta Stone to this narrative, unlocking the data that used to require manual (and hence unscalable and unaffordable) abstraction.
    Today we can analyze narrative documents in near real time and produce real-time reports that address everything from Hospital Acquired Conditions/Present on Admission Indicators, Patient Safety Indicators and Core Measures to an immediate focus on problematic cases such as Sepsis, Stroke and possible RAC Audit targets.

    Layer on top some real-time feedback available to clinicians to help them match their clinical notes to the information required for coding and billing, and now you have turned the process on its head: allowing capture of non-structured (and often preferred) narrative notes and using the technology to generate the discrete data we all know is necessary to drive our data-hungry EMRs and clinical systems.

  • John
    Here today, there are already sites using the technology to analyze documents and produce dashboard-style reports that list patients and their documents by category, including:
    – Hospital Acquired Conditions
    – Present on Admission Indicators
    – Patient Safety Indicators such as AMI measures
    and many more.

    There are also several sites that have begun testing the physician documentation assistance tool.
