Recently, I checked out a case study touting IBM Watson’s ability to analyze medical documentation and determine which ICD-10 codes apply to a given patient’s care. The story was impressive, of course – Big Blue wouldn’t have published it otherwise – but it raised some important questions about the return the health system got for its efforts.
The case study tells the story of how a Belgian health system built and “trained” an AI-powered assistant to participate in its coding process. The health network, Centre Hospitalier Chretien, includes 1,000 acute-care beds, 30 psychiatric beds and 675 nursing home beds.
In the past, as patients moved through and out of the CHC network, the health system relied on a team of roughly 15 coding experts to review each hospitalization case.
As you might imagine, doing the job correctly was a slow, painstaking process, which included reading discharge letters, poring over medical records, reviewing lab tests and much more. As a result, the team could process no more than 25 cases per day. (The study doesn’t say so explicitly, but the department must have been struggling with a tremendous backlog!)
In recent times, though, CHC began developing what it hoped would be a more efficient solution. Over time, the health system created and trained an AI-powered assistant leveraging IBM Watson Explorer and the IBM Watson Knowledge Studio.
Over 18 months, CHC trained Watson to understand ICD-10 coding by feeding it almost 2,000 tagged and annotated documents. In the first few months, the AI assistant achieved only a 28% accuracy rate, but after a year of additional tweaks and training, its accuracy rate had hit 80%.
Not long after, CHC leaders decided that the assistant was working at a high enough accuracy rate to go into regular use, and in June 2018 the technology went live.
Now, the RCM team treats Watson as a sort of co-worker. Human coders review its rationale and conclusions, then accept or reject its suggestions. As the humans correct Watson, its performance continues to improve. Using the AI, the RCM team’s productivity has climbed from 25 completed reviews a day to 35 reviews.
All of this sounds lovely, but when it comes to corporate case studies, I keep my skeptic’s hat firmly in place. My mean ol’ skeptic questions include the following:
- Is the jump from 25 to 35 cases per day that impressive? After spending 18 months on what is purported to be one of the world’s top artificial intelligences, shouldn’t we expect more?
- How many hours have the health system’s coders spent interacting with Watson? What’s the return on those hours?
- When all is said and done, is the AI deployment cheaper than hiring and training up a human coder?
- Does the health system own the technology, or is it renting Watson access? If it’s renting access, what will CHC do if, in the future, IBM won’t strike a rental deal to its liking?
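To put a rough number on that first question, here’s a minimal back-of-envelope sketch. The 25 and 35 cases-per-day figures and the 15-person team come from the case study; the fully loaded coder cost is a purely hypothetical placeholder, not anything IBM or CHC has disclosed.

```python
# Back-of-envelope math behind the skeptic's questions.
# Only 25, 35, and the team size of 15 come from the case study;
# the coder cost is an illustrative assumption.

baseline_cases_per_day = 25   # pre-Watson throughput (from the case study)
assisted_cases_per_day = 35   # post-Watson throughput (from the case study)

gain = (assisted_cases_per_day - baseline_cases_per_day) / baseline_cases_per_day
print(f"Productivity gain: {gain:.0%}")  # 40%

# The extra throughput is roughly equivalent to this many additional coders.
team_size = 15  # from the case study
equivalent_coders = team_size * gain
print(f"Equivalent headcount freed: {equivalent_coders:.1f}")  # 6.0

# Hypothetical fully loaded cost per coder -- an assumption for illustration.
coder_cost_per_year = 60_000
annual_value = equivalent_coders * coder_cost_per_year
print(f"Rough annual value of the gain: ${annual_value:,.0f}")  # $360,000
```

Even if my placeholder salary figure is off, the shape of the question stands: whatever that annual value turns out to be, it has to be weighed against 18 months of training effort plus whatever Watson costs to license and run.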
What I’m getting at with these questions is that we know too little about the tradeoffs CHC made to roll out its AI assistant, and what the payoff has been for the steps it took. Does anyone out there have better data points to share on the benefits of their hospital AI project?