Audio Interfaces Have a Much Wider Potential in Health Care

Providing reminders about medication, transcribing patient conversations, controlling surgical devices: these are some of the new ways voice interactions are entering medicine. People who have gotten used to talking to their cell phones, or through agents such as Alexa to their TVs, microwave ovens, and cars, will expect simple voice interactions in their medical encounters.

I talked to Bruce Ryan, director of engineering for HARMAN Embedded Audio, to find out how audio capabilities are evolving and what his company is doing to provide them in health care.

The parent company HARMAN, which operated independently from 1953 to 2017 and is now a part of Samsung, is a premier company for audio around the world. The international reach has helped them be more inclusive, because they can do voice recognition in every major language and are continually expanding their list of supported languages. HARMAN Embedded Audio itself is not limited to audio; it also produces a range of embedded devices.

My conversation with Ryan touched on audio devices in three areas: the home, the clinical encounter, and the operating room.

Home

Chatbots are growing popular everywhere; you have probably visited a web site and seen a little dialog pop up asking, “What can I help you with?” Chatbots exploit natural language processing (NLP) to interpret what the client is asking and guide them through a sequence of questions until the chatbot can either provide an answer or (more often) turn the client over to the appropriate human customer service person.
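The answer-or-hand-off flow described above can be sketched in a few lines of Python. This is only an illustration, not how any real chatbot or HARMAN product works: simple keyword matching stands in for NLP, and the `ANSWERS` table, the `HANDOFF` message, and the `respond` function are all hypothetical names and placeholder content invented for this example.

```python
from typing import Dict, Tuple

# Hypothetical canned answers, keyed by a keyword the client might use.
ANSWERS: Dict[str, str] = {
    "hours": "The clinic is open 8am to 5pm on weekdays.",
    "refill": "You can request a refill through the pharmacy line.",
}

# Hypothetical message used when the bot cannot answer on its own.
HANDOFF = "Let me connect you with a staff member."


def respond(message: str) -> Tuple[str, bool]:
    """Return (reply, needs_human) for a client message.

    Keyword lookup is a crude stand-in for real NLP: if no known
    keyword appears, the conversation is handed to a person.
    """
    words = set(message.lower().replace("?", " ").split())
    for keyword, answer in ANSWERS.items():
        if keyword in words:
            return answer, False
    return HANDOFF, True


if __name__ == "__main__":
    print(respond("What are your hours?"))
    print(respond("My chest hurts"))
```

The second value in the returned pair is the important design choice: anything the bot does not recognize goes to a human rather than receiving a guess.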

Chatbots can drastically reduce the costs of providing customer service, and if asked certain common questions, they can give clients quicker answers. In a situation such as poison control, the minutes saved are extremely valuable. But of course, the risks in health care are higher than in other areas. You don’t want to tell someone having a stroke to take a pain reliever and lie down. Time constraints prevented me from going deep into the safety of health care chatbots with Ryan.

Ryan suggested that medical devices may be able to answer common questions, such as what to do when you’re suffering from a fever, or help with everyday medical tasks such as alerting when it’s time to take your medicine. These devices could also collect biometric information from the fitness devices and watches worn by patients, and alert them if there seems to be a problem.
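As a rough illustration of the medication reminder task, a device could periodically compare a stored schedule against the clock. The sketch below is an assumption about how such a check might look, not a description of any actual device; `due_reminders` and the sample schedule are hypothetical, and the function only handles checks that fall within a single day.

```python
from datetime import datetime, time
from typing import List


def due_reminders(schedule: List[time], last_check: datetime,
                  now: datetime) -> List[time]:
    """Return the scheduled dose times that fall after last_check
    and up to (and including) now.

    Simplifying assumption: last_check and now are on the same day.
    """
    return [t for t in schedule if last_check.time() < t <= now.time()]


if __name__ == "__main__":
    # Hypothetical schedule: one dose in the morning, one in the evening.
    doses = [time(8, 0), time(20, 0)]
    earlier = datetime(2024, 1, 1, 7, 55)
    current = datetime(2024, 1, 1, 8, 5)
    for dose in due_reminders(doses, earlier, current):
        print(f"Time to take your {dose.strftime('%H:%M')} medicine")
```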

Medical devices must be more independent of the cloud than everyday voice-enabled devices. When products such as Amazon Echo first entered the market, they had to send everything to the vendor’s servers for processing because of the compute demands of NLP and the need to access databases. Sending everything off-site provides inadequate security for medical information.

Ryan says that the efficiency and power of audio processing devices have seen “almost exponential acceleration” over the past few years. A device can store enough data locally and do enough local processing to handle a limited set of questions, making it appropriate for many medical applications. Ryan says HARMAN devices can do a limited form of NLP using a digital signal processor that, like a GPU, follows the single instruction, multiple data (SIMD) model.

Clinical Encounter

In another model for intelligent user interaction, local devices communicate with a local server instead of with a cloud vendor’s servers. The local server doesn’t have to be connected to the Internet, and it can use a local area network to serve devices in the facility. HARMAN first developed the local server option for cruise ships, which (as Ryan said) would require “an awfully long Ethernet cable” to connect to the Internet. Treatment facilities can now use local servers too.

Electronic records are often blamed for physician burnout. The clinical encounter always involved documentation, but jotting a note in a paper record took much less time and concentration than the rigid interfaces offered by most of today’s EHRs. Therefore, transcriptions have become popular, done either by human assistants or by voice recognition. The clinician can scan the transcript made from a conversation and quickly extract the key points manually for an official record.

Operating Room

Operating rooms present challenges to voice processing that mirror the challenges they present to staff. The venues are chaotic and noisy. Sudden changes in patient status require quick adaptation. Clinicians constantly complain that in the operating room they must put something down to free a hand for some other task. Anything that could be voice-controlled would lead to fewer errors and faster surgeries.

Typical voice processing starts with noise reduction: the device needs algorithms to screen out the distractions that humans learn to ignore automatically. The resulting signal runs through speech-to-text conversion. Then NLP and other AI algorithms can process the text to determine what response is required.
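The three stages can be pictured as a small pipeline. The Python sketch below is a toy under heavy assumptions: a simple amplitude gate stands in for real noise reduction, the speech-to-text stage is passed in as a function (here a stub that returns fixed text, since real recognition requires a trained model), and phrase matching stands in for NLP. The `COMMANDS` table and every function name are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Samples = List[float]

# Hypothetical voice commands mapped to hypothetical device actions.
COMMANDS: Dict[str, str] = {
    "raise table": "TABLE_UP",
    "lower table": "TABLE_DOWN",
    "lights on": "LIGHTS_ON",
}


def noise_gate(samples: Samples, threshold: float = 0.05) -> Samples:
    """Stage 1: zero out samples quieter than the threshold.

    A crude stand-in for real noise reduction algorithms.
    """
    return [s if abs(s) >= threshold else 0.0 for s in samples]


def interpret(text: str) -> Optional[str]:
    """Stage 3: map recognized text to an action, or None if unknown."""
    normalized = " ".join(text.lower().split())
    for phrase, action in COMMANDS.items():
        if phrase in normalized:
            return action
    return None


def process_audio(samples: Samples,
                  transcribe: Callable[[Samples], str]) -> Optional[str]:
    """Run noise reduction, then speech-to-text, then interpretation."""
    cleaned = noise_gate(samples)
    text = transcribe(cleaned)  # Stage 2 is supplied by the caller.
    return interpret(text)


def stub_transcriber(samples: Samples) -> str:
    """Placeholder recognizer: ignores the audio and returns fixed text."""
    return "Please raise table a little"


if __name__ == "__main__":
    print(process_audio([0.01, 0.4, -0.02, -0.6], stub_transcriber))
```

Passing the recognizer in as a parameter keeps the stages separable, so a local model could replace the stub without touching the other two steps.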

The advantages of voice interaction, and the truly astounding progress made in that area over the past decade, ensure that it will become more and more important to health care. I hope this article has shown some of the problems it can address.

About the author

Andy Oram

Andy is a writer and editor in the computer field. His editorial projects have ranged from a legal guide covering intellectual property to a graphic novel about teenage hackers. A correspondent for Healthcare IT Today, Andy also writes often on policy issues related to the Internet and on trends affecting technical innovation and its effects on society. Print publications where his work has appeared include The Economist, Communications of the ACM, Copyright World, the Journal of Information Technology & Politics, Vanguardia Dossier, and Internet Law and Business. Conferences where he has presented talks include O'Reilly's Open Source Convention, FISL (Brazil), FOSDEM (Brussels), DebConf, and LibrePlanet. Andy participates in the Association for Computing Machinery's policy organization, named USTPC, and is on the editorial board of the Linux Professional Institute.

   
