Gaining Rapid Understanding of COVID-19 Through a Real-world Data Lens

The following is a guest article by Joseph Menzin, Ph.D., CEO of Boston Health Economics (BHE).

The COVID-19 pandemic has increased our awareness of dangerous infectious disease outbreaks and how rapidly they can spread. It has also laid bare the need for better preparedness and more timely data during emergencies. The success of those efforts will largely depend on gleaning information from the real-world data (RWD) being collected on COVID-19 patients. Using preliminary data released by the CDC and influenza data from the Optum® Electronic Health Record Database (source: analysis of Optum® de-identified Electronic Health Record dataset, April 9, 2020), this report compares the underlying conditions most likely to result in hospitalization for COVID-19 with those for seasonal flu.

COVID-19 is a new disease, so clinicians and policymakers are playing “catch-up” in terms of understanding the patients affected, their hospital course, and their outcomes.  Rapid analysis of RWD as soon as it becomes available will help drive the U.S. response to COVID-19, as it can help governments, healthcare providers, and communities more accurately prepare for anticipated patient needs.

Using RWD to gain insight into COVID-19

There is a relative dearth of data on the basic epidemiology of COVID-19 in the U.S., since there do not appear to be any detailed disease registries that capture the incidence of the disease, the characteristics of patients who suffer from it, and the course of the disease, especially with respect to scarce resources such as hospital beds, ICU beds, and ventilators.

One way clinicians and policymakers can gain a better understanding of COVID-19 is to examine past examples of similar viral infections such as SARS, H1N1, and recent influenza seasons. To provide context and a basis for comparison, we used RWD to examine how the new virus compares with what we know of prior, similar viruses.

Recently, the CDC published the first detailed information on novel coronavirus cases in the U.S., focusing on over 5,000 patients with confirmed COVID-19 between February 12, 2020 and March 28, 2020. The data includes information on the prevalence of several comorbid conditions for individuals treated as outpatients, hospitalized patients with or without intensive care unit (ICU) admission, or with unknown hospitalization status. Based on analysis of this data, the CDC found that 1,494 patients aged 19 or older had been hospitalized. The analysis showed that diabetes mellitus, cardiovascular disease and chronic lung disease were found in 16-27% of hospitalized patients, and that these conditions were 24-48% more likely to be found among patients treated in the ICU.
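To make a statement such as "24-48% more likely" concrete, the short Python sketch below shows one way such a figure can be computed, assuming it is a relative difference in prevalence between ICU and non-ICU hospitalized patients. The function names and the patient counts are hypothetical and chosen only for illustration; they are not taken from the CDC report.

```python
def prevalence(patients_with_condition: int, total_patients: int) -> float:
    """Share of a patient group that has a given underlying condition."""
    if total_patients <= 0:
        raise ValueError("total_patients must be positive")
    return patients_with_condition / total_patients


def relative_increase(group_prevalence: float, reference_prevalence: float) -> float:
    """How much more common a condition is in one group than in a reference group."""
    if reference_prevalence <= 0:
        raise ValueError("reference_prevalence must be positive")
    return (group_prevalence - reference_prevalence) / reference_prevalence


# Hypothetical counts for illustration only (not CDC data).
icu_prevalence = prevalence(30, 100)      # 30% of ICU patients have the condition
non_icu_prevalence = prevalence(24, 100)  # 24% of non-ICU patients have it

increase = relative_increase(icu_prevalence, non_icu_prevalence)
print(f"Condition is {increase:.0%} more likely among ICU patients")
```

With these illustrative counts, a prevalence of 30% versus 24% corresponds to a condition being 25% more likely in the ICU group, even though the absolute difference is only six percentage points.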

Using data from 2016-2019 from the Optum® Electronic Health Record Database and the CDC’s published information on COVID-19 hospitalizations, I compared hospitalized influenza and COVID-19 patients, with a focus on their demographics, underlying medical conditions, and course of hospital stay.  This analysis was conducted using BHE’s Instant Health Data (IHD) platform.

How COVID-19 compares to seasonal influenza

Our analysis revealed that selected underlying conditions (diabetes mellitus, cardiovascular disease, and chronic lung disease) were more prevalent among hospitalized influenza patients than among hospitalized COVID-19 patients.  Moreover, about one-third of hospitalized patients with either type of infection were treated in an ICU.  Diabetes mellitus, chronic lung disease, heart disease, and kidney disease were common among ICU patients, especially in the influenza group (see Figure).

There were some differences between the two patient populations that may account for our findings. First, patients hospitalized with influenza were somewhat older (58% aged 65 years and older vs. 51% for COVID-19), which may tend to increase the presence of co-occurring conditions. There were also many more current and past smokers among the historical controls, which may account for differences in rates of chronic lung disease as well as cardiovascular disease, although smoking prevalence among COVID-19 patients appears to be underestimated. Finally, the definitions of conditions may have differed between the chart-recorded conditions and the diagnosis data included in the EHR. These are preliminary results based on a small COVID-19 dataset; as more data from COVID-19 patients are collected, much more will be learned about the virus and how best to treat and contain it.

The importance of RWD in fighting COVID-19

The availability of summary data from surveillance systems with detailed patient and disease conditions, hospitalization status, and ICU use is helpful in driving a better understanding of COVID-19. Further data that describe the clinical course and outcomes of patients with COVID-19 will be important for research, whether through the collection of primary data or from large EHR and administrative claims databases.

In addition to demographic and medical condition data, RWD can provide the timing and type of procedures, time in care units, drugs administered, discharge destination, readmissions, and mortality. For example, it is possible to explore the characteristics of patients needing ventilators, which can be of use in hospital resource planning. EHR databases can provide clinical indicators measured by lab results, imaging tests, and other useful measures for understanding a patient’s prognosis.

RWD can reduce the burden of data collection and also allow for the assessment of long-term patient outcomes. Within days of the CDC data being released, we were able to build an analysis in IHD comparing COVID-19 to influenza patients. The combination of RWD and rapid analysis to gather key insights quickly may ultimately be one of the most important weapons in the battle against COVID-19.

About Joseph Menzin

Joseph Menzin, Ph.D. is the founder and CEO of Boston Health Economics (BHE), an independent health analytics company with over two decades of experience in designing and conducting research that assesses the value of medical technologies.