Amazon Launches Public Data Lake Making COVID-19 Data Accessible

Amazon has announced that it has made an AWS COVID-19 data lake available to the public for use in better understanding the virus.

The data lake, a centralized repository of up-to-date curated datasets on or related to the spread and characteristics of the pandemic, will be hosted on the AWS cloud.

Its content includes case tracking data from Johns Hopkins and The New York Times, along with hospital bed availability data from Definitive Healthcare. It will also feature more than 45,000 research articles on COVID-19 topics provided by the Allen Institute for AI.

The idea behind the data lake project is to help improve access to this data and make it easier to experiment with it without having to spend time extracting and wrangling data from already-available data sources.

Interested parties will be able to use AWS or other third-party tools to analyze trends, do keyword searches, perform question/answer analyses or run custom research focused on their specific needs. In addition, data lake users will be able to work with the public data lake assets, combine data lake content with outside data or subscribe to the source datasets directly through the AWS Data Exchange.
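To give a sense of what "analyzing trends" in case tracking data might look like, here is a minimal Python sketch. It assumes the data arrives as a simple list of cumulative case counts per day; the function names and the sample numbers are hypothetical and made up for illustration, and are not taken from the data lake or any AWS tool.

```python
from collections import deque


def daily_new_cases(cumulative):
    """Turn cumulative counts into day-over-day increases.

    Negative differences (e.g. from downward data revisions) are clamped to 0.
    """
    return [max(curr - prev, 0) for prev, curr in zip(cumulative, cumulative[1:])]


def rolling_mean(values, window=7):
    """Trailing moving average; emits one value per full window."""
    if window <= 0:
        raise ValueError("window must be positive")
    buf = deque(maxlen=window)
    out = []
    for v in values:
        buf.append(v)
        if len(buf) == window:
            out.append(sum(buf) / window)
    return out


# Hypothetical cumulative counts, invented purely for illustration.
sample_cumulative = [10, 12, 15, 15, 21, 30, 42, 50]
new_cases = daily_new_cases(sample_cumulative)
smoothed = rolling_mean(new_cases, window=3)

if __name__ == "__main__":
    print("new cases per day:", new_cases)
    print("3-day average:", smoothed)
```

Smoothing with a trailing average is one common way to make a trend readable when daily reporting is noisy; a real analysis would need to account for how each source dataset defines and revises its counts.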

This offer comes in the wake of the launch of the AWS Data Exchange last November. Data Exchange customers can find and subscribe to data gathered on the Exchange, then use its API or console to pull the data to which they subscribe into Amazon’s Simple Storage Service (S3).

At the time, AWS predicted that one of the key uses to which healthcare professionals would put Data Exchange would be to subscribe to aggregated data from historical clinical trials to speed up their research programs. It also seemed likely that providers would use the service to access patient data from HIEs or fellow members of a health system.

Of course, the healthcare industry’s priorities have shifted dramatically. In its announcement, AWS leaders predicted that local health leaders will use this data to build dashboards that help them track COVID-19 diagnoses and establish plans for deploying critical resources such as hospital beds and ventilators effectively.

AWS isn’t the only organization helping companies address critical pandemic data issues. For example, last month we reported on the formation of a group known as the COVID-19 Healthcare Coalition, which has come together to foster the rapid deployment of open source solutions tackling the virus.

The group, whose members include AWS, Epic, HCA Healthcare, Intermountain Healthcare, Walgreens and Microsoft, is partnering with MITRE Corp. to manage group data-sharing efforts. One of the initial resources being made available to the group is a dashboard tracking vulnerable populations by region. MITRE is also helping participants develop a taxonomy they can use to track non-pharmaceutical interventions already available for use in tracking the virus.

Be sure to check out our list of free COVID-19 Health IT resources.

About the author

Anne Zieger

Anne Zieger is a healthcare journalist who has written about the industry for 30 years. Her work has appeared in all of the leading healthcare industry publications, and she's served as editor in chief of several healthcare B2B sites.