Are Researchers Ready to Use Patient Health Records?

Andy Oram praxagora

There’s a groundswell of opinion throughout health care that to improve outcomes, we need to share clinical data from patients’ health records with researchers who are working on cures or just better population health measures. One recommendation in the much-studied JASON report–an object of scrutiny at the Office of the National Coordinator and throughout the field of health IT–called on the ONC to convene a conference of biomedical researchers.

At this conference, presumably, the health care industry will find out what researchers could accomplish once they had access to patient data and how EHRs would have to change to meet researchers’ needs. I decided to contact some researchers in medicine and ask them these very questions–along with the equally critical question of how research itself would have to evolve to make use of the new flood of data.

Results of the interviews suggest that systems will have to change on many levels to reap the rich harvest promised for research. Doctors and their EHRs must both tighten up their methods of collecting and recording data. Researchers need to learn what are popularly called Big Data techniques. And the computer systems that store and process the data have to adapt to the enormous sizes of “omics” data they are receiving.

Byron J. Ruth is one of the researchers already cavorting in the lush fields of EHR data. He is a Lead Analyst/Programmer in the Center for Biomedical Informatics at the Children’s Hospital of Philadelphia (CHOP). Ruth assigns to EHR vendors the primary task of data exchange, a big push by the ONC. He points out that data sharing can lead to larger cohorts (regional or national data instead of data from a single institution), so that studies benefit from bigger data sets, which mean less bias.

For instance, there are few samples of rare conditions in any one region, but a clinical decision system can store information on diseases from far-flung areas and warn doctors of risks such as Ebola outbreaks.

John Wilbanks, who promotes the sharing and advanced processing of health care data through Sage Bionetworks, reports hearing several common objections from researchers he is trying to persuade to take EHR data into account. These are all valid, but there are ways to compensate for them.

EHR data is not specific enough (except genomic data).
EHRs contain too many errors.
EHR data is aimed at treatment and billing rather than research.

Most EHRs are still incapable of generating structured, well-coded data that is useful to researchers. The ONC has made great strides in promulgating structured data exchange standards–Blue Button for structured data and Blue Button Plus for an API–but these are only beginning to be adopted in scattered places.

Wilbanks thinks EHR data is still invaluable, because it contains hard facts such as lab reports as well as expert opinions. Statistical techniques can compensate somewhat for its weaknesses, but clinicians need workflows more conducive to accurate data collection. The single change that would most reduce errors would be to keep data in the hands of the patients, who are the ones who most often discover and fix errors.

More generally, researchers’ objections reflect the challenge of using Big Data: one has to search through a diverse, inconsistently coded, dirty agglomeration of facts and use statistical techniques to do such things as eliminate outliers and find data sharing common characteristics. Data scientists with these skills are entering the health field and generating useful findings, so eventually the more traditional clinical researchers will learn these techniques or hook up with those who know them.
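To make "eliminating outliers" concrete, here is a minimal sketch of one common approach, Tukey's interquartile-range rule. It is only an illustration, not a method any of the researchers interviewed here described; the function name, the threshold `k`, and the blood pressure readings are all made up for the example.

```python
from statistics import quantiles


def drop_outliers(values, k=1.5):
    """Keep only values within k * IQR of the first and third quartiles.

    This is Tukey's rule, chosen here purely for illustration; real
    clinical data cleaning would need domain-specific checks as well.
    """
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical systolic blood pressure readings; 1200 looks like a typing error.
readings = [118, 122, 125, 130, 119, 1200, 121, 127]
cleaned = drop_outliers(readings)
print(cleaned)
```

A rule like this only flags values that are statistically unusual. Whether an unusual value is an entry error or a genuinely sick patient is a clinical judgment the code cannot make.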

Dr. Maxim Mikheev, CTO and co-Founder of BioDatomics, highlighted the computer networking problems created by the size of genomes. He’s glad to see repositories swell with genomic data, but the data sets are far too large to download over the networks available to most researchers. Storage is also a problem.

Ruth encountered this problem on a project called the HeartSmart Pediatric Cardiac Genomics Consortium (PCGC). They were able to continue exchanging data by upgrading their Internet connections. But Ruth and Dr. Mikheev both recognize that a more robust solution is to keep data on the system where it was generated and bring the program to the system. The National Cancer Institute has started a Cancer Genomics Cloud Pilots project that runs three data centers hosting the genomic data and running programs uploaded by researchers.

The final hurdle to data sharing is the willingness of researchers to do so. Wilbanks is dealing with this at Sage Bionetworks on a daily basis. Ruth says it is hard to achieve even within CHOP. “One of the other challenges with any kind of data sharing among researchers is that no one really trusts anyone else,” he writes. “Basing studies on other people’s work is a relatively bold move, especially if you do not have access to the data used for that previous work.” Part of the solution, Ruth says, is to record data provenance, “which can be summed up as the who, what, where, why, and how some data came to be.”
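As a rough sketch of what recording provenance could look like in practice, the example below attaches a small "who, what, where, why, how" record to a piece of data. The class, the field names, and the sample values are hypothetical, chosen to mirror Ruth's summary; they do not describe CHOP's systems or any formal provenance standard.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Provenance:
    """Hypothetical provenance record following Ruth's five questions."""
    who: str    # person or system that produced the data
    what: str   # what the data represents
    where: str  # institution or system of origin
    why: str    # purpose for which it was collected
    how: str    # method or instrument used


def attach_provenance(data, prov):
    """Bundle a data payload with its provenance so they travel together."""
    return {"data": data, "provenance": asdict(prov)}


# All values below are invented for illustration.
record = attach_provenance(
    {"hemoglobin_g_dl": 13.2},
    Provenance(
        who="lab-system-01",
        what="hemoglobin measurement",
        where="example hospital laboratory",
        why="routine clinical care",
        how="automated hematology analyzer",
    ),
)
print(json.dumps(record, indent=2))
```

Keeping the provenance in the same record as the data means a researcher who receives the data later can judge how far to trust it without contacting the original team.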


About the author

Andy Oram

Andy is a writer and editor in the computer field. His editorial projects have ranged from a legal guide covering intellectual property to a graphic novel about teenage hackers. A correspondent for Healthcare IT Today, Andy also writes often on policy issues related to the Internet and on trends affecting technical innovation and its effects on society. Print publications where his work has appeared include The Economist, Communications of the ACM, Copyright World, the Journal of Information Technology & Politics, Vanguardia Dossier, and Internet Law and Business. Conferences where he has presented talks include O'Reilly's Open Source Convention, FISL (Brazil), FOSDEM (Brussels), DebConf, and LibrePlanet. Andy participates in the Association for Computing Machinery's policy organization, named USTPC, and is on the editorial board of the Linux Professional Institute.

2 Comments

Alex Tate says:

October 21, 2014 at 5:46 am

EHRs can be made useful by creating engagement points such as patient portals, which allow patients to engage with the EHR data and make better use of it. In the same way, other portals could be built to give researchers access to the information they need for research purposes. http://goo.gl/XqQfDE
Consumers Are Still Held Back From Making Rational Health Decisions | EMR and EHR says:

November 25, 2014 at 8:54 am

[…] problems with analytics throughout the health care field. For instance, I recently reported on how hard a time researchers have obtaining and making use of patient data. Luckily, the GAO report cites several HHS efforts to enhance their current data on price and […]
