Data Paths: An Update from the PaTH Informatics Team

Project Manager Nickie Cappella is the site lead for informatics at Pitt and UPMC.

With 10.8 million patients across six different health care systems, a unique challenge the PaTH Network faces is recording data consistently across its sites. For example, researchers at one site may note that a patient has been discharged to a nursing home by entering ‘03 – SKILLED NURS FACIL’ while researchers at another institution might record this information as ‘64 - DSC NUR FAC MA APPR.’ These differences make it confusing and difficult to compare data from different PaTH institutions.

To ensure data are recorded consistently across sites, PCORnet (the National Patient-Centered Clinical Research Network) created a Common Data Model (CDM) for all 33 of its networks. The CDM organizes data into a standard structure so that all PaTH sites record data the same way. For instance, every occurrence of a patient being sent to a nursing home would be coded as ‘NH’.
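As a rough illustration of this kind of standardization, the sketch below maps the two local discharge labels quoted earlier onto the shared ‘NH’ code. The two labels and ‘NH’ come from the article; the dictionary layout, the function names, and the ‘UNMAPPED’ fallback are assumptions made for the example, not PaTH’s actual code.

```python
# Hypothetical lookup from site-specific discharge labels to a shared code.
# Only the two labels and 'NH' appear in the article; everything else is
# an illustrative assumption.
LOCAL_TO_CDM = {
    "03 - SKILLED NURS FACIL": "NH",
    "64 - DSC NUR FAC MA APPR": "NH",
}


def normalize_label(label: str) -> str:
    """Unify dash characters and collapse stray whitespace before lookup."""
    cleaned = label.replace("\u2013", "-").strip()
    return " ".join(cleaned.split())


def to_cdm_discharge(label: str, default: str = "UNMAPPED") -> str:
    """Return the shared code for a local label, or a placeholder if unknown."""
    return LOCAL_TO_CDM.get(normalize_label(label), default)
```

Returning a visible placeholder for unknown labels, rather than guessing, keeps unmapped values easy to find and review later.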

CDM 3.0, the latest version of the CDM, allows sites to use data across organizations without having to standardize information with each question, simplifying the process for both immediate and repeated use of the data. It also allows for much quicker responses to research-related questions because the sites don’t have to transform and prepare data every time they want to ask a new question. This standardization increases the efficiency of analyzing, sharing, and comparing data across PaTH sites.

Converting all PaTH sites’ current data into one consistent format – known as Data Transformation or Data Harmonization – was a huge undertaking for PaTH’s informatics team across the network. Fortunately, the team was able to develop a structured pipeline of work, says Senior Project Manager Nickie Cappella. The pipeline starts with conversion scripts that transform data from the local clinical databases into the form PaTH sites use to share data with each other (known as Informatics for Integrating Biology & the Bedside, or i2b2), and then into CDM 3.0 specifications.

“PaTH has the option of loading additional ‘non-CDM’ data elements into i2b2, which allows PaTH researchers to leverage PaTH-specific and PCORnet data elements. We then write and use data extraction routines to extract, transform, and load data from our i2b2 system into PCORnet CDM-compliant table structures,” Cappella explains.

Finally, these databases, which use different database management systems, are converted into a format that can be queried by a single program called Statistical Analysis System (SAS).

Cappella says centralizing these scripts avoids redundant conversion programming and ensures a consistent CDM loading process across all PaTH sites.
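To make the idea of a centralized script concrete, here is a minimal sketch in which one shared loading routine is reused for every site and only the code mapping differs per site. The table names (`staged_visits`, `cdm_encounter`), the column names, the site identifiers, and the ‘UNMAPPED’ placeholder are all hypothetical simplifications; they are not the real i2b2 or PCORnet CDM schemas, and the article does not describe PaTH’s scripts at this level of detail.

```python
import sqlite3

UNMAPPED = "UNMAPPED"  # hypothetical placeholder for codes with no mapping


def load_encounters(conn, site, code_map):
    """Translate one site's staged visits into the shared CDM-style table."""
    staged = conn.execute(
        "SELECT patient_id, visit_id, local_discharge "
        "FROM staged_visits WHERE site = ?",
        (site,),
    ).fetchall()
    converted = [
        (patient_id, visit_id, code_map.get(local_code, UNMAPPED))
        for patient_id, visit_id, local_code in staged
    ]
    conn.executemany(
        "INSERT INTO cdm_encounter (patid, encounterid, discharge_status) "
        "VALUES (?, ?, ?)",
        converted,
    )
    conn.commit()
    return len(converted)


def build_demo():
    """Build an in-memory database and run the shared loader for two sites."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE staged_visits "
        "(site TEXT, patient_id TEXT, visit_id TEXT, local_discharge TEXT)"
    )
    conn.execute(
        "CREATE TABLE cdm_encounter "
        "(patid TEXT, encounterid TEXT, discharge_status TEXT)"
    )
    conn.executemany(
        "INSERT INTO staged_visits VALUES (?, ?, ?, ?)",
        [
            ("site_a", "p1", "v1", "03 - SKILLED NURS FACIL"),
            ("site_b", "p2", "v2", "64 - DSC NUR FAC MA APPR"),
        ],
    )
    # Each site supplies only its own mapping; the loader itself is shared.
    site_maps = {
        "site_a": {"03 - SKILLED NURS FACIL": "NH"},
        "site_b": {"64 - DSC NUR FAC MA APPR": "NH"},
    }
    for site, code_map in site_maps.items():
        load_encounters(conn, site, code_map)
    return conn
```

Because the conversion logic lives in one function, a fix or a CDM change is made once and applies to every site the next time the loader runs.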

The informatics team successfully implemented CDM 3.0 at all PaTH institutions in time to meet PCORnet’s CDM submission deadline.

The team has since been focused on Data Characterization (DC). During DC, the team reran reports over the collected data. They then reran DC for all sites, and completed and summarized the data for the diagnostic and DC queries. These queries reviewed the data to confirm that PaTH was accurately populating the tables in CDM 3.0. Through these queries, three of PaTH’s DataMarts (groupings of data similar to a database) were approved for Prep-to-research (PTR) or research. The remaining four DataMarts are pending approval.
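The article does not say how the DC queries are implemented, so the following is only a sketch of the general idea: tally the values found in one field and flag any that fall outside an expected set. The function name, the allowed set, and the sample values are assumptions chosen for illustration.

```python
from collections import Counter


def characterize(values, allowed):
    """Summarize one field: total rows, counts per value, unexpected values."""
    counts = Counter(values)
    unexpected = {v: n for v, n in counts.items() if v not in allowed}
    return {
        "total": sum(counts.values()),
        "counts": dict(counts),
        "unexpected": unexpected,
    }


# Hypothetical sample: one local label slipped through untransformed.
report = characterize(
    ["NH", "NH", "HO", "64 - DSC NUR FAC MA APPR"],
    allowed={"NH", "HO"},
)
```

An empty `unexpected` entry would suggest the field is populated only with expected codes; anything listed there points to data that still needs mapping.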

Coming up, the informatics team plans to execute 50 pre-research or research queries over the next year and will analyze the studies to determine if PaTH data needs to be more robust.


Copyright 2016 | PaTH Network