Justin Starren's research while affiliated with University of Illinois at Chicago and other places

Publications (170)


The Prevalence of Post-Acute Sequelae of COVID-19 in Solid Organ Transplant Recipients: Evaluation of Risk in the National COVID Cohort Collaborative (N3C)
  • Article

June 2024 · 15 Reads · American Journal of Transplantation

Amanda J. Vinson · Makayla Schissel · [...] · Xiaohan Tanner Zhang


Northwestern University resource and education development initiatives to advance collaborative artificial intelligence across the learning health system
  • Article
  • Full-text available

April 2024 · 134 Reads · Learning Health Systems

Introduction The rapid development of artificial intelligence (AI) in healthcare has exposed the unmet need for growing a multidisciplinary workforce that can collaborate effectively in the learning health systems. Maximizing the synergy among multiple teams is critical for Collaborative AI in Healthcare. Methods We have developed a series of data, tools, and educational resources for cultivating the next generation of multidisciplinary workforce for Collaborative AI in Healthcare. We built bulk‐natural language processing pipelines to extract structured information from clinical notes and stored them in common data models. We developed multimodal AI/machine learning (ML) tools and tutorials to enrich the toolbox of the multidisciplinary workforce to analyze multimodal healthcare data. We have created a fertile ground to cross‐pollinate clinicians and AI scientists and train the next generation of AI health workforce to collaborate effectively. Results Our work has democratized access to unstructured health information, AI/ML tools and resources for healthcare, and collaborative education resources. From 2017 to 2022, this has enabled studies in multiple clinical specialties resulting in 68 peer‐reviewed publications. In 2022, our cross‐discipline efforts converged and institutionalized into the Center for Collaborative AI in Healthcare. Conclusions Our Collaborative AI in Healthcare initiatives have created valuable educational and practical resources. They have enabled more clinicians, scientists, and hospital administrators to successfully apply AI methods in their daily research and practice, develop closer collaborations, and advance the institution‐level learning health system.

Fig. 1 eMERGE sample processing workflow. Steps indicating where aliquots of DNA are taken from samples that are presented to the Clinical DNA Sequencing Laboratory for accession, to test via the Fluidigm 96-SNP panel assay. Data from the Fluidigm 96-SNP panel assay are compared with DNA sequence data from the DNA sequencing pipeline as a quality control step, ahead of the Automated Clinical Reporting step
Fig. 2 Scatter plot analysis of 96-SNP panel reveals sample contamination. Scatter plot analysis from vendor software, showing a normal DNA male sample (A) or a contaminated sample containing a mixture of male and female DNAs (B). Panels 1-3 SNPs on X chromosome; panels 4-6 SNPs on Y chromosome; panels 7-9 autosomal SNPs. Each panel shows the data from a single SNP, as compared to clusters from all other SNPs. Clusters are shown as either homozygous (red or green), or heterozygous (blue) positions. In panels B2, B3, and B7-9, single SNPs fall outside the expected clusters (arrows), resulting in erroneous calls or 'no-calls' from the software
Causes of sample sex discrepancy
Genetic sex validation for sample tracking in next-generation sequencing clinical testing

March 2024 · 78 Reads · BMC Research Notes

Objective Data from DNA genotyping via a 96-SNP panel in a study of 25,015 clinical samples were utilized for quality control and tracking of sample identity in a clinical sequencing network. The study aimed to demonstrate the value of both the precise SNP tracking and the utility of the panel for predicting the sex-by-genotype of the participants, to identify possible sample mix-ups. Results Precise SNP tracking showed no sample swap errors within the clinical testing laboratories. In contrast, when comparing predicted sex-by-genotype to the provided sex on the test requisition, we identified 110 inconsistencies from 25,015 clinical samples (0.44%) that had occurred during sample collection or accessioning. The genetic sex predictions were confirmed using additional SNP sites in the sequencing data or high-density genotyping arrays. It was determined that discrepancies resulted from clerical errors (49.09%), samples from transgender participants (3.64%), and stem cell or bone marrow transplant patients (7.27%), along with undetermined sample mix-ups (40%) in which the sample swaps occurred prior to arrival at genome centers; the exact cause of the events at the sampling sites that resulted in these mix-ups could not be determined.
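
The percentages in this abstract can be cross-checked against the reported totals. The short Python sketch below recomputes the overall discrepancy rate and the whole-sample counts implied by each cause's share. The per-cause counts are inferred here by rounding and are not stated in the abstract; the variable names are illustrative.

```python
# Figures quoted in the abstract above.
TOTAL_SAMPLES = 25_015
DISCREPANCIES = 110

# Reported share of the discrepancies by cause, in percent.
reported_pct = {
    "clerical error": 49.09,
    "transgender participant": 3.64,
    "stem cell or bone marrow transplant": 7.27,
    "undetermined mix-up": 40.0,
}

# Overall discrepancy rate as a percentage of all samples.
rate_pct = round(100 * DISCREPANCIES / TOTAL_SAMPLES, 2)

# Whole-sample counts implied by each percentage (inferred, not reported).
implied_counts = {
    cause: round(DISCREPANCIES * pct / 100) for cause, pct in reported_pct.items()
}

print(rate_pct)
print(implied_counts)
print(sum(implied_counts.values()))
```

If the implied counts sum to the reported number of discrepancies, the four categories account for every inconsistent sample without overlap.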


The Intersections of COVID-19, HIV, and Race/Ethnicity: Machine Learning Methods to Identify and Model Risk Factors for Severe COVID-19 in a Large U.S. National Dataset

February 2024 · 50 Reads · AIDS and Behavior

We investigate risk factors for severe COVID-19 in persons living with HIV (PWH), including among racialized PWH, using the U.S. population-sampled National COVID Cohort Collaborative (N3C) data released from January 1, 2020 to October 10, 2022. We defined severe COVID-19 as hospitalized with invasive mechanical ventilation, extracorporeal membrane oxygenation, discharge to hospice or death. We used machine learning methods to identify highly ranked, uncorrelated factors predicting severe COVID-19, and used multivariable logistic regression models to assess the associations of these variables with severe COVID-19 in several models, including race-stratified models. There were 3 241 627 individuals with incident COVID-19 cases and 81 549 (2.5%) with severe COVID-19, of which 17 445 incident COVID-19 and 1 020 (5.8%) severe cases were among PWH. The top highly ranked factors of severe COVID-19 were age, congestive heart failure (CHF), dementia, renal disease, sodium concentration, smoking status, and sex. Among PWH, age and sodium concentration were important predictors of COVID-19 severity, and the effect of sodium concentration was more pronounced in Hispanics (aOR 4.11 compared to aOR range: 1.47–1.88 for Black, White, and Other non-Hispanics). Dementia, CHF, and renal disease were associated with higher odds of severe COVID-19 among Black, Hispanic, and Other non-Hispanics PWH, respectively. Our findings suggest that the impact of factors, especially clinical comorbidities, predictive of severe COVID-19 among PWH varies by racialized groups, highlighting a need to account for race and comorbidity burden when assessing the risk of PWH developing severe COVID-19.


Figure 1. Computer-Adaptive Testing Event Loop. The survey begins with an assumption of an average T-score of 50, the general population norm. Based on the patient's response to the first question, the next item is selected to give maximal additional information. The cycle is repeated until the confidence in the result is sufficiently high (in other words, the standard error is sufficiently low) or the maximum number of questions is reached.
Figure 2. Project design architecture. The main software developed for this project was the custom survey management middleware (top left) housed within an existing Northwestern Memorial Healthcare enterprise service bus ("NMH Framework"). The NMPRO project also developed custom code for multiple aspects of Epic (blue box). NMPRO made use of the newly developed Assessment Center API (green). Patient CAT scores were stored within the NMH Research Database within the NMH Framework for access by the middleware and Epic (bottom).
Figure 3. Swimlane diagram of the initiation of a CAT survey.
Figure 5. Four PROMIS CATs are combined into one assessment, and scores are displayed simultaneously to a clinician in Epic's user interface.
Seamless Integration of Computer-Adaptive Patient Reported Outcomes into an Electronic Health Record

December 2023 · 19 Reads · 1 Citation · Applied Clinical Informatics

Background Patient-reported outcome (PRO) measures have become an essential component of quality measurement, quality improvement, and capturing the voice of the patient in clinical care. In 2004, the National Institutes of Health endorsed the importance of PROs by initiating the Patient-Reported Outcomes Measurement Information System (PROMIS), which leverages computer-adaptive tests (CATs) to reduce patient burden while maintaining measurement precision. Historically, PROMIS CATs have been used in a large number of research studies outside the electronic health record (EHR), but growing demand for clinical use of PROs requires creative information technology solutions for integration into the EHR. Objectives This paper describes the introduction of PROMIS CATs into the Epic Systems EHR at a large academic medical center using a tight integration; we describe the process of creating a secure, automatic connection between the application programming interface (API) which scores and selects CAT items and Epic. Methods The overarching strategy was to make CATs appear indistinguishable from conventional measures to clinical users, patients, and the EHR software itself. We implemented CATs in Epic without compromising patient data security by creating custom middleware software within the organization's existing middleware framework. This software communicated between the Assessment Center API for item selection and scoring and Epic for item presentation and results. The middleware software seamlessly administered CATs alongside fixed-length, conventional PROs while maintaining the display characteristics and functions of other Epic measures, including automatic display of PROMIS scores in the patient's chart. Pilot implementation revealed differing workflows for clinicians using the software. Results The middleware software was adopted in 27 clinics across the hospital system. In the first 2 years of hospital-wide implementation, 793 providers collected 70,446 PROs from patients using this system. Conclusion This project demonstrated the importance of regular communication across interdisciplinary teams in the design and development of clinical software. It also demonstrated that implementation relies on buy-in from clinical partners as they integrate new tools into their existing clinical workflow.
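
The event loop described in the Figure 1 caption above (start at a T-score of 50, pick the most informative next item, stop once the standard error is low enough or the question limit is reached) can be illustrated with a minimal Python sketch. This is not the Assessment Center API or the project's middleware: it assumes a simplified two-parameter logistic (2PL) model with yes/no items, scores locally on a grid, and uses hypothetical names (`run_cat`, `item_bank`, `answer`) and made-up item parameters.

```python
import math

GRID = [g / 10.0 for g in range(-40, 41)]  # ability grid from -4 to 4


def prob_endorse(theta, a, b):
    """2PL response function: probability of endorsing item (a, b)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = prob_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def estimate(responses):
    """Posterior mean and SD of theta under a standard normal prior."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # unnormalized prior
        for (a, b), endorsed in responses:
            p = prob_endorse(theta, a, b)
            w *= p if endorsed else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(item_bank, answer, max_items=12, se_target=0.3):
    """Adaptive loop; se_target is on the theta scale (SD units).

    Returns (T-score, standard error in T-score units, items administered).
    """
    remaining = list(item_bank)
    responses = []
    theta, se = estimate(responses)  # no data yet: population mean, T = 50
    while remaining and len(responses) < max_items and se > se_target:
        # Choose the unused item that is most informative at the current estimate.
        item = max(remaining, key=lambda ab: item_information(theta, *ab))
        remaining.remove(item)
        responses.append((item, answer(item)))
        theta, se = estimate(responses)
    # Convert to the T-score metric (mean 50, SD 10).
    return 50 + 10 * theta, 10 * se, len(responses)


# Hypothetical bank: 13 items with equal discrimination and difficulties -3..3.
bank = [(1.5, b / 2.0) for b in range(-6, 7)]
# Simulated respondent who endorses every item easier than theta = 1.0.
t_score, t_se, n_items = run_cat(bank, lambda item: 1.0 > item[1])
print(t_score, t_se, n_items)
```

In the deployment the abstract describes, item selection and scoring are delegated to the Assessment Center API, so a middleware loop would call that service where this sketch calls `item_information` and `estimate`.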


Development of a Social and Environmental Determinants of Health Informatics Maturity Model

December 2023 · 58 Reads · Journal of Clinical and Translational Science

Introduction Integrating social and environmental determinants of health (SEDoH) into enterprise-wide clinical workflows and decision-making is one of the most important and challenging aspects of improving health equity. We engaged domain experts to develop a SEDoH informatics maturity model (SIMM) to help guide organizations to address technical, operational, and policy gaps. Methods We established a core expert group consisting of developers, informaticists, and subject matter experts to identify different SIMM domains and define maturity levels. The candidate model (v0.9) was evaluated by 15 informaticists at a Center for Data to Health community meeting. After incorporating feedback, a second evaluation round for v1.0 collected feedback and self-assessments from 35 respondents from the National COVID Cohort Collaborative, the Center for Leading Innovation and Collaboration’s Informatics Enterprise Committee, and a publicly available online self-assessment tool. Results We developed a SIMM comprising seven maturity levels across five domains: data collection policies, data collection methods and technologies, technology platforms for analysis and visualization, analytics capacity, and operational and strategic impact. The evaluation demonstrated relatively high maturity in analytics and technological capacity, but more moderate maturity in operational and strategic impact among academic medical centers. Changes made to the tool in between rounds improved its ability to discriminate between intermediate maturity levels. Conclusion The SIMM can help organizations identify current gaps and next steps in improving SEDoH informatics. Improving the collection and use of SEDoH data is one important component of addressing health inequities.


ICD10-CM Codes Related to Reproductive Health Activities that are Illegal in some Jurisdictions.
A Privacy Nihilist’s Perspective on Clinical Data Sharing: Open Clinical Data Sharing is Dead, Long Live the Walled Garden

November 2023 · 59 Reads · Journal of the Society for Clinical Data Management

Clinical data sharing combined with deep learning, and soon quantum computing, has the potential to radically accelerate research, improve healthcare, and lower costs. Unfortunately, those tools also make it much easier to use the data in ways that can harm patients. This article will argue that the vast amounts of data collected by data brokers, combined with advances in computing, have made reidentification a serious risk for any clinical data that is shared openly. The new NIH data sharing policy acknowledges this new reality by directing researchers to consider controlled access for any individual-level data. The clinical data sharing community will be well-advised to follow the lead of the physics and astronomy communities and create a “walled garden” approach to data sharing. While the investment will be significant, this approach provides the optimal combination of both access and privacy.


FIGURE 2. Analysis of nonelective coronary artery bypass grafting outcomes during the Coronavirus disease 2019 pandemic. CABG, Coronary artery bypass grafting.
Non-elective Coronary Artery Bypass Graft Outcomes are Adversely Impacted by COVID-19 Infection, but not Altered Processes of Care: An N3C and NSQIP Analysis

September 2023 · 66 Reads · JTCVS Open

Objective The effects of Coronavirus disease 2019 (COVID-19) infection and altered processes of care on nonelective coronary artery bypass grafting (CABG) outcomes remain unknown. We hypothesized that patients with COVID-19 infection would have longer hospital lengths of stay and greater mortality compared with COVID-negative patients, but that these outcomes would not differ between COVID-negative and pre-COVID controls. Methods The National COVID Cohort Collaborative 2020-2022 was queried for adult patients undergoing CABG. Patients were divided into COVID-negative, COVID-active, and COVID-convalescent groups. Pre-COVID control patients were drawn from the National Surgical Quality Improvement Program database. Adjusted analysis of the 3 COVID groups was performed via generalized linear models. Results A total of 17,293 patients underwent nonelective CABG, including 16,252 COVID-negative, 127 COVID-active, 367 COVID-convalescent, and 2254 pre-COVID patients. Compared to pre-COVID patients, COVID-negative patients had no difference in mortality, whereas COVID-active patients experienced increased mortality. Mortality and pneumonia were higher in COVID-active patients compared to COVID-negative and COVID-convalescent patients. Adjusted analysis demonstrated that COVID-active patients had higher in-hospital mortality, 30- and 90-day mortality, and pneumonia compared to COVID-negative patients. COVID-convalescent patients had a shorter length of stay but a higher rate of renal impairment. Conclusions Traditional care processes were altered during the COVID-19 pandemic. Our data show that nonelective CABG in patients with active COVID-19 is associated with significantly increased rates of mortality and pneumonia. The equivalent mortality in COVID-negative and pre-COVID patients suggests that pandemic-associated changes in processes of care did not impact CABG outcomes. Additional research into optimal timing of CABG after COVID infection is warranted.


Figure 2. Overview of HIPPS. The inputs (A) of our composite algorithm are all individuals and the full set of OMOP concepts in N3C. From these, we identify pregnant persons (B) and identify pregnancy-specific concepts (C). Our HIPPS algorithm (D) is comprised of the Hierarchy-based Inference of Pregnancy (HIP) algorithm (E), Get gestational timing concepts (F), Pregnancy Progression Signature (PPS) Algorithm (G), Merge episodes (H), and Estimated Start Date (ESD) Algorithm (I). The output is a dataset with pregnancy-related data and enriched with COVID covariates. The following are the individual steps within each panel. (1) From the 12 million (M) patients in N3C, we identified 4M that were both female and of reproductive age (15-55 years). (2) Of these, we identified 633K possibly pregnant persons who matched at least one concept in an initial set of ultrasound and pregnancy outcome concepts from Matcho et al. 12 (3) To develop an enriched set of concepts specific for pregnancy, we then assessed concept frequency among the initial cohort of possibly pregnant persons and chose 1417 concepts that were present in at least 1000 individuals and were 10X (determined
Figure 5. HIPPS results. (A) Histogram of the number of outcome concepts per episode by outcome category. (B) Outcome concordance scores by outcome category. An outcome concordance score of 2 has an outcome within the expected term duration and is supported by both HIP and PPS. An outcome concordance score of 1 has an outcome within the expected term duration. An outcome concordance score of 0 does not have an outcome within the expected term duration. (C) Histogram of episodes with week-level resolution only (N = 563 471) by outcome category of recorded pregnancy lengths (start and end dates of records for pregnancy that occur within the EHR data) in weeks and (D) inferred pregnancy lengths (pregnancy start and end estimated using HIPPS) in weeks. (E) Histogram of episodes with week-level resolution only (N = 563 471) by outcome category. Number of outcome concepts were determined from the outcome date to 28 days after. (F) Proportion of episodes by outcome category and by start date precision level for baseline and Estimated Start Date Algorithm. The baseline method obtained the start dates using only the week-level or GW concepts within an episode without any removal of outliers and assigned precision based on the maximum start date difference between GW concepts.
Clinician validation of episodes and outcome categories
Clinician validation of start and end dates of both inferred and recorded pregnancy episodes
Who is pregnant? Defining real-world data-based pregnancy episodes in the National COVID Cohort Collaborative (N3C)

August 2023 · 112 Reads · 6 Citations · JAMIA Open

Objectives To define pregnancy episodes and estimate gestational age within electronic health record (EHR) data from the National COVID Cohort Collaborative (N3C). Materials and Methods We developed a comprehensive approach, named Hierarchy and rule-based pregnancy episode Inference integrated with Pregnancy Progression Signatures (HIPPS), and applied it to EHR data in the N3C (January 1, 2018–April 7, 2022). HIPPS combines: (1) an extension of a previously published pregnancy episode algorithm, (2) a novel algorithm to detect gestational age-specific signatures of a progressing pregnancy for further episode support, and (3) pregnancy start date inference. Clinicians performed validation of HIPPS on a subset of episodes. We then generated pregnancy cohorts based on gestational age precision and pregnancy outcomes for assessment of accuracy and comparison of COVID-19 and other characteristics. Results We identified 628 165 pregnant persons with 816 471 pregnancy episodes, of which 52.3% were live births, 24.4% were other outcomes (stillbirth, ectopic pregnancy, abortions), and 23.3% had unknown outcomes. Clinician validation agreed 98.8% with HIPPS-identified episodes. We were able to estimate start dates within 1 week of precision for 475 433 (58.2%) episodes. 62 540 (7.7%) episodes had incident COVID-19 during pregnancy. Discussion HIPPS provides measures of support for pregnancy-related variables such as gestational age and pregnancy outcomes based on N3C data. Gestational age precision allows researchers to find time to events with reasonable confidence. Conclusion We have developed a novel and robust approach for inferring pregnancy episodes and gestational age that addresses data inconsistency and missingness in EHR data.
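
The three-level outcome concordance score defined in the Figure 5 caption above can be written as a small decision rule. The Python sketch below is an illustration of that caption only, not the HIPPS implementation; the function and parameter names are hypothetical, and how each boolean is derived from the data is outside its scope.

```python
def outcome_concordance_score(within_expected_term: bool,
                              supported_by_hip: bool,
                              supported_by_pps: bool) -> int:
    """Score an episode's outcome as 0, 1, or 2 per the Figure 5 caption."""
    if not within_expected_term:
        # Outcome falls outside the expected term duration.
        return 0
    if supported_by_hip and supported_by_pps:
        # Outcome is in range and both algorithms support it.
        return 2
    # Outcome is in range but lacks support from both algorithms.
    return 1


print(outcome_concordance_score(True, True, True))
print(outcome_concordance_score(True, True, False))
print(outcome_concordance_score(False, True, True))
```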


Citations (74)


... The assurance that PGHD is complete, accurate, and auditable is a critical requirement for the analysis of health data. The importance of these characteristics increases as PGHD is being integrated with EHRs for clinical diagnosis, treatment, and decision support (Ye et al., 2024). The potential benefits of Blockchain and other decentralized ledger system technology include enhancements for secure, transparent, and resilient protection of transaction data to provide verifiable, auditable, and tamper-resistant medical records (Corte-Real, 2024). ...

Reference:

Survey of Information Security and Privacy for Patient-Generated Health Data
The role of artificial intelligence for the application of integrating electronic health records and patient-generated data in clinical decision support
  • Citing Article
  • May 2024

... Instead, more domain specificity is needed, which can, in fact, be obtained without substantially increasing the number of questions that patients must answer if a PROM built with modern measurement theory is selected. 25,26 Category 4: universal and domain-specific Universal and domain-specific PROMs are intended for all patients regardless of their specific condition by measuring aspects of health that are shared among them. The measurement capabilities of these PROMs allow them to overcome many of the shortcomings noted for the other 3 PROM types. ...

Seamless Integration of Computer-Adaptive Patient Reported Outcomes into an Electronic Health Record

Applied Clinical Informatics

... Our group has phenotyped pregnant people, their pregnancy episodes as well as gestational ageing and their COVID-19 outcomes in N3C. 23 With the largest harmonised EHR data in the USA, we focused on two interconnected questions for this study. First, we aimed to evaluate the extent to which COVID-19 vaccination could reduce the risk of incident and severe COVID-19 infections among pregnant people who were vaccinated versus unvaccinated. ...

Who is pregnant? Defining real-world data-based pregnancy episodes in the National COVID Cohort Collaborative (N3C)

JAMIA Open

... Bartlett's paper noted that there is a "decreased risk of pneumonia in COPD patients who use nebulized budesonide". "Secondary bacterial infection of the lung (pneumonia) was extremely common in patients with COVID-19", according to a report from Northwestern Medicine [3], citing data from an April 2023 study in The Journal of Clinical Investigation [4]. ...

Machine learning links unresolving secondary pneumonia to mortality in patients with severe pneumonia, including COVID-19

The Journal of clinical investigation

... Although studies in other research domains, such as clinical research, have proposed data standards (Richesson & Krischer, 2007;Richesson et al., 2023;Tsueng et al., 2023), there are no established standards for gathering, processing, analyzing, or presenting data in primary/secondary education (Hernández-Leal et al., 2021). In the study by Hernández-Leal et al. (2021), a methodological framework was presented 3 https://aws.amazon.com/rds/. ...

Developing a standardized but extendable framework to increase the findability of infectious disease datasets

Scientific Data

... Since the Dobbs ruling, there has been increased awareness surrounding how reproductive health data in both traditional health care settings and beyond can be used against patients. For example, reports have detailed the known limitations of major federal regulation, such as the Health Insurance Portability and Accountability Act (HIPAA), in protecting patients' reproductive health data in the context of patient care [12][13][14][15] and how the massive exchange of health care data across organizations and state lines could jeopardize patients who receive abortion care [13,16,17]. Additionally, outside traditional health care settings, reports have detailed how mobile apps collecting data about users' health, sexual behavior, and menstrual cycles have shared user data without the user's knowledge or consent [18] and how these data could be used to prosecute women who receive abortion care [19]. ...

Paging the Clinical Informatics Community: Respond STAT to Dobbs v Jackson’s Women’s Health Organization

Applied Clinical Informatics

... 4 Finally, our study shows that vaccination decreases the risk of developing adverse surgical outcomes in patients with postoperative COVID-19, suggesting that those included in our study did confer benefit following vaccination. 5 To conclude, the impact of undiagnosed SARS-CoV-2 and potential for variable vaccine immunity would not change the interpretation of the study findings. Vaccination against SARS-CoV-2 is an important tool for mitigating the risk of adverse postoperative events in patients who develop COVID-19 after surgery. ...

Vaccination Against SARS-CoV-2 Decreases Risk of Adverse Events in Patients who Develop COVID-19 Following Cancer Surgery

Annals of Surgical Oncology

... Additionally, performance of algorithms are challenging to interpret because of variation in whether the focus is on the detection of PD itself, PD with NPS, or parkinsonism in general (17). Further, algorithms developed in one system have rarely been tested using data across differing systems, and consensus algorithms have not yet emerged (24). ...

Characterizing variability of electronic health record-driven phenotype definitions
  • Citing Article
  • December 2022

Journal of the American Medical Informatics Association

... To adequately address these obstacles, the usage of EHR data requires careful data preprocessing, presumed to account for 80% of the effort when performing a typical analysis or model development 22 . A few EHR data preprocessing and analysis workflows have been previously developed 4,[23][24][25][26][27] , but none of them enable the analysis of heterogeneous data, provide in-depth documentation, are software packages or allow for exploratory visual analysis. Current EHR analysis pipelines therefore differ strongly in their approaches and are often commercial, vendor specific solutions 28 . ...

A machine learning approach identifies unresolving secondary pneumonia as a contributor to mortality in patients with severe pneumonia, including COVID-19

... The meropenem plasma assay methodology and results have been previously described (10). In this study, we utilized a similar method to quantify meropenem in BALF samples. ...

Individual target pharmacokinetic/pharmacodynamic attainment rates among meropenem-treated patients admitted to the ICU with hospital-acquired pneumonia
  • Citing Article
  • July 2022

Journal of Antimicrobial Chemotherapy