Cross JL, Choma MA, Onofrey JA. Bias in medical AI: Implications for clinical decision-making. PLOS Digit Health. 2024;3(11):e0000651. doi:10.1371/journal.pdig.0000651
This article outlines known sources of bias across the medical AI development pipeline. While the article highlights the breadth of bias in medical AI, it also provides guidance on minimizing that bias when developing, implementing, or using AI in a medical setting. Bias can enter early in development through the training data sets: imbalanced sample sizes, nonrandomly missing patient data, data that are not usually or easily captured, biases in data labels and misclassification, and the use of race and ethnicity in clinical algorithms. Imbalanced sample sizes can be mitigated by characterizing the sociodemographic makeup of the patient data set, applying analytical strategies that compensate for imbalanced data, and building larger, more diverse data sets. Similarly, nonrandomly missing patient data may be addressed through improved data collection methods, better imputation techniques, and better record-linkage algorithms. For difficult-to-capture data, incorporating external data sources, applying methods for processing unstructured clinical text, and standardizing the questionnaires used across medical facilities may reduce bias. Bias in data labels and misclassification may be minimized through expert consensus on labels and through cultural competency training for those who assign them. Race and ethnicity are poor proxies for genetic variation, and their use as predictor variables in clinical algorithms should be minimized.
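As a hedged illustration of two of the data-stage mitigations summarized above, the following Python sketch pairs simple imputation for missing values with class weighting for imbalanced labels. The file name cohort.csv, the outcome column, and the choice of logistic regression are hypothetical placeholders for illustration, not details drawn from the article.

```python
# Minimal sketch of two data-stage mitigations: imputing missing values
# and compensating for class imbalance. The data file, column names, and
# model choice are hypothetical placeholders (assumes numeric features).
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("cohort.csv")      # hypothetical patient cohort
X = df.drop(columns=["outcome"])
y = df["outcome"]                   # binary label, minority class rare

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.2, random_state=0
)

model = make_pipeline(
    # Median imputation is a simple baseline; note it assumes values are
    # missing at random, which the article warns is often not the case.
    SimpleImputer(strategy="median"),
    # class_weight="balanced" reweights samples inversely to class
    # frequency so the minority class is not ignored during fitting.
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
model.fit(X_train, y_train)
```

Resampling approaches are an alternative to class weighting; either way, such compensation is a stopgap for the article's preferred remedy of collecting larger, more diverse cohorts.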
In model development and evaluation, overreliance on whole-cohort performance metrics promotes bias; subgroup analyses (illustrated in the sketch after this paragraph), explicit debiasing methods, and clear model interpretability methods should be used instead. At the publication stage, bias can stem from the outlets and authors themselves, and from publication bias toward the medical domains in which AI research tends to produce positive results. Multidisciplinary collaborations between data scientists and clinicians are strongly encouraged, as are international collaborations and data sharing. With model implementation, both sample selection bias and end-user bias should be monitored; ongoing monitoring systems that detect and explain bias in deployed models, along with guidelines for reporting model bias, should be developed. Clinicians should also be trained to apply AI models consistently across patients, rather than letting their own biases determine which patients they consider appropriate candidates. Overall, AI has a place in the medical system and in novel digital therapeutics; bias is present in these models, and researchers have an ethical duty to eliminate as much of it as possible.
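To make the subgroup-analysis recommendation concrete, here is a minimal sketch that scores a fitted classifier per demographic subgroup rather than on the whole cohort alone. It reuses the hypothetical model, X_test, and y_test from the earlier sketch, and the sex grouping column is likewise an assumption for illustration.

```python
# Minimal sketch: report AUROC per subgroup, not just for the whole
# cohort. `model`, `X_test`, `y_test`, and the "sex" column are
# hypothetical placeholders carried over from the previous sketch.
import pandas as pd
from sklearn.metrics import roc_auc_score

# Predicted probabilities, indexed like the test set so they can be
# sliced per subgroup.
scores = pd.Series(model.predict_proba(X_test)[:, 1], index=X_test.index)

print(f"whole cohort: AUROC={roc_auc_score(y_test, scores):.3f}")

# A large gap between a subgroup's score and the whole-cohort score is
# exactly the disparity that whole-cohort metrics hide (assumes both
# outcome classes occur in each subgroup).
for group, idx in X_test.groupby("sex").groups.items():
    auc = roc_auc_score(y_test.loc[idx], scores.loc[idx])
    print(f"sex={group}: AUROC={auc:.3f} (n={len(idx)})")
```

The same pattern extends to calibration and error-rate metrics, and to the ongoing post-deployment monitoring the article calls for, by recomputing subgroup metrics on production predictions over time.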