Funding Source
Project Period
Principal Investigator
Other Project Staff
Project Summary
The opioid crisis is one of the most pressing public health issues in the US today. It is estimated that more than 130 people die from opioid-related overdoses every day. Consequently, there is great interest among public health researchers in using machine learning technologies to predict opioid-related harms. National survey and surveillance data on opioid use disorder (OUD) and overdoses are a promising source of data for predicting the dynamics of the opioid epidemic. However, this task faces three critical challenges. First, survey and surveillance data often lag several months or years behind, hindering the development of accurate projections of the fast-evolving opioid crisis. Second, traditional machine learning methods produce single point predictions that give no indication of their level of certainty. Third, prejudices against people with OUD may have affected the representation of vulnerable populations in the data, making it harder to create fair projections of the opioid epidemic.
A potential approach to handling the data delay challenge is to project future OUD and overdoses. However, these projections may quickly become obsolete due to the rapidly evolving nature of the opioid crisis. Machine learning approaches must therefore adjust their projections efficiently as new data arrive, without requiring new model development. One way to address the lack of transparency in prediction confidence is to create bounds for projections with widths proportional to the prediction uncertainty (i.e., low prediction confidence will lead to wider intervals). A promising approach to tackling the biases inherent in historical data is to incorporate fairness enhancements that ensure equal error rates across different populations. This notion of fairness attempts to remove any signs of discrimination or favoritism towards populations based on their intrinsic or acquired traits.
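To illustrate the idea of uncertainty-based bounds, the following is a minimal sketch of split conformal prediction, one common way to build such intervals. It is not the project's actual method: the function names, the use of absolute residuals as the error score, and the toy inputs are all assumptions made for illustration. In this simple variant the interval half-width is a quantile of the errors observed on held-out calibration data, so a model that has been less accurate recently yields wider bounds.

```python
import math


def conformal_quantile(residuals, alpha):
    """Finite-sample conformal quantile of the calibration residuals.

    Returns the ceil((n + 1) * (1 - alpha))-th smallest residual, or
    infinity when there are too few calibration points for this alpha.
    """
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")
    return sorted(residuals)[k - 1]


def split_conformal_interval(predict, calib_x, calib_y, new_x, alpha=0.1):
    """Interval around predict(new_x) targeting (1 - alpha) coverage.

    `predict` is any already-fitted point predictor (hypothetical here);
    calib_x / calib_y are held-out data not used to fit it.
    """
    residuals = [abs(y - predict(x)) for x, y in zip(calib_x, calib_y)]
    half_width = conformal_quantile(residuals, alpha)
    center = predict(new_x)
    return center - half_width, center + half_width
```

As new surveillance data are released, the calibration lists can be extended or replaced and the interval recomputed without refitting `predict`, which is one simple way the updating and uncertainty ideas can be combined.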
This pilot project aims to develop a machine learning framework that leverages these ideas to address the delay, lack of transparency, and bias challenges in predicting the dynamics of the opioid epidemic. Specifically, our framework will incorporate: (1) online learning (i.e., predictions of OUD and overdoses that are adjusted as new data become available); (2) conformal inference (i.e., prediction sets that contain the ground truth with specific degrees of certainty); and (3) algorithmic fairness (i.e., equal error rates across demographic groups). The overall goal of our pilot project is to identify the geographical areas and demographic groups with the largest expected burden of opioid-related morbidity and mortality across the US. We expect our work to provide timely and reliable warnings about the future burden of OUD and overdoses by addressing the data delays faced by the opioid research community.
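The "equal error rates across demographic groups" criterion can be made concrete with a small audit. The sketch below is an assumption about how such a check might look, not the framework's implementation: the function names and the group labels are hypothetical. It computes the error rate within each group and the largest gap between groups, which would be zero under exact equality.

```python
from collections import defaultdict


def error_rate_by_group(y_true, y_pred, groups):
    """Fraction of incorrect predictions within each group label."""
    errors = defaultdict(int)
    counts = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        counts[group] += 1
        errors[group] += int(truth != pred)
    return {group: errors[group] / counts[group] for group in counts}


def max_error_rate_gap(rates):
    """Largest difference in error rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Toy example with two hypothetical groups, "a" and "b".
rates = error_rate_by_group(
    y_true=[1, 0, 1, 0],
    y_pred=[1, 1, 1, 0],
    groups=["a", "a", "b", "b"],
)
gap = max_error_rate_gap(rates)
```

A fairness enhancement would then aim to keep this gap small, for example by penalizing it during training or by calibrating each group separately.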