Few Shot Learning to Improve Deep Medical Model Transfer

Berezoski, Dayton Lee

Abstract

The advent of machine learning has changed the way healthcare providers care for their patients. Using data-driven models, healthcare workers can now anticipate and account for symptoms in patients earlier than was previously possible. One example is the ICU mortality prediction model, which takes the vital signs of an ICU patient as input and outputs a binary classification of whether the patient will survive the ICU. This model is extremely helpful for physicians: it provides a second opinion on their patients and lets them quickly identify the patients in need of immediate care. However, an ICU mortality prediction model is built from thousands of data points from past ICU patients, which makes it difficult for most health centers to train their own model on their own patients’ data. A natural way to make these models readily available is to train an ICU mortality prediction model at a hospital with the required resources and then give that model to any health center that wants it. However, researchers have found that because health centers often differ widely in the resources they have on hand and the types of patients they admit, model performance decreases after the model is transferred. This paper aims to address this decrease in transfer performance with a technique called few shot learning. Here, few shot learning is used to adapt an existing ICU mortality prediction model by retraining it on a small set of data from the receiving hospital’s patients. In practice, the experiment creates a general ICU mortality prediction model from the MIMIC-IV database, makes a deep copy, retrains that deep copy on a small dataset from the eICU database to produce the few shot learning model, and then evaluates both models on the rest of the eICU database.
The few shot learning model should show better discrimination after transfer than the general model, because its hypothesis space has been adjusted to better fit the eICU data it was retrained on. This would make the ICU mortality prediction model more accessible across the healthcare system and would have a major impact on the level of care physicians are able to give their patients. Testing the hypothesis showed a slight increase in the AUROC score for the few shot learning model. However, the increase was not statistically significant when compared to the original transfer techniques for prior mortality models. Further testing indicated that the few shot learning model learned so much from the eICU dataset it was additionally trained on that it lost much of the training it initially received from the MIMIC-IV database. This phenomenon, known as catastrophic forgetting, is likely the reason the increase in performance was not statistically significant. Future work could therefore look into solving the catastrophic forgetting problem, which may lead to a statistically significant improvement in the few shot learning model.
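
The procedure described above (train a general model, deep-copy it, retrain the copy on a small target-site sample, then evaluate both on the remaining target data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: it uses a plain logistic regression rather than a deep model, synthetic data in place of MIMIC-IV and eICU, and hypothetical names (`make_site`, `LogisticModel`, `auroc`) and hyperparameters chosen only for the example.

```python
import copy
import math
import random


def make_site(rng, n, effect):
    """Synthetic stand-in for one hospital (assumption, not real ICU data).

    Each row has two made-up 'vital sign' features; `effect` controls how
    strongly each feature separates the two outcome classes at this site.
    """
    rows = []
    for _ in range(n):
        y = 1 if rng.random() < 0.3 else 0
        x = [rng.gauss(y * effect[0], 1.0), rng.gauss(y * effect[1], 1.0)]
        rows.append((x, y))
    return rows


class LogisticModel:
    """Tiny logistic regression trained with SGD (stand-in for a deep model)."""

    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.b = 0.0

    def predict(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, rows, epochs, lr):
        for _ in range(epochs):
            for x, y in rows:
                err = self.predict(x) - y  # gradient of log loss w.r.t. z
                self.w = [wi - lr * err * xi for wi, xi in zip(self.w, x)]
                self.b -= lr * err


def auroc(model, rows):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [model.predict(x) for x, y in rows if y == 1]
    neg = [model.predict(x) for x, y in rows if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
source = make_site(rng, 2000, effect=(1.5, 0.2))  # plays the role of MIMIC-IV
target = make_site(rng, 500, effect=(0.2, 1.5))   # plays the role of eICU
few_shot, held_out = target[:20], target[20:]     # small retraining set, rest for evaluation

# 1. Train the general model on the source site only.
general = LogisticModel(2)
general.fit(source, epochs=5, lr=0.05)

# 2. Deep-copy it and retrain the copy on the few target-site examples.
adapted = copy.deepcopy(general)
adapted.fit(few_shot, epochs=20, lr=0.01)

# 3. Evaluate both models on the held-out target data.
print("target AUROC  general:", auroc(general, held_out))
print("target AUROC  adapted:", auroc(adapted, held_out))

# 4. Re-evaluate on the source data as a rough forgetting check.
print("source AUROC  general:", auroc(general, source))
print("source AUROC  adapted:", auroc(adapted, source))
```

Evaluating the adapted copy on the source data as well gives a rough check for the catastrophic forgetting described in the abstract: a large drop there relative to the general model suggests the copy has overwritten much of what it learned initially.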

URI

https://hdl.handle.net/1969.1/199651

Subject

Machine Learning
Computer Science
Healthcare

Collections

Undergraduate Research Scholars Capstone (2006–present)

Citation

Berezoski, Dayton Lee (2023). Few Shot Learning to Improve Deep Medical Model Transfer. Undergraduate Research Scholars Program. Available electronically from https://hdl.handle.net/1969.1/199651.