An Assessment of Supervised and Unsupervised Machine Learning Applications Toward Predicting Gulf of Mexico Coastal Hypoxia


Date

2022-07-22

Abstract

Observations of dissolved oxygen (DO), salinity, temperature, and six nutrient concentrations collected on the Texas-Louisiana (TXLA) Shelf from March through September of 2003 to 2014 were analyzed with unsupervised and supervised machine learning techniques to identify the processes driving hypoxia and to evaluate how well classification algorithms predict hypoxia on the shelf. Two unsupervised techniques, principal component analysis and K-means clustering, identified variability patterns associated with previously known drivers and processes of hypoxia in the region, such as vertical stratification of the water column and the Mississippi River plume. Eight classification algorithms (logistic regression, LDA, QDA, naive Bayes, KNN, SVM, decision tree, and random forest) were then compared on their ability to predict hypoxia from the TXLA Shelf observations. Naive Bayes performed best, classifying hypoxia with high recall and a low false positive rate. Balancing the class distribution in the training set of each algorithm significantly increased performance, indicating that classifier performance depended strongly on the input training data. This study establishes that straightforward machine learning techniques can help identify the known main drivers of hypoxia and their characteristics, and that those characteristics can be used to predict hypoxia on the TXLA Shelf. These techniques have the potential to determine the presence or absence of hypoxia in hydrographic data where DO is missing, and they can be a powerful tool for water quality and resource management in the region. Although the approaches presented here were developed specifically for the TXLA Shelf, the methodology is applicable to other coastal systems and locations with similar datasets.
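As a rough illustration of the supervised step summarized in the abstract, the sketch below balances a training set by undersampling the majority class, fits a Gaussian naive Bayes classifier, and reports recall and false positive rate. It is not the study's code. The synthetic salinity and temperature values, the 10% hypoxic fraction, the use of undersampling as the balancing method, and all function names are assumptions made only for this example.

```python
import math
import random
from collections import defaultdict


def make_synthetic(n, hypoxic_fraction, rng):
    """Hypothetical (salinity, temperature) samples; NOT the study's observations."""
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < hypoxic_fraction else 0
        salinity = rng.gauss(34.0 if label else 30.0, 1.5)
        temperature = rng.gauss(24.0 if label else 27.0, 1.5)
        X.append((salinity, temperature))
        y.append(label)
    return X, y


def group_by_class(X, y):
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    return by_class


def undersample(X, y, rng):
    """Balance classes by randomly keeping as many rows per class as the rarest class has."""
    by_class = group_by_class(X, y)
    n = min(len(rows) for rows in by_class.values())
    Xb, yb = [], []
    for label, rows in by_class.items():
        for row in rng.sample(rows, n):
            Xb.append(row)
            yb.append(label)
    return Xb, yb


def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-feature mean and variance for each class."""
    model = {}
    for label, rows in group_by_class(X, y).items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        variances = [
            sum((v - m) ** 2 for v in c) / len(c) + 1e-9  # small floor avoids division by zero
            for c, m in zip(cols, means)
        ]
        model[label] = (len(rows) / len(X), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior under feature independence."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for v, m, s2 in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return score
    return max(model, key=log_posterior)


def recall_and_fpr(y_true, y_pred):
    """Recall = TP / (TP + FN); false positive rate = FP / (FP + TN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return recall, fpr


rng = random.Random(0)
X_train, y_train = make_synthetic(2000, 0.1, rng)
X_test, y_test = make_synthetic(500, 0.1, rng)

# Balance only the training data; the test set keeps its natural class imbalance.
X_bal, y_bal = undersample(X_train, y_train, rng)
model = fit_gaussian_nb(X_bal, y_bal)
y_pred = [predict(model, row) for row in X_test]

recall, fpr = recall_and_fpr(y_test, y_pred)
print(f"recall={recall:.2f} false_positive_rate={fpr:.2f}")
```

Recall and false positive rate are reported together because hypoxic samples are the minority class, so overall accuracy alone can look high even when most hypoxic samples are missed.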

Keywords

Coastal Hypoxia, Gulf of Mexico, Machine Learning, Supervised Machine Learning, Unsupervised Machine Learning, Texas-Louisiana Shelf, Low Oxygen Conditions
