Rescuing Progression in Antibiotic Discovery by Increasing Machine Learning Compatibility of High-Throughput Screening
Abstract
Antibiotic discovery has stagnated, and it must accelerate to avoid catastrophe. High-throughput screening (HTS) is one of the most heavily used methods of drug discovery, yet in its 40 years of use, no antibiotic has come to market from this method. Recent advances in deep learning offer a potential solution to this problem. It has been demonstrated that with a clean yet relatively small training dataset, meaningful predictions can be made on large chemical libraries. However, relying on cherry-picked data with extremely confident ‘hits’ or ‘misses’ fails to represent the uncertainty of large real-world datasets. In this paper, I analyze the current state of HTS and propose a new workflow that is compatible with machine learning. The key to machine learning compatibility is determined to be the avoidance of false negatives. More specifically, reducing the ‘noise’ relative to the size of the dataset is most important for maximum compatibility. Furthermore, using the standard tool ChemProp, I find that the size of the dataset matters significantly: small datasets, even of strong data, still fail to be compatible with machine learning models.
Citation
Tevonian, Robert Jeffrey (2021). Rescuing Progression in Antibiotic Discovery by Increasing Machine Learning Compatibility of High-Throughput Screening. Undergraduate Research Scholars Program. Available electronically from https://hdl.handle.net/1969.1/194327.