A data-driven approach for exploiting enzyme promiscuity as a means to predict novel biochemical reactions

Conference Dates

September 15-19, 2019


Systems metabolic engineering has been widely used to produce chemicals of high commercial value from low-cost substrates. However, some applications, such as harnessing lignocellulosic biomass for biofuel and biochemical production, remain challenging because of our limited metabolic knowledgebase. With current advances in protein engineering, it is possible to exploit the substrate promiscuity of enzymes to enable novel biochemical reactions. Nevertheless, determining experimentally which substrates an enzyme can act on is time-consuming, and it is not always clear which potential substrates to test. The current work therefore aims to employ machine learning approaches to identify novel substrates and, in turn, to predict novel reactions that are more promising than putative reactions predicted from compound similarity measures alone (e.g., the Tanimoto coefficient). A machine learning model with an accuracy of up to 88.3% was developed to identify candidate substrates for alcohol dehydrogenase (ADH) using a dataset of 23 metabolites (8 of them known positives) and 46 cheminformatics-based molecular descriptors (e.g., topology, stereochemistry, and electronic features). In addition, support vector regression proved useful for estimating enzyme kinetics (characterized by the Michaelis-Menten parameters Km and Vmax) for a variety of oxidoreductases typically found in biofuel biosynthesis pathways. Such machine learning methods can be applied to other classes of enzymes and hence used as a tool to expand the knowledgebase of metabolic reactions, paving the way for the next generation of metabolic/pathway engineering.
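For context on the similarity-only baseline mentioned above, the following is a minimal sketch of ranking candidate compounds by Tanimoto coefficient against a known substrate. It is not the authors' pipeline: the fingerprints are hypothetical sets of "on" bit indices, the names `known_substrate` and `candidate_A` to `candidate_C` are placeholders, and returning 0.0 for two empty fingerprints is an assumed convention.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient between two sets of 'on' fingerprint bits:
    |A & B| / (|A| + |B| - |A & B|)."""
    if not a and not b:
        return 0.0  # assumed convention for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


# Hypothetical fingerprints: indices of structural features present in each compound.
known_substrate = {1, 4, 7, 9, 12}
candidates = {
    "candidate_A": {1, 4, 7, 9, 13},
    "candidate_B": {2, 4, 8},
    "candidate_C": {1, 4, 7, 9, 12},
}

# Rank candidates from most to least similar to the known substrate.
ranked = sorted(
    ((name, tanimoto(known_substrate, bits)) for name, bits in candidates.items()),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

A ranking of this kind uses structural overlap only. The abstract's argument is that a model trained on molecular descriptors and known positives can identify more promising candidates than such a similarity score alone.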
