Finding the right molecule - knowledge-driven enzyme discovery

Conference Dates

September 24-28, 2017


Activity-based screening of microbial strain or expression libraries from metagenomic sampling is still the strongest approach for discovering unknown enzyme functionalities. However, the abundance of genomic and protein sequences suggests easy access to new, uncharacterized enzymes of known function that have the physical properties required for industrial use, simply by extracting the right sequence from databases, ordering the gene and expressing it for application.

Unfortunately, this sequence-based enzyme discovery is hampered by two key factors:

  • The functional classification is not of a quality that guarantees a correct functional prediction (see e.g. Furnham N et al., 2009; Walsh JR et al., 2016)
  • As of today it is almost impossible to unambiguously predict physical parameters of enzyme function based on amino acid sequence alone

As a consequence, numerous tools have to be combined to arrive at least at a correct functional prediction (Alderson RG et al., 2012), using the resources that the scientific community has developed.

Resources for sequence-based enzyme discovery range from (meta-)genomic databases, a plethora of genome data and amino acid sequence databases (Schomburg and Schomburg, 2010), via the RCSB protein structure database (Rose PW et al., 2017), to enzyme property databases like BRENDA (Chang A et al., 2015). A huge set of bioinformatics tools enables access to these databases and the evaluation and classification of search results (e.g. Kuipers RK et al., 2010).

A streamlined enzyme discovery process is most likely achieved by setting it up as a multidisciplinary effort. Alderson et al. remark that biologists and chemists have different views, interests and interpretations of enzyme data and functionality. That professional bias in looking at molecules offers a chance to develop methods for speeding up the search for new enzymes.

In the presentation, we will use examples to discuss how various databases and bioinformatics tools can be exploited to enhance sequence-based enzyme discovery processes. We will also look at the applicability of these technologies to protein engineering projects, which have evolved into an integral part of enzyme discovery and development.


Alderson RG et al. (2012), Curr Top Med Chem 12(17): 1911-1923

Chang A et al. (2015), Nucleic Acids Res 43(D1): D439-D446

Furnham N et al. (2009), Nat Chem Biol 5: 521-525

Kuipers RK et al. (2010), Proteins 78(9): 2101-2113

Rose PW et al. (2017), Nucleic Acids Res 45(D1): D271-D281

Schomburg D and Schomburg I (2010), Methods Mol Biol 609: 113-128

Walsh JR et al. (2016), BMC Syst Biol 10: 129
