2014 AIChE Annual Meeting
(223an) Graph-Based Evolutionary Algorithm for De Novo Molecular Design Under Multi-Dimensional Constraints
Authors
Computer-aided molecular design has greatly influenced the speed and cost with which novel chemicals with desired attributes are identified. As such, great effort has been invested in new methodologies that allow larger and more complex problems of this kind to be solved. Evolutionary algorithms are one such technique, and they have shown promise for large, combinatorial, and highly non-linear molecular design problems. The typical approach begins with a population of randomly generated individuals; the fitter members are stochastically selected to undergo computational analogues of natural recombination and mutation. This process is iterated until the resulting population possesses the desired attributes, which can be evaluated with existing quantitative structure-property (activity) relationships (QSPRs) or other types of property models.
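The loop described above can be sketched in a few lines. This is a generic illustration, not the algorithm presented in this work: it assumes bit-string individuals, tournament selection, one-point crossover, and elitism, and the names `evolve`, `fitness`, and `target` are hypothetical. In a molecular design setting the `fitness` callable would be replaced by a property model such as a QSPR.

```python
import random

def evolve(fitness, genome_len, pop_size=30, generations=50,
           mutation_rate=0.05, target=None, seed=0):
    """Toy evolutionary loop over bit strings (illustrative assumption only)."""
    rng = random.Random(seed)
    # Start from a population of randomly generated individuals.
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]

    def tournament():
        # Stochastic selection: the fitter of two random members wins.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        best = max(pop, key=fitness)
        # Stop early once the desired attribute level is reached.
        if target is not None and fitness(best) >= target:
            break
        next_pop = [best[:]]  # elitism: carry the best member forward
        while len(next_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randint(1, genome_len - 1)   # one-point recombination
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

# Usage: maximize the number of ones in a 20-bit string.
best = evolve(sum, genome_len=20)
print(best, sum(best))
```

Because the best member is copied unchanged into each new generation, the best fitness in the population never decreases from one iteration to the next.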
Many molecular properties or attributes have been shown to be best characterized by a combination of descriptors of varying dimensionality. Such a combination could include, for example, graph-theoretical two-dimensional descriptors, such as the chi connectivity index, together with three-dimensional descriptors that capture important spatial characteristics. The inverse solution of property models of this kind, which entails identifying candidate molecular structures with the desired properties as defined by the given model, is often highly non-linear. In addition, the use of molecular fragments, as often practiced in de novo design, can lead to a combinatorially large search space that is intractable for exhaustive solution techniques. Evolutionary algorithms provide a powerful method for solving molecular design problems of this type, in which there are often multiple objective constraints with high computational complexity and a large search space.
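As an illustration of a graph-theoretical two-dimensional descriptor, the sketch below computes the simple first-order chi (Randić) connectivity index, which sums (δ_i δ_j)^(-1/2) over all bonds, where δ is the number of heavy-atom neighbors of each atom. The abstract does not state which order or variant of the chi index is used, so this choice is an assumption, and the function name `chi1` and the edge-list input are hypothetical.

```python
import math
from collections import Counter

def chi1(bonds):
    """First-order chi connectivity index of a hydrogen-suppressed graph.

    `bonds` is a list of (i, j) pairs of heavy-atom indices.
    """
    # Vertex degree = number of bonds each atom takes part in.
    degree = Counter()
    for i, j in bonds:
        degree[i] += 1
        degree[j] += 1
    # Sum 1 / sqrt(delta_i * delta_j) over every bond.
    return sum(1.0 / math.sqrt(degree[i] * degree[j]) for i, j in bonds)

# n-butane (chain C-C-C-C) and isobutane (central carbon with three neighbors).
n_butane = [(0, 1), (1, 2), (2, 3)]
isobutane = [(0, 1), (0, 2), (0, 3)]
print(round(chi1(n_butane), 3))   # 1.914
print(round(chi1(isobutane), 3))  # 1.732
```

The two isomers have the same formula but different index values, which is why descriptors of this kind can distinguish branching patterns that a simple atom count cannot.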
The approach presented here uses the Signature descriptor, a fragment-based descriptor represented as a molecular graph, to supply the building blocks from which candidate solutions are generated. The nature of Signature descriptors allows control over the desired chemical search space as well as convenient reconstruction of the resulting molecular graphs. This descriptor has proven useful for problems with topological constraints and has recently been extended to problems of higher dimensionality, including three-dimensional as well as four-dimensional (conformational ensemble) constraints. The graph-based operators necessary for such an approach will be outlined and illustrated through a case study that highlights the advantages of this algorithm.
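To make the idea of fragment-style building blocks concrete, the sketch below computes a simplified height-1 signature for each atom (the atom's element followed by the sorted elements of its bonded neighbors) and tallies them into a molecule-level count. It is only a rough approximation of the Signature descriptor under stated assumptions: hydrogens are suppressed, bond orders are ignored, only height 1 is handled, and the string format and the names `atomic_signatures` and `molecular_signature` are hypothetical.

```python
from collections import Counter

def atomic_signatures(atoms, bonds):
    """Return one height-1 signature string per atom.

    `atoms` is a list of element symbols; `bonds` is a list of (i, j)
    index pairs. Hydrogens and bond orders are ignored (assumption).
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(atoms[j])
        neighbors[j].append(atoms[i])
    # Sorting the neighbor symbols makes the string independent of bond order
    # in the input list.
    return [f"{atoms[i]}({''.join(sorted(neighbors[i]))})"
            for i in range(len(atoms))]

def molecular_signature(atoms, bonds):
    """Count how often each atomic signature occurs in the molecule."""
    return Counter(atomic_signatures(atoms, bonds))

# Ethanol heavy-atom graph: C-C-O
print(molecular_signature(["C", "C", "O"], [(0, 1), (1, 2)]))
# Propane heavy-atom graph: C-C-C
print(molecular_signature(["C", "C", "C"], [(0, 1), (1, 2)]))
```

Restricting which atomic signatures are allowed is one way such a representation could bound the chemical search space, since any candidate graph must be assembled from that fixed set of local environments.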
