2015 Synthetic Biology: Engineering, Evolution & Design (SEED)
Development and Experimental Validation of a Mechanistic Model of in vitro DNA Recombination*
Jack Bowyer, Jia Zhao, Susan Rosser, Sean Colloms, Declan Bates
I. INTRODUCTION
Synthetic Biology is a highly interdisciplinary field with the aim of establishing engineering protocols for the construction of synthetic biological circuits. One of the first synthetic biological devices to be built was the genetic toggle switch in E. coli [1]. Such transcriptional memory devices have paved the way for the creation of bi-stable genetic switches based on DNA recombination. The site-specific recombinases (SSRs) underlying these switches mediate distinct recombination events that give rise to two stable DNA states. Hence, DNA recombination has huge potential as a tool for DNA sequence assembly, with numerous potential applications including biological data storage [2], DNA-based counting systems [3] and the assembly of metabolic pathways [4].
Attempts to experimentally design and build synthetic systems using recombinases have thus far been hindered by a lack of validated computational models that capture the mechanistic basis of DNA recombination. The predictive capabilities of such models could be exploited by Synthetic Biologists to reduce the number of iterative cycles required to align experimental results with design performance requirements. Here, we develop and validate the first detailed mechanistic model of DNA recombination, with a focus on how efficiently recombination can occur, and the model features required to replicate and predict experimental data.
II. A MECHANISTIC MODEL OF DNA RECOMBINATION
Recombinases can enable precise DNA manipulation both in vitro and in vivo [8]. These enzymes, known as integrase and excisionase, catalyse two recombination events termed integration and excision. Integrase alone is sufficient to mediate the integration reaction between an attachment site encoded within the host chromosome, attB, and an attachment site on the phage chromosome, attP. The phage genome is inserted into the host chromosome and is flanked by the newly formed attachment sites attL and attR. The excision reaction is mediated in the presence of integrase and a recombination directionality factor (RDF) also known as excisionase, restoring the independent substrates as well as the original attB and attP sites (Figure 1). A pre-integration state consisting of attB and attP is referred to as the BP state, whereas a post-integration/pre-excision state containing attL and attR is the LR state. DNA recombination efficiency in switching between these two states is dependent on the concentrations of integrase and RDF in the system.
*Research supported by EPSRC via a DTA studentship to J. Bowyer.
J. Bowyer and D.G. Bates are with the Warwick Integrative Synthetic Biology Centre and School of Engineering, University of Warwick (e-mail: J.E.Bowyer@warwick.ac.uk, D.Bates@warwick.ac.uk).
J. Zhao and S. Colloms are with the Institute of Molecular, Cell and Systems Biology, University of Glasgow (e-mail: j.zhao.1@research.gla.ac.uk, Sean.Colloms@glasgow.ac.uk).
S. Rosser is with Synthetic and Systems Biology, School of Biological Sciences, University of Edinburgh (e-mail: Susan.Rosser@ed.ac.uk).
Figure 1. Schematic diagram of phage integration and excision. The phage genome attachment site, attP, is integrated into the host chromosome attachment site, attB. Integration gives rise to attL and attR, each formed of half of attB and attP. Excision restores attB and attP, removing the integrated phage genome from the host chromosome.
An extensive review of the experimental literature was carried out to synthesise current knowledge of the mechanistic basis of recombination (see [5-7] and references therein). Many structural features of the DNA recombination reaction network have widespread experimental support in the literature. Integrase forms dimers in solution [9], with one dimer bound to attB and attP necessary to mediate the integration reaction [10], which is unidirectional [11]. RDF does not bind directly to DNA attachment sites [12]. One integrase dimer and an additional RDF monomer bound to both attL and attR are necessary to mediate excision, restoring attB and attP [13]. However, we were unable to find consensus regarding three significant biological details:
• The directionality of the excision reaction.
• RDF dimerisation and subsequent tetramerisation in solution.
• Integrase monomer binding to DNA substrates.
By constructing a variety of models based on the well established properties of the reaction network that differed only according to the inclusion/exclusion of these three features, we were able to examine which model best matched our experimental data and hence investigate the most plausible representation of the true biological structure (Figure 2). We found that including unidirectional excision [14], monomeric RDF that neither dimerises nor tetramerises in solution [15] and integrase monomer binding [16] resulted in optimal data fitting and prediction capabilities in our model. These results were determined via global optimisation of the parameters for each candidate model against the data (see Results section). Early results revealed that our models consistently produced greater recombination efficiency than that observed in our data. Hence we included a plausible mechanism for the sequestration of integrase: the formation of a dysfunctional integrase dimer at a rate kintX (Figure 2, top).
Our modelling investigation focuses on DNA recombination mediated by φC31 serine integrase and its RDF, gp3, in vitro. The network consists of twenty-two distinct biological entities and twenty-eight reaction constants. The rate of change in concentration of each entity is governed by a corresponding ordinary differential equation (ODE) in the model. We apply mass action kinetics to the biological interactions in order to derive the ODEs that comprise the model. Analysing the efficiency with which the system switches from BP state to LR state involves determining the total register of the system in either state. This equates to summing each of the ODEs that govern the network entities in the DNA state of interest (see Table 1 for the full list of ODEs). Model simulations are therefore the numerical solutions to the following equations:
$$\frac{dD_{LRtot}}{dt} = k_R D_{BP}I_4 - k_R D_{LR}I_4R_4, \tag{1}$$

$$\frac{dD_{BPtot}}{dt} = -k_R D_{BP}I_4 + k_R D_{LR}I_4R_4, \tag{2}$$

where $D_{LRtot}$ and $D_{BPtot}$ represent the total concentration of DNA in the LR state and BP state respectively, $D_{BP}I_4$ represents the concentration of the protein:DNA complex consisting of four integrase monomers bound to DNA in the BP state, $D_{LR}I_4R_4$ represents the concentration of the protein:DNA complex consisting of four integrase monomers and four excisionase monomers bound to DNA in the LR state and $k_R$ represents the rate of recombination.
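As a brief consistency check on Equations (1) and (2), adding the two right-hand sides term by term shows that the total DNA concentration does not change over time. The symbols $D_{tot}$ and $E(t)$ below are notation introduced here for illustration only; in particular, writing recombination efficiency as the fraction of DNA in the LR state is an assumption, not a definition taken from the text.

```latex
\begin{align}
\frac{d}{dt}\left(D_{LRtot} + D_{BPtot}\right)
  &= \left(k_R D_{BP}I_4 - k_R D_{LR}I_4R_4\right)
   + \left(-k_R D_{BP}I_4 + k_R D_{LR}I_4R_4\right) = 0, \\
D_{LRtot}(t) + D_{BPtot}(t) &= D_{tot} \quad \text{(constant, set by the initial DNA concentration)}, \\
E(t) &= \frac{D_{LRtot}(t)}{D_{tot}} \quad \text{(assumed form of the efficiency)}.
\end{align}
```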
Experiments performed in vitro allow the initial concentrations of the recombinases and substrates to be quantified exactly and therefore provide the initial conditions with which to solve the system of ODEs. Hence, we do not require knowledge of integrase and excisionase expression levels, as was necessary for the simple model of in vivo recombination dynamics developed in [2].
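To illustrate how mass action kinetics and known initial concentrations turn a reaction scheme into a solvable initial value problem, the following minimal sketch simulates a small, hypothetical sub-network only: reversible formation of a functional integrase dimer and formation of a dysfunctional dimer. It is not the 22-ODE model. The rate constant names (`k_on`, `k_off`, `k_x`), their values, the assumption that sequestration is irreversible, and the fixed-step Runge-Kutta integrator are all choices made for this example.

```python
# Toy mass-action model (illustrative assumption, not the paper's network):
#   I + I <-> I2   (functional dimer, rate constants k_on / k_off)
#   I + I  -> I2X  (dysfunctional dimer, rate constant k_x, assumed irreversible)

def derivatives(state, k_on, k_off, k_x):
    """Mass-action rates of change for (I, I2, I2X)."""
    i, i2, i2x = state
    dimerise = k_on * i * i      # two monomers collide to form I2
    dissociate = k_off * i2      # I2 falls apart into two monomers
    sequester = k_x * i * i      # two monomers form the dysfunctional dimer
    d_i = -2.0 * dimerise + 2.0 * dissociate - 2.0 * sequester
    d_i2 = dimerise - dissociate
    d_i2x = sequester
    return (d_i, d_i2, d_i2x)


def rk4_step(f, state, dt, *params):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state, *params)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), *params)
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), *params)
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)), *params)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(initial_state, t_end, dt, *params):
    """Integrate from t = 0 to t_end starting from known initial concentrations."""
    state = tuple(initial_state)
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(derivatives, state, dt, *params)
    return state


if __name__ == "__main__":
    # Hypothetical values: 1.0 unit of monomeric integrase, no dimers at t = 0.
    final = simulate((1.0, 0.0, 0.0), 10.0, 0.001, 1.0, 0.5, 0.1)
    print("I, I2, I2X at t = 10:", final)
```

In this sketch the quantity I + 2·I2 + 2·I2X stays equal to the initial integrase concentration, which is a convenient sanity check on the derived equations.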
III. RESULTS - EXPERIMENTAL VALIDATION
We compared the ability of our model to replicate and predict in vitro experimental data with a simple mathematical model of DNA recombination previously developed in Bonnet et al. [2]. Using a subset of our steady state data for a variety of different initial concentrations of integrase and gp3, the parameters of both models were optimised using a Genetic Algorithm to minimise the difference between simulations and data values. The predictive power of both models was then assessed by evaluating their ability to match a different subset of experimental time course data. As the model of [2] was originally developed for in vivo data, we adapted it in order to simulate our in vitro system dynamics by removing the parameters representing recombinase expression levels and adding our sequestration mechanism. As shown in Figure 3A, this simple model is unable to provide a reasonable replication of the experimental data. The more detailed model, however, exhibits strong predictive power, accurately reproducing the experimental recombination efficiencies across all initial concentrations of integrase and gp3 (Figure 3B). Similar results were seen for time-course data (to be included in the full paper). The parameter space within which both models are optimised is large, since evidence regarding the numerical values of many reaction rate constants is currently lacking in the literature. Future work will focus on further investigation of model parameters to ensure their biological plausibility.
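The parameter optimisation step described above can be sketched with a very small genetic algorithm. The text does not specify the algorithm's settings or the cost function, so everything here is an assumption made for illustration: the two-parameter saturating `toy_model` stands in for a full model simulation, the cost is a sum of squared differences, and the operators (tournament selection, blend crossover, Gaussian mutation, elitism) are common textbook choices rather than the authors' configuration.

```python
import random


def toy_model(x, params):
    """Hypothetical stand-in for a model simulation: saturating response a*x/(b+x)."""
    a, b = params
    return a * x / (b + x)


def cost(params, data):
    """Sum of squared differences between simulated and observed values."""
    return sum((toy_model(x, params) - y) ** 2 for x, y in data)


def genetic_algorithm(data, bounds, pop_size=40, generations=60,
                      mutation_sd=0.1, seed=0):
    """Minimise cost() over box-bounded parameters; returns (best, cost history)."""
    rng = random.Random(seed)

    def fitness(p):
        return cost(p, data)

    # Random initial population inside the parameter bounds.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness)
        history.append(fitness(population[0]))
        next_pop = [population[0][:]]  # elitism: keep the current best unchanged
        while len(next_pop) < pop_size:
            # Tournament selection: best of three random individuals, twice.
            p1 = min(rng.sample(population, 3), key=fitness)
            p2 = min(rng.sample(population, 3), key=fitness)
            w = rng.random()  # blend crossover weight
            child = []
            for g1, g2, (lo, hi) in zip(p1, p2, bounds):
                gene = w * g1 + (1.0 - w) * g2
                gene += rng.gauss(0.0, mutation_sd * (hi - lo))  # mutation
                child.append(max(lo, min(hi, gene)))  # clip to bounds
            next_pop.append(child)
        population = next_pop
    best = min(population, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    # Synthetic data generated from known parameters, for demonstration only.
    data = [(x, toy_model(x, (0.8, 2.0))) for x in (0.5, 1.0, 2.0, 4.0, 8.0)]
    best, history = genetic_algorithm(data, [(0.0, 1.0), (0.1, 10.0)])
    print("best parameters:", best, "final cost:", history[-1])
```

Because the best individual is always carried forward, the recorded cost never increases between generations. With many poorly constrained rate constants, as noted above, several runs from different seeds would be a sensible way to probe how well the parameters are identified.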
IV. CONCLUSIONS
We have developed the first detailed mechanistic model of in vitro DNA recombination. The predictive power of the model was validated against a large set of experimental data on recombination efficiencies for different initial concentrations of integrase and gp3. The proposed model sheds light on a number of mechanistic features of DNA recombination for which there is currently no consensus in the experimental literature, and will be a valuable design tool for Synthetic Biologists working on the construction of recombinase-based genetic circuitry, potentially producing significant reductions in development times. Future work will extend our modelling investigation to the in vivo system in order to examine model performance under cellular biological conditions.
V. REFERENCES
[1] T. Gardner, C. Cantor and J. Collins. Nature. 2000 Jan 20;403(6767):339-42.
[2] J. Bonnet, P. Subsoontorn and D. Endy. Proc Natl Acad Sci U S A. 2012 Jun 5;109(23):8884-9. doi: 10.1073/pnas.1202344109.
[3] A. Friedland, T. Lu, X. Wang, D. Shi, G. Church and J. Collins. Science. 2009 May 29;324(5931):1199-202. doi: 10.1126/science.1172005.
[4] L. Zhang, G. Zhao and X. Ding. Sci Rep. 2011;1:141. doi: 10.1038/srep00141. Epub 2011 Nov 3.
[5] P. Ghosh, L. Wasil and G. Hatfull. PLoS Biol. 2006 Jun;4(6):e186.
[6] P. Ghosh, N. Pannunzio and G. Hatfull. J Mol Biol. 2005 Jun 3;349(2):331-48. Epub 2005 Apr 7.
[7] P. Ghosh, L. Bibb and G. Hatfull. Proc Natl Acad Sci U S A. 2008 Mar 4;105(9):3238-43. doi: 10.1073/pnas.0711649105.
[8] N. Grindley, K. Whiteson and P. Rice. Annu Rev Biochem. 2006;75:567-605.
[9] P. Ghosh, N. Pannunzio and G. Hatfull. J Mol Biol. 2005 Jun 3;349(2):331-48. Epub 2005 Apr 7.
[10] T. Miura, Y. Hosaka, Y. Yan-Zhuo, T. Nishizawa, M. Asayama, H. Takahashi and M. Shirai. J Gen Appl Microbiol. 2011;57(1):45-57.
[11] P. Fogg, S. Colloms, S. Rosser, M. Stark and M. Smith. J Mol Biol. 2014 Jul 29;426(15):2703-16. doi: 10.1016/j.jmb.2014.05.014. Epub 2014 May 22.
[12] P. Ghosh, L. Bibb and G. Hatfull. Proc Natl Acad Sci U S A. 2008 Mar 4;105(9):3238-43. doi: 10.1073/pnas.0711649105.
[13] A. Groth and M. Calos. J Mol Biol. 2004 Jan 16;335(3):667-78.
[14] B. Swalla, E. Cho, R. Gumport and J. Gardner. Mol Microbiol. 2003 Oct;50(1):89-99.
[15] R. Keenholtz, S. Rowland, M. Boocock, M. Stark and P. Rice. Structure. 2011 Jun 8;19(6):799-809. doi: 10.1016/j.str.2011.03.017.
[16] M. Smith, R. Till, K. Brady, P. Soultanas, H. Thorpe and M. Smith. Nucleic Acids Res. 2004 May 11;32(8):2607-17. Print 2004.
Table 1. Model Equations arising from the application of mass action kinetics to the network reactions shown in Figure 2. The model comprises 22 ODEs. Concentrations are represented by square brackets. DNA is denoted by D with the subscripts BP and LR representing the BP and LR states respectively. Integrase and RDF are denoted by I and R respectively, with numerical subscripts representing the number of monomers present / bound. The dysfunctional integrase dimer is denoted by I2X.
Figure 2. The DNA recombination reaction network used to construct the model. The network is based on the mechanisms underlying DNA recombination that have been verified in the current experimental literature. We model the dynamics of φC31 integrase and its RDF, gp3. Reactions and their rate constants are depicted by arrows and their corresponding numbered k. The rate of recombination is kR.