2024 AIChE Annual Meeting

(174bt) Quantitative Xenogenomics: Deploying High-Accuracy Sequencing for Improving Replication of 6-Letter DNA Alphabets

Authors

Marchand, J. A., University of California, Berkeley
The standard 4-letter DNA alphabet (ATGC) has long been thought to be the immutable basis of life. Yet, at the molecular and biomolecular level, biology has been shown to tolerate synthetic nucleotides that deviate structurally from the natural ones. A subset of synthetic nucleotides known as unnatural base pairing xenonucleic acids, or ubp XNAs, observe base pairing complementarity but do so orthogonally to A:T/G:C pairing. Relevant to the field of synthetic biology, XNAs are currently being explored as a general solution to genetic code expansion. For example, triplet codons drawn from a 6-letter alphabet would expand the codon table from the standard 64 codons (4^3) to 216 (6^3) and allow synthetic biologists to envision encoding proteins composed of more than 30 non-standard building blocks.
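The codon arithmetic above can be checked with a short sketch. It assumes triplet codons and uses the 6-letter alphabet ATGCBS named later in this abstract; the function name `codon_count` is illustrative only and does not come from the authors' work.

```python
from itertools import product

def codon_count(alphabet: str, codon_length: int = 3) -> int:
    """Number of distinct codons of a given length over an alphabet."""
    return len(set(alphabet)) ** codon_length

standard = "ATGC"
expanded = "ATGCBS"  # B and S are the two added XNA letters

# Enumerate every triplet over the expanded alphabet, then keep the ones
# that contain at least one XNA letter (i.e. codons with no 4-letter equivalent).
codons = ["".join(c) for c in product(expanded, repeat=3)]
xna_codons = [c for c in codons if set(c) & set("BS")]

print(codon_count(standard))   # 64
print(codon_count(expanded))   # 216
print(len(xna_codons))         # 152
```

Of the 216 codons, 152 contain at least one B or S, which is the pool of new codons that could in principle be assigned to non-standard building blocks.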

Though XNAs have the potential to transform synthetic biology, challenges in reading (sequencing) and amplification (PCR) have limited their impact. For XNA sequencing, a lack of next-generation sequencing tools has made modern -omics experimentation impossible. Additionally, though many ubp XNAs can be amplified by polymerases, transversion of XNAs to standard letters during replication (replication errors) prohibits the workup and scale-up that are routine for 4-letter DNA. Recent advances in machine learning strategies for nanopore sequencing have created an opportunity to solve both the sequencing and the amplification problems.

In this talk, we show how next-generation sequencing advances will act as a propellant in XNA synthetic biology. Using a 6-letter genetic alphabet (ATGCBS), we first present a robust strategy for training machine learning models for high-accuracy (>98%), single-molecule nanopore sequencing of these ubp XNAs using commercially available devices. We then use these models for high-throughput, massively multiplexed quantification of XNAs to study XNA replication in vitro. Finally, we present results of a large screen (polymerases, buffers, nucleotide concentrations, etc.) that identifies conditions improving retention of XNAs in PCR. By modernizing the underlying sequencing and amplification methods available for XNAs, we look to improve access to XNAs and their application in therapeutic, synthetic biology, and chemical biology research.
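As a rough illustration of what quantifying XNA retention from sequencing reads could look like, the sketch below computes the fraction of reads that still carry the expected XNA letter at a designed position. This is not the authors' pipeline: the function `xna_retention`, the assumption that reads are already basecalled and aligned to the template, and the example reads are all hypothetical.

```python
from collections import Counter

def xna_retention(reads, position, expected):
    """Return (retention fraction, base-call counts) at one template position.

    Assumes each read is a basecalled string already aligned to the template,
    so that index `position` refers to the same template site in every read.
    """
    calls = Counter(read[position] for read in reads if len(read) > position)
    total = sum(calls.values())
    if total == 0:
        raise ValueError("no reads cover the requested position")
    return calls[expected] / total, dict(calls)

# Made-up reads for illustration: the template carries B at index 3,
# and one read shows it replaced by a standard letter (C).
reads = ["ACGBTA", "ACGBTA", "ACGCTA", "ACGBTA"]
retention, calls = xna_retention(reads, position=3, expected="B")
print(retention)  # 0.75
print(calls)      # {'B': 3, 'C': 1}
```

Comparing this fraction across PCR conditions (polymerase, buffer, nucleotide concentration) is one simple way a screen could be scored, with the non-XNA calls indicating which standard letters replace the XNA.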