5th Conference on Constraint-Based Reconstruction and Analysis (COBRA 2018)
Towards a Genome-Wide Transport Systems Encoding Genes Tracker
Authors
TranSyT's internal database contains every possible reaction for all compounds described in each TCDB entry, as well as for their hierarchical child metabolites.
TranSyT starts by assigning identifiers to metabolites, as TCDB does not provide cross-links to any other database. These metabolites are identified by searching BioDB, a database developed in-house, for all possible names and synonyms. The relationships between compounds (hierarchical ontology) were also determined using BioDB, which combines information retrieved from several sources such as ModelSEED, KEGG, MetaCyc and BiGG.
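The name-and-synonym lookup described above can be pictured with a minimal sketch. It is written in Python for brevity and is not TranSyT's actual code: the compound identifiers, the synonym lists, and the normalization rule (lower-casing and dropping non-alphanumeric characters) are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def normalize(name: str) -> str:
    """Lower-case a compound name and keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def build_synonym_index(compounds: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every normalized name or synonym to its compound identifier."""
    index: Dict[str, str] = {}
    for compound_id, names in compounds.items():
        for name in names:
            # The first identifier seen for a given name wins.
            index.setdefault(normalize(name), compound_id)
    return index


def assign_identifier(tcdb_name: str, index: Dict[str, str]) -> Optional[str]:
    """Return the identifier for a TCDB metabolite name, or None if unknown."""
    return index.get(normalize(tcdb_name))


# Hypothetical records standing in for a BioDB-like resource.
COMPOUNDS = {
    "cpd_glucose": ["D-Glucose", "glucose", "dextrose"],
    "cpd_sodium": ["Na+", "sodium", "sodium ion"],
}
INDEX = build_synonym_index(COMPOUNDS)
```

With these example records, `assign_identifier("D-glucose", INDEX)` resolves to the hypothetical identifier `cpd_glucose`, while a name that is absent from every synonym list yields `None`.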
The second step involves retrieving the TC family information from TCDB, which makes it possible to determine the suitable transport type for each metabolite (e.g. symport or antiport) and its direction (in/out or out/in).
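To illustrate how a transport type and a direction could be turned into a reaction, the sketch below builds a reaction string for uniport, symport and antiport. The function name, the `[in]`/`[out]` compartment tags and the textual reaction format are assumptions for this example, not the representation used by TranSyT.

```python
from typing import List, Tuple


def transport_reaction(transport_type: str, metabolites: List[str],
                       direction: str = "in") -> str:
    """Build a reaction string such as 'A[out] + B[out] -> A[in] + B[in]'.

    direction="in" moves the first metabolite from outside to inside;
    direction="out" moves it from inside to outside.
    """
    if direction not in ("in", "out"):
        raise ValueError("direction must be 'in' or 'out'")
    src, dst = ("out", "in") if direction == "in" else ("in", "out")

    left: List[Tuple[str, str]]
    if transport_type == "uniport":
        if len(metabolites) != 1:
            raise ValueError("uniport takes exactly one metabolite")
        left = [(metabolites[0], src)]
    elif transport_type == "symport":
        # All metabolites cross the membrane in the same direction.
        left = [(m, src) for m in metabolites]
    elif transport_type == "antiport":
        if len(metabolites) != 2:
            raise ValueError("this sketch handles two-metabolite antiport only")
        # The second metabolite moves against the first one.
        left = [(metabolites[0], src), (metabolites[1], dst)]
    else:
        raise ValueError(f"unknown transport type: {transport_type}")

    # Every reactant ends up in the opposite compartment.
    right = [(m, dst if c == src else src) for m, c in left]

    def side(terms: List[Tuple[str, str]]) -> str:
        return " + ".join(f"{m}[{c}]" for m, c in terms)

    return f"{side(left)} -> {side(right)}"
```

For instance, `transport_reaction("antiport", ["Na+", "H+"])` gives `Na+[out] + H+[in] -> Na+[in] + H+[out]`.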
All generated information is then stored in a Neo4j graph database, which allows TranSyT to retrieve the required reactions for the genome under study, saving time and resources.
All sequences of the genome under study are aligned, using BLAST+, against the whole set of records retrieved from TCDB, which contains the sequences of well-known transport systems.
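As a rough sketch of the alignment step, the functions below assemble BLAST+ command lines without running them. It assumes protein sequences (hence `makeblastdb -dbtype prot` and `blastp`), tabular output, and an E-value cutoff of 1e-10; the abstract does not state which BLAST+ program or thresholds TranSyT uses, and all file names are hypothetical.

```python
from typing import List


def makeblastdb_command(tcdb_fasta: str, db_name: str) -> List[str]:
    """Command that builds a protein BLAST database from TCDB sequences."""
    return ["makeblastdb", "-in", tcdb_fasta, "-dbtype", "prot", "-out", db_name]


def blastp_command(genome_fasta: str, db_name: str, out_file: str,
                   evalue: float = 1e-10) -> List[str]:
    """Command that aligns the genome's proteins against the TCDB database.

    '-outfmt 6' requests tab-separated output with one hit per line.
    The E-value cutoff is an assumed value, not one taken from TranSyT.
    """
    return [
        "blastp",
        "-query", genome_fasta,
        "-db", db_name,
        "-evalue", str(evalue),
        "-outfmt", "6",
        "-out", out_file,
    ]
```

Each list can be handed to `subprocess.run(cmd, check=True)` on a machine where BLAST+ is installed.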
After the alignments, only genes with evidence of encoding transport systems are associated with transport reactions. The compartmentalization of such reactions is also possible, as the direction of each reaction was defined beforehand.
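The association step can be sketched as a filter over tabular BLAST output (the default 12-column `-outfmt 6` layout, where column 11 is the E-value). The E-value threshold and the mapping from TCDB hits to reaction identifiers are assumptions introduced here; the abstract does not specify the evidence criteria TranSyT applies.

```python
from typing import Dict, Set


def genes_with_transport_evidence(blast_tabular: str,
                                  reactions_by_hit: Dict[str, Set[str]],
                                  max_evalue: float = 1e-10) -> Dict[str, Set[str]]:
    """Associate genes with reactions of their significant TCDB hits.

    blast_tabular: text in BLAST '-outfmt 6' format
        (qseqid, sseqid, pident, length, mismatch, gapopen,
         qstart, qend, sstart, send, evalue, bitscore).
    reactions_by_hit: hypothetical mapping from a TCDB record to reaction ids.
    """
    annotated: Dict[str, Set[str]] = {}
    for line in blast_tabular.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        gene, hit, evalue = fields[0], fields[1], float(fields[10])
        reactions = reactions_by_hit.get(hit)
        # Keep the gene only if the hit is significant and maps to reactions.
        if evalue <= max_evalue and reactions:
            annotated.setdefault(gene, set()).update(reactions)
    return annotated
```

Genes whose hits all fall above the threshold are simply absent from the returned dictionary, which mirrors the idea that only genes with evidence receive transport reactions.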
TranSyT was developed in Java and is the next iteration of the previously developed TRIAGE2 tool.
References:
1 Saier, M. H. et al. Nucleic Acids Res. 34, D181–D186 (2006)
2 Dias, O. et al. IEEE/ACM Trans. Comput. Biol. Bioinforma. XX, 1–1 (2016)