2011 Spring Meeting & 7th Global Congress on Process Safety
(66f) Thermodynamic Data, Physicochemical Properties, and Molecular Information on More Than 2.5 Million Chemical Compounds
Authors
Scientists and engineers often suffer from a shortage of thermodynamic data,
physicochemical properties, and molecular information for chemical compounds.
This is not surprising if one considers that over 50 million chemicals had
been registered as of October 2010, while the number of chemicals for which
thermodynamic or physicochemical data are available is only on the order of
10,000. Experimental measurements are expensive and time-consuming, and
reliable calculation methods are rarely available.
We have developed computer modules for predicting thermodynamic data and
physicochemical properties, based on quantum mechanical calculations at decent
accuracy levels and on more than 2,000 molecular descriptors. The predicted
results were carefully verified against millions of experimental data points
collected over more than 3 years by more than 50 scientists and engineers.
This verification confirmed that our computer modules can predict
thermodynamic data and physicochemical properties with accuracy comparable to
that of experiments.
An automatic procedure covering every step from molecule generation to the
prediction of data and information has been developed, and a computing center
containing over 550 computers has been constructed to process large numbers of
chemical compounds. Using this procedure and computing system, more than 2.5
million chemicals have been processed. The final data and information have
been loaded into a database server. To search and browse the data for a target
chemical, information browsing software has also been developed, which
provides online access to the database server and information retrieval.
Our database contains a total of 2,140 data and information sets per molecule,
consisting of 46 thermodynamic data and physicochemical properties, 3 spectra
(IR, NMR, VCD), 69 items of quantum mechanical information, 2,004 molecular
descriptors, and 18 drug-related properties. The 2.5 million chemical
compounds include radicals; hydrocarbons; fuels such as gasoline, jet fuel,
diesel, and biodiesel; compounds involved in commercial processes such as
thermal cracking, combustion, reforming, and isomerization; drug-like
molecules; and hetero-compounds containing oxygen or nitrogen.
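
The per-molecule breakdown above can be checked with a short tally. The following is only an illustrative sketch: the category counts are taken from the text, while the dictionary name and the helper functions are hypothetical and do not describe the actual database schema.

```python
# Per-molecule record categories and counts, as listed in the abstract.
# The names below are illustrative labels, not real database fields.
RECORD_COUNTS = {
    "thermodynamic data and physicochemical properties": 46,
    "spectra (IR, NMR, VCD)": 3,
    "quantum mechanical information": 69,
    "molecular descriptors": 2004,
    "drug-related properties": 18,
}


def total_records(counts):
    """Return the total number of data and information sets per molecule."""
    return sum(counts.values())


def category_shares(counts):
    """Return each category's fraction of the per-molecule total."""
    total = total_records(counts)
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    print("total per molecule:", total_records(RECORD_COUNTS))
    for name, share in category_shares(RECORD_COUNTS).items():
        print(f"{name}: {share:.1%}")
```

The five categories sum to the stated 2,140 sets per molecule, with the molecular descriptors making up the large majority.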
Because of limits on quantum computation time, molecules containing atoms
other than C, H, N, O, and S, and compounds with more than 25 carbon atoms,
were not processed. They will be processed in the near future, after our
current computing system has been upgraded.
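
The scope restriction above (only C, H, N, O, and S atoms, and at most 25 carbon atoms) can be expressed as a simple screening rule. The sketch below is a hypothetical illustration, not the authors' procedure: it assumes each compound is given as a plain molecular formula string such as "C8H18", without parentheses, charges, or isotope labels, and the function names are invented for this example.

```python
import re

# Limits stated in the abstract.
ALLOWED_ELEMENTS = frozenset({"C", "H", "N", "O", "S"})
MAX_CARBON_ATOMS = 25

# One element symbol (capital letter, optional lowercase letter)
# followed by an optional integer count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")
_FORMULA = re.compile(r"(?:[A-Z][a-z]?\d*)+")


def parse_formula(formula):
    """Parse a simple molecular formula into {element: atom count}.

    Assumption: no parentheses, charges, or isotope labels.
    """
    if not _FORMULA.fullmatch(formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    counts = {}
    for symbol, number in _TOKEN.findall(formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def is_in_scope(formula):
    """Return True if the formula satisfies both stated limits."""
    counts = parse_formula(formula)
    only_allowed = set(counts) <= ALLOWED_ELEMENTS
    carbon_ok = counts.get("C", 0) <= MAX_CARBON_ATOMS
    return only_allowed and carbon_ok


if __name__ == "__main__":
    for f in ["C8H18", "C25H52", "C26H54", "CH3Cl", "C2H6OS"]:
        print(f, is_in_scope(f))
```

Under these assumptions, octane (C8H18) and a C25 alkane pass the screen, while a C26 alkane and chloromethane (CH3Cl) are excluded.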