2016 AIChE Annual Meeting

(190j) Advanced Modeling of Tissue:Blood Partition Coefficients for Industrial Chemicals

Authors

Papadaki, K. - Presenter, Aristotle University of Thessaloniki
Karakitsios, S., Aristotle University of Thessaloniki
Sarigiannis, D., Aristotle University
The application of Physiologically Based BioKinetic (PBBK) models in risk assessment is limited because these models lack a generic character. A critical limiting factor in describing ADME (absorption, distribution, metabolism and excretion) processes across a large chemical space is the proper parameterization for "data poor" compounds. To expand the applicability of PBBK models so that as much of the chemical space as possible is covered, the input parameters of these models are predicted using advanced Quantitative Structure-Activity Relationships (QSARs). In silico approaches, including QSARs, are widely used to estimate physicochemical and biochemical properties and biological effects, as well as to understand the physicochemical features governing a biological response. QSARs are regression or classification models that relate the biological effects of a chemical to its chemistry. They comprise the activity data to be modeled, the data with which to model, and a method to formulate the model.

Several approaches incorporating QSARs have been proposed for the prediction of partition coefficients for PBBK modeling, including (a) the Peyret, Poulin and Krishnan algorithm, which is based on the fractional content of cells and interstitial fluid in tissue, of plasma and erythrocytes in blood, on tissue lipids and on the lipophilicity of the compound of interest; (b) the molecular fractions algorithm proposed by Béliveau et al., which takes into account the frequency of occurrence of the molecular fragments of the compounds; and (c) Abraham's solvation equation for estimating biological properties, which takes into account the excess molar refraction, the compound dipolarity/polarizability, the solute effective or summation hydrogen-bond acidity, the solute effective or summation hydrogen-bond basicity, and the McGowan characteristic volume, which can be calculated trivially for any solute from its molecular structure alone.
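As an illustration of approach (c), Abraham's solvation equation is commonly written as a linear combination of the five solute descriptors named above: log SP = c + e·E + s·S + a·A + b·B + v·V. The sketch below simply evaluates that form. The function name, the coefficient values and the descriptor values are hypothetical placeholders chosen for illustration; they are not fitted values from this study or from the literature.

```python
def abraham_log_property(E, S, A, B, V, coeffs):
    """Evaluate log SP = c + e*E + s*S + a*A + b*B + v*V.

    E: excess molar refraction
    S: dipolarity/polarizability
    A: summation hydrogen-bond acidity
    B: summation hydrogen-bond basicity
    V: McGowan characteristic volume
    coeffs: (c, e, s, a, b, v), specific to the biological system modeled
    """
    c, e, s, a, b, v = coeffs
    return c + e * E + s * S + a * A + b * B + v * V


# Hypothetical system coefficients and solute descriptors (illustration only).
example_coeffs = (0.1, 0.2, -0.3, -0.4, -0.5, 0.6)
log_sp = abraham_log_property(E=0.5, S=0.5, A=0.0, B=0.1, V=0.7,
                              coeffs=example_coeffs)
```

In practice the six coefficients are obtained by regression against measured values for one tissue or process, so a separate coefficient set is needed for each partition coefficient.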

The methodological approach presented in this study models tissue/blood partition coefficients for five main human tissues (muscle, kidney, adipose, liver, brain) using PaDEL Descriptor and QSARINS. PaDEL Descriptor is open source software for calculating 1D, 2D and 3D molecular descriptors and fingerprints of chemical compounds. QSARINS is used to develop QSAR models, with Multiple Linear Regression (MLR) by Ordinary Least Squares (OLS) as the modeling method and a Genetic Algorithm (GA) for descriptor selection. In QSARINS, models are analysed using tools such as Principal Component Analysis (PCA), fitting criteria, internal and external validation criteria, and an applicability domain procedure. Users can browse through different options to arrive at a robust and reliable model that complies with the OECD principles.

The first step of QSAR modeling was the preparation of input data, which included the experimental values of the tissue/blood partition coefficients and the molecular descriptors of the corresponding chemical compounds. The dataset consisted of 33 environmental chemical compounds, which were split into a training set and a prediction set. The split was based on random selection through property sampling, performed after ordering the chemicals by descending experimental value. The derived PaDEL descriptors were pre-reduced to exclude semi-constant and intercorrelated ones. A set of 435 descriptors per chemical compound was used for the development and analysis of the QSAR models.
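The two data preparation steps above can be sketched as follows. This is a minimal sketch, not the QSARINS implementation: it assumes that property sampling assigns every `step`-th compound of the ordered list to the prediction set starting from a random offset, and that pre-reduction drops descriptors whose most frequent value covers a large share of compounds or whose absolute pairwise correlation with an already kept descriptor is high. The function names, the `step` value and both thresholds are hypothetical.

```python
import numpy as np


def split_by_response(y, step=4, seed=0):
    """Return (training, prediction) index arrays via response-ordered sampling."""
    y = np.asarray(y, dtype=float)
    order = np.argsort(-y, kind="stable")    # compound indices, highest value first
    rng = np.random.default_rng(seed)
    start = int(rng.integers(step))          # random offset within the first block
    in_prediction = np.zeros(len(y), dtype=bool)
    in_prediction[start::step] = True        # every `step`-th rank goes to prediction
    return order[~in_prediction], order[in_prediction]


def prereduce(X, max_same_fraction=0.8, max_abs_corr=0.95):
    """Return the column indices kept after dropping (semi-)constant and
    intercorrelated descriptors. Thresholds are assumed values."""
    X = np.asarray(X, dtype=float)
    n_compounds, n_descriptors = X.shape
    kept = []
    for j in range(n_descriptors):
        column = X[:, j]
        _, counts = np.unique(column, return_counts=True)
        if counts.max() / n_compounds >= max_same_fraction:
            continue                         # constant or semi-constant descriptor
        if any(abs(np.corrcoef(column, X[:, k])[0, 1]) >= max_abs_corr
               for k in kept):
            continue                         # redundant with a kept descriptor
        kept.append(j)
    return kept
```

Sampling along the ordered response keeps the range of experimental values represented in both sets, which matters for a dataset as small as 33 compounds.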

The next step was the dataset analysis and the development of QSAR models. The distribution of the chemical compounds in the chemical space was explored using PCA. The score plot indicated that the molecules clustered by structure, and the loading plot showed the descriptors most influential for the categorization of the chemicals. As mentioned above, MLR with OLS was used to develop the models. Variable selection was performed with a genetic algorithm, which searched for the best combination of variables for the derived models. A large number of models were developed and ranked by fitting performance. To evaluate the validity of the models, internal validation (Leave One Out (LOO) and Leave Many Out (LMO) techniques, Y-scrambling) and external validation methods were applied. The best model for each tissue/blood partition coefficient was selected using the Multi-Criteria Decision Making (MCDM) value, which summarizes the fitting, cross validation and external validation criteria.
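The fitting and internal validation statistics described above can be illustrated with a short NumPy sketch. It assumes the usual definitions: R2 = 1 - SS_res/SS_tot for the fitted model, the LOO statistic computed as 1 - PRESS/SS_tot with one compound left out at a time, and Y-scrambling as refitting the model on randomly permuted responses. The function names and the synthetic data are hypothetical; the genetic algorithm, LMO and the MCDM score are not reproduced here.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def predict(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta


def r2(y, y_hat):
    """Coefficient of determination of predictions y_hat against y."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)


def q2_loo(X, y):
    """Leave-one-out cross-validated statistic: 1 - PRESS / SS_tot."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i                 # drop compound i, refit, predict it
        beta = fit_ols(X[keep], y[keep])
        press += (y[i] - predict(beta, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


def y_scrambled_r2(X, y, n_iter=100, seed=0):
    """R2 values of models refitted on randomly permuted responses."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_iter):
        y_perm = rng.permutation(y)
        scores.append(r2(y_perm, predict(fit_ols(X, y_perm), X)))
    return np.array(scores)


# Synthetic data for illustration only (not the study's dataset).
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(25, 2))
y_demo = 1.0 + 2.0 * X_demo[:, 0] - X_demo[:, 1] + rng.normal(scale=0.1, size=25)

r2_fit = r2(y_demo, predict(fit_ols(X_demo, y_demo), X_demo))
q2 = q2_loo(X_demo, y_demo)
r2_scrambled = y_scrambled_r2(X_demo, y_demo)
```

A model with a real structure-property relationship should show a cross-validated value close to its fitted R2 and scrambled R2 values well below both.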

The fitting performance (R2) of the selected models for predicting the muscle, kidney, adipose, liver and brain/blood partition coefficients was 0.92, 0.92, 0.97, 0.94 and 0.96, respectively. The LOO technique (Q2LOO) gave predictive performance values of 0.88, 0.90, 0.96, 0.92 and 0.94, while the LMO technique (Q2LMO) gave 0.86, 0.89, 0.95, 0.92 and 0.93, respectively. The external validation value was 0.56, 0.62, 0.98, 0.81 and 0.81 for the muscle, kidney, adipose, liver and brain/blood partition coefficients, respectively. The absence of chance correlation was confirmed by the low values obtained from the Y-scrambling method. The Root Mean Square Error (RMSE) ranged from 0.08 to 0.16 for the training set and from 0.18 to 0.27 for the prediction set. The Applicability Domain (AD) analysis showed no outliers, supporting the reliability of each of the developed QSAR models.
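The RMSE and an applicability domain check can be sketched as below. The abstract does not state which AD method was applied, so this sketch assumes the common leverage approach, in which a compound with leverage above the warning value h* = 3(p + 1)/n (p descriptors, n training compounds) is flagged as lying outside the structural domain. All function names are hypothetical.

```python
import numpy as np


def leverages(X_train, X_query):
    """Leverage h_i = x_i^T (A^T A)^-1 x_i of each query compound, where A is
    the training design matrix including the intercept column."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    Q = np.column_stack([np.ones(len(X_query)), X_query])
    gram_inv = np.linalg.inv(A.T @ A)
    return np.einsum("ij,jk,ik->i", Q, gram_inv, Q)


def warning_leverage(n_train, n_descriptors):
    """Conventional cut-off h* = 3 (p + 1) / n."""
    return 3.0 * (n_descriptors + 1) / n_train


def rmse(y, y_hat):
    """Root mean square error between observed and predicted values."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))
```

Leverage covers only the descriptor space; a response outlier check on standardized residuals is usually reported alongside it.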

The proposed models for estimating tissue/blood partition coefficients were checked for fitting, validity and applicability. They were found to be stable, reliable and capable of predicting the partition coefficients of "data poor" chemical compounds that fall within the applicability domain. The developed predictive models could serve as a tool for filling data gaps for environmental chemicals with unknown tissue/blood partitioning values. In this way, animal testing and experiments could be reduced and wider use of PBBK models could be encouraged. In conclusion, the models support the "safe by design" concept for environmental chemicals by allowing toxicokinetic behavior to be predicted from molecular parameters, which promotes green chemistry and lowers the cost of product development.