
Structure-toxicity relationships study of a series of organophosphorus insecticides

Abstract

Structure-toxicity relationships were studied for a set of 47 insecticides by means of multiple linear regression (MLR) and artificial neural networks (ANN). A model with three descriptors, shape surface [S(R2)], hydrogen-bonding acceptors [HBA(R2)] and molar refraction [MR(R1)], showed good statistics both in the regression (r = 0.875, s = 0.417 and q2 = 0.675) and in the artificial neural network model with a [3-5-1] configuration (r = 0.966, s = 0.200 and q2 = 0.647). The statistics for the prediction of toxicity [log LD50 (lethal dose 50, oral, rat)] in the test set of 20 organophosphorus insecticide derivatives are (r = 0.849, s = 0.435) and (r = 0.748, s = 0.576) for MLR and ANN, respectively. The model descriptors indicate the importance of molar refraction and shape contributions to the toxicity of the organophosphorus insecticide derivatives used in this study. This information is pertinent to the further design of new insecticides.
Keywords Multiple linear regression · Artificial neural
network · Organophosphorus · Insecticides · Acute LD50 ·
Descriptors
Introduction
Although the benefits of pesticides [1] are undeniable,
attention has focused in recent years on their impact on
human health and the environment. Pesticide is a generic
term for a variety of chemical classes such as insecticides,
herbicides, fungicides and nematicides.
Although pesticide laws require that risks drive the
process in terms of depth of analysis and allocation of
federal resources, two questions about risk remain
appropriate: What is an “acceptable risk”? How can we
minimize the risk?
Computer simulation techniques potentially offer a
further means to probe structure-toxicity relationships.
Quantitative structure-activity relationships (QSAR) [2, 3]
represent the most effective computational approaches
in drug design. QSAR is widely used to predict activities
and to define pesticide models [4, 5, 6].
In the present article, we have attempted to establish
structure-toxicity relationships for organophosphorus
insecticide derivatives by multiple linear regression
(MLR) and artificial neural networks (ANN).
The objectives of our study were both to provide
supplementary information concerning the behavior of these
compounds and to further define the criteria necessary for
the rational design of a new generation of
organophosphorus-type insecticides.
Material and methods
Experimental data
The organophosphorus insecticide derivatives were taken
from different literature sources [7, 8, 9]. The log LD50
(lethal dose 50, acute, oral, rat) values were used as the
dependent variable representing toxicity. The toxicity
(LD50) was expressed in moles of toxicant per kilogram of
body weight. When LD50 was given as an interval, the
minimum value was used.
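The data preparation described above can be sketched as follows. This is only an illustration: we assume here that the literature values are reported in mg/kg and converted with the molar mass, which the text does not state explicitly. The function name and the example compound (300 g/mol, LD50 of 5 to 10 mg/kg) are hypothetical.

```python
import math

def log_ld50_mol_per_kg(ld50_mg_per_kg, molar_mass_g_per_mol):
    """Return log10 of LD50 expressed in mol of toxicant per kg body weight.

    ld50_mg_per_kg may be a single number or a (low, high) interval;
    for an interval the minimum value is used, as stated in the text.
    The mg/kg input unit is an assumption of this sketch.
    """
    if isinstance(ld50_mg_per_kg, (tuple, list)):
        dose_mg = min(ld50_mg_per_kg)
    else:
        dose_mg = ld50_mg_per_kg
    dose_mol = (dose_mg / 1000.0) / molar_mass_g_per_mol  # mg -> g -> mol
    return math.log10(dose_mol)

# Hypothetical compound, not taken from Table 1.
example = log_ld50_mol_per_kg((5.0, 10.0), 300.0)
print(round(example, 3))
```

Taking the lower bound of an interval is the conservative choice: the compound is treated as being as toxic as the most pessimistic reported value.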
The chemical structures along with experimental tox-
icity (log LD50) data of the compounds used in this study
are shown in Table 1.
M. Zahouily (✉) · A. Rhihil · H. Bazoui · D. Zakarya
UFR Chimie Appliquée,
Laboratoire de Synthèse Organique
et Traitement de l’Information Chimique,
Département de Chimie, Faculté des Sciences et Techniques,
B.P. 146, Mohammedia 20650, Maroc
e-mail: zahouily@voila.fr
Tel.: +212-023-314705/08, Fax: +212-3-315353
A. Rhihil · H. Bazoui · S. Sebti
Laboratoire de Chimie Organique
Appliquée et Catalyse (LCOAC),
Faculté des Sciences Ben M’Sik B.P. 7955 Sidi Othmane,
Casablanca, Maroc
J Mol Model (2002) 8:168–172
DOI 10.1007/s00894-002-0074-0
ORIGINAL PAPER
Mohamed Zahouily · Abdallah Rhihil
Halima Bazoui · Saïd Sebti · Driss Zakarya
Received: 3 December 2001 / Accepted: 4 February 2002 / Published online: 16 May 2002
© Springer-Verlag 2002
Table 1 Chemical structures of the compounds studied and experimental toxicity (log LD50) values
Structural descriptors
Several structural descriptors and physicochemical
variables were used to characterize the organophosphorus
insecticide derivatives under study. These descriptors were
calculated for the substituents R1, R2 and X.
They include the octanol/water partition coefficient
(log P) [10], used as a descriptor of hydrophobic molecular
properties, electronegativity [11], hydrogen-bonding
donors (HBD), hydrogen-bonding acceptors (HBA) [12]
and molar refraction [13].
The size and shape of the substituents were quantified
by their van der Waals volume (V), molecular weight
(MW), surface (S) [14], length (L), V/L [15] and
topological descriptors [16].
In total, 60 parameters were calculated for each compound.
Statistical methods
Multiple linear regression (MLR)
Multiple linear regression was used to generate the linear
models and was performed with the Unistat statistical
package running on a Pentium PC.
Because of the large number of descriptors considered,
a stepwise multiple linear regression procedure based on
the forward-selection and backward-elimination methods
was used to select the most powerful descriptors.
To avoid difficulties in interpreting the resulting
models, pairs of variables with a correlation coefficient
larger than 0.7 were classified as intercorrelated, and only
one of each pair was included in the screened models. The
quality of the model was assessed by the squared
correlation coefficient r2, the standard deviation s and
the Fisher test value (F), with all parameters in the model
significant at the 95% confidence level. The predictive
ability was analyzed in two ways. First, the predictive
ability in the training set (47 compounds, N°1 to 47) was
evaluated using leave-one-out cross-validation. For a
reliable model, the squared predictive correlation
coefficient q2 [17] should be >0.60 [18]. Second,
20 organophosphorus insecticide derivatives (N°48 to 67)
were held out to test the actual predictive power of the
model (r2 and s are considered).
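The leave-one-out procedure can be sketched as below. This is a minimal illustration, not the Unistat implementation: we assume the common definition q2 = 1 - PRESS/SS (SS taken about the mean of the observed values) and a regression without a constant term. The data are synthetic, and the function names are ours.

```python
import numpy as np

def fit_no_intercept(X, y):
    """Least-squares coefficients for y ~ X b (no constant term)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def loo_q2(X, y):
    """Leave-one-out q2 = 1 - PRESS / SS (assumed definition)."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i            # drop compound i
        coef = fit_no_intercept(X[keep], y[keep])
        press += (y[i] - X[i] @ coef) ** 2  # error on the left-out compound
    ss = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / ss

# Synthetic stand-in for 47 compounds and 3 descriptors (not the paper's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(47, 3))
y = X @ np.array([0.8, 0.5, -0.6]) + rng.normal(scale=0.3, size=47)
q2 = loo_q2(X, y)
print(f"q2 = {q2:.3f}, passes the 0.60 threshold: {q2 > 0.60}")
```

Each compound is predicted by a model that never saw it, so q2 is always lower than the fitted r2 and gives a more honest estimate of predictive ability.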
Artificial neural network (ANN)
The ANN [19], trained by the back-propagation (BP)
of errors algorithm [20], had the following architecture:
- An input layer comprising the pertinent descriptors of
  the MLR model.
- A hidden layer, for which the ratio ρ of the number of
  data points in the training set to the number of
  connections controlled by the network is critical to the
  predictive power of the neural net. The range
  1.8 < ρ < 2.2 [ρ = (number of data points in the training
  set)/(number of adjustable weights controlled by the
  network)] [21] was used as a guideline for an acceptable
  number of neurons in the hidden layer. It is claimed
  that, for ρ << 1.0, the network simply memorizes the
  data, whereas for ρ >> 3.0, the network loses its ability
  to generalize.
- An output layer of one neuron, representing the toxicity
  (log LD50).
The input and output values were normalized.
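The ρ guideline can be checked with a few lines of arithmetic. In this sketch we assume that bias weights are counted among the adjustable weights, which the text does not specify. The function names are ours.

```python
def n_weights(n_in, n_hidden, n_out=1, biases=True):
    """Adjustable weights of a fully connected n_in-n_hidden-n_out network.

    Counting one bias per hidden and output neuron is an assumption.
    """
    extra = 1 if biases else 0
    return (n_in + extra) * n_hidden + (n_hidden + extra) * n_out

def rho(n_samples, n_in, n_hidden, n_out=1, biases=True):
    """Ratio of training data points to adjustable weights."""
    return n_samples / n_weights(n_in, n_hidden, n_out, biases)

# 47 training compounds, 3 input descriptors, 3 to 8 hidden neurons.
for h in range(3, 9):
    r = rho(47, 3, h)
    note = "within 1.8-2.2" if 1.8 < r < 2.2 else ""
    print(f"3-{h}-1: {n_weights(3, h)} weights, rho = {r:.2f} {note}")
```

Under this counting, a 3-5-1 network has 26 weights and ρ = 47/26 ≈ 1.81, and five is the only hidden-layer size between three and eight that falls inside the 1.8 to 2.2 range.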
After this step, the learning rate was varied from 0.01 to
0.9, and for each learning rate the momentum was varied
from 0.1 to 0.9. The number of neurons in the hidden
layer was then determined using the optimized momentum
and learning rate.
Finally, to preclude overtraining [22], we studied the
variation of the root-mean-square (RMS) error versus the
number of iterations, and we used two strategies for
testing the validity of the selected ANN model.
Results and discussion
Multiple linear regression analysis
Multiple linear regression was performed on the
compounds described in Table 1. We included all
47 organophosphorus insecticide derivatives (compounds
N°1 to 47) of the training set for the model generation.
After collecting the data, we submitted all parameters to
the regression; many models were generated using this
method. We obtained the best models without constant
terms [Eq. (2)] because the constant term is not
statistically significant. An ideal model, such as Eq. (2),
is one that has high r2 and F values, a low standard
deviation, the fewest independent variables and a high
predictive ability.
(1)
(2)
The statistical quality of Eq. (2) is fairly good; it
accounts for 77% of the variance in log LD50. Low toxicity
(high log LD50 values) is associated with a high shape
surface [S(R2)] and a high number of hydrogen-bonding
acceptors [HBA(R2)], together with decreased molar
refractivity [MR(R1)].
The plot of experimental log LD50 versus calculated
log LD50 is given in Fig. 1a. Cross-correlation analysis
showed that all pairwise correlations between the
descriptors in this equation were ≤0.229, indicating low
collinearity (see Table 2). In the cross-validation phase,
47 subsets were created according to the leave-one-out
method and the output of the removed compound was
predicted for each subset. The cross-validation coefficient
obtained was q2 = 0.675. The model obtained was therefore
considered to be a good predictive one, according to
Wold [18].
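The collinearity check can be illustrated with the values of Table 2 and the 0.7 threshold given in the Methods. The helper below is a sketch of ours, not part of the original workflow.

```python
import numpy as np

def intercorrelated_pairs(corr, names, threshold=0.7):
    """Return descriptor pairs whose |correlation| exceeds the threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

# Correlation matrix of the three selected descriptors (Table 2).
names = ["S(R2)", "HBA(R2)", "MR(R1)"]
table2 = np.array([
    [1.000, 0.045, 0.039],
    [0.045, 1.000, 0.229],
    [0.039, 0.229, 1.000],
])
largest = np.abs(table2[np.triu_indices(3, k=1)]).max()
print("flagged pairs:", intercorrelated_pairs(table2, names))
print("largest off-diagonal correlation:", largest)
```

No pair is flagged, and the largest off-diagonal value is the 0.229 between HBA(R2) and MR(R1), well below the 0.7 cut-off.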
As a second strategy, the toxicity of 20 organophosphorus
insecticides was predicted by using the best MLR
model [Eq. (2)].
Results for the prediction in the test set of 20 compounds
were r2 = 0.721 and s = 0.435. Two compounds (N°59
and 64) had a large estimation error for Eq. (2); excluding
them lowered the standard deviation of predictions to
s = 0.369. Hence, these results are in good agreement with
those obtained for the training set and reveal a good
predictive quality for the MLR model.
As biological phenomena are considered to be non-linear
by nature, it appears very interesting to study the present
series of compounds with the ANN technique in order to
discover possible non-linear relationships between toxicity
(log LD50) and the molecular descriptors that appeared
pertinent for the linear model.
Artificial neural network analysis
The ANN was generated by using the pertinent
descriptors appearing in the MLR model as input. A 3-5-1
neural network architecture was developed with the
optimum learning rate and momentum of 0.2 and 0.9,
respectively, and with 5000 iterations (the results of the
ANN did not vary significantly between 4800 and 5000
iterations). The five hidden neurons were chosen to
maintain ρ between 1.8 and 2.2. To verify this condition we
tried three to eight neurons in the hidden layer and
found that five hidden neurons give the best result for
the training and test sets, as shown in Table 3.
With the [3-5-1] neural network architecture, the
standard deviation between calculated and observed
toxicity was 0.200, which is superior to that obtained using
MLR (s = 0.417). In addition, the squared correlation
coefficient between observed and calculated values was
0.933. These results indicate the existence of non-linear
relationships between toxicity and the molecular
descriptors that appeared pertinent for the linear model.
The variation of the root-mean-square (RMS) error versus
the number of iterations is plotted in Fig. 2.
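The training set-up can be sketched in plain NumPy as follows. The learning rate (0.2), momentum (0.9), 3-5-1 layout and 5000 iterations come from the text. Everything else is an assumption of this sketch and not a description of the Qnet software [19]: sigmoid activations, batch updates, the weight initialization and the synthetic data.

```python
import numpy as np

def train_bp(X, y, n_hidden=5, lr=0.2, momentum=0.9, n_iter=5000, seed=0):
    """Train a one-hidden-layer sigmoid network by batch back-propagation
    with momentum; return the weights and the RMS error per iteration."""
    rng = np.random.default_rng(seed)
    n = len(X)
    W1 = rng.normal(scale=0.5, size=(X.shape[1] + 1, n_hidden))  # last row: bias
    W2 = rng.normal(scale=0.5, size=(n_hidden + 1, 1))           # last row: bias
    dW1, dW2 = np.zeros_like(W1), np.zeros_like(W2)
    Xb = np.hstack([X, np.ones((n, 1))])
    target = y.reshape(-1, 1)
    rms_history = []
    for _ in range(n_iter):
        H = 1.0 / (1.0 + np.exp(-Xb @ W1))        # hidden activations
        Hb = np.hstack([H, np.ones((n, 1))])
        out = 1.0 / (1.0 + np.exp(-Hb @ W2))      # network output in (0, 1)
        err = out - target
        rms_history.append(float(np.sqrt(np.mean(err ** 2))))
        # Propagate the squared-error gradient back through both layers.
        delta_out = err * out * (1.0 - out)
        delta_hid = (delta_out @ W2[:-1].T) * H * (1.0 - H)
        # Momentum update: keep 90% of the previous step, add the new gradient step.
        dW2 = momentum * dW2 - lr * (Hb.T @ delta_out) / n
        dW1 = momentum * dW1 - lr * (Xb.T @ delta_hid) / n
        W2 += dW2
        W1 += dW1
    return (W1, W2), rms_history

# Synthetic, already normalized data standing in for 47 compounds.
rng = np.random.default_rng(42)
X = rng.uniform(size=(47, 3))
y = 0.1 + 0.8 / (1.0 + np.exp(-(3 * X[:, 0] - 2 * X[:, 2])))
_, rms = train_bp(X, y)
print(f"RMS at start: {rms[0]:.3f}, after {len(rms)} iterations: {rms[-1]:.3f}")
```

Plotting `rms` against the iteration index gives the kind of curve shown in Fig. 2. Tracking the same quantity on held-out compounds is the usual way to detect overtraining.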
The plot in Fig. 1b indicates that there is a significant
correlation between actual and calculated values of
log LD50.
Fig. 1 Experimental and predicted values from MLR (a) and
ANN (b) for the training set.

Table 2 Correlation matrix
            S(R2)    HBA(R2)   MR(R1)
S(R2)       1
HBA(R2)     0.045    1
MR(R1)      0.039    0.229     1

Table 3 Variation of r2 and s with the number of hidden neurons
Hidden neurons   r2 (training)   s (training)   r2 (test)   s (test)
3                0.8964          0.2389         0.5445      0.5972
4                0.9093          0.2230         0.4925      0.6185
5                0.9333          0.2000         0.5597      0.5761
6                0.9170          0.2130         0.5521      0.5813
7                0.9181          0.2113         0.5287      0.5964
8                0.9155          0.2152         0.5152      0.6028

Fig. 2 Variation of RMS error versus number of iterations.

We used the same procedure as for the MLR analysis
to test the validity of the selected ANN model. The
corresponding r2 and s for the prediction in the test set
were 0.560 and 0.576, respectively. The corresponding q2
from the cross-validation method was 0.647 [18].
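Because the abstract quotes r while this section quotes r2, a short arithmetic check helps to confirm that the two sets of figures agree. The dictionary below only restates values given in the text; the labels are ours.

```python
# (r from the abstract, r2 from the Results section)
reported = {
    "MLR training": (0.875, 0.766),
    "ANN training": (0.966, 0.933),
    "MLR test": (0.849, 0.721),
    "ANN test": (0.748, 0.560),
}

for label, (r, r2) in reported.items():
    print(f"{label}: r squared = {r * r:.3f}, reported r2 = {r2:.3f}")

# Largest disagreement between r*r and the reported r2.
max_gap = max(abs(r * r - r2) for r, r2 in reported.values())
print(f"largest gap: {max_gap:.4f}")
```

All four pairs agree to within rounding (the gap stays below 0.001), so the drop from the training to the test set is real: the ANN explains about 93% of the variance in training but only about 56% in the test set, against 77% and 72% for MLR.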
Analysis of descriptors contribution in ANN
and MLR models
To evaluate the influence of each descriptor on the
calculated toxicity, we used two methods.
The first consists of removing a descriptor and
analyzing the statistical coefficients between observed
and calculated values using MLR and ANN. Comparing
these statistics with those calculated by MLR and ANN
when no descriptor was removed gives an idea of the
importance of the removed descriptor [23]. Indeed, when
the descriptor [MR(R1)] is removed, the model obtained is
of lower quality (r2 is only 0.212 and 0.370 for MLR and
ANN, respectively; Table 4).
The second method uses the relation established by
Chastrette [24] [Eq. (3)] to calculate the contribution of
each descriptor (Table 4).
(3)
Ci: contribution of descriptor i
mi: mean of the absolute values of the deviations between
the observed and estimated toxicity for all compounds.
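Equation (3) is not reproduced in this text, so the sketch below rests on two assumptions of ours. First, we assume the contributions are obtained by normalizing the mi values so that they sum to 100%, which is consistent with the C% columns of Table 4. Second, since the mi values themselves are not listed, we use the standard deviations s obtained on removal of each descriptor (Table 4) as a stand-in for mi.

```python
def contributions(m):
    """Percentage share of each descriptor: 100 * m_i / sum(m).

    The normalizing form is an assumption; Eq. (3) is not available here.
    """
    total = sum(m.values())
    return {name: 100.0 * value / total for name, value in m.items()}

# s after removing each descriptor (Table 4), used as a proxy for m_i.
s_mlr = {"S(R2)": 0.657, "HBA(R2)": 0.439, "MR(R1)": 0.810}
s_ann = {"S(R2)": 0.444, "HBA(R2)": 0.407, "MR(R1)": 0.615}

for label, s in (("MLR", s_mlr), ("ANN", s_ann)):
    shares = contributions(s)
    print(label, {name: round(value, 1) for name, value in shares.items()})
```

With this proxy the shares come out at about 34.5, 23.0 and 42.5% for MLR and 30.3, 27.8 and 42.0% for ANN, close to the 34/23/43 and 30/28/42 reported in Table 4 and with the same ranking of descriptors.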
These contributions allow the following classification:
MR(R1) > S(R2) > HBA(R2). These results confirm the
large effect of the substituent R1 on the toxicity [see
molecules 9 (R1 = Me) – 11 (R1 = Et) = −2.03 (activity) and
19 (R1 = Me) – 21 (R1 = Et) = 0.88].
To ensure that the results obtained in MLR and ANN
were not due to chance, and to lend credence to our
results, we ran a scrambling experiment. The dependent
variable log LD50 was randomly scrambled and the same
algorithms used in MLR and ANN were run once again.
The resulting squared correlation coefficient r2 and
standard deviation s were compared with the r2 and s of
the MLR and ANN models developed in this work. For the
training set, the scrambled r2 values were 0.017 and 0.561,
compared with 0.766 and 0.933, and the scrambled s values
were 0.793 and 0.513, compared with 0.417 and 0.200, for
MLR and ANN, respectively. This test confirms that the
descriptors selected in this study describe the studied
toxicity very well.
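The scrambling experiment can be sketched for the linear model as follows. This is an illustration on synthetic data. We repeat the permutation 100 times, whereas the text describes a single scrambled run, and the function names are ours.

```python
import numpy as np

def r2_no_intercept(X, y):
    """Squared correlation between y and its least-squares fit (no constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.corrcoef(y, X @ coef)[0, 1] ** 2)

def scrambled_r2(X, y, n_rounds=100, seed=0):
    """r2 values obtained after randomly permuting the dependent variable."""
    rng = np.random.default_rng(seed)
    return [r2_no_intercept(X, rng.permutation(y)) for _ in range(n_rounds)]

# Synthetic stand-in for 47 compounds and 3 descriptors (not the paper's data).
rng = np.random.default_rng(3)
X = rng.normal(size=(47, 3))
y = X @ np.array([0.8, 0.5, -0.6]) + rng.normal(scale=0.3, size=47)

real = r2_no_intercept(X, y)
scrambled = scrambled_r2(X, y)
print(f"real r2 = {real:.3f}, mean scrambled r2 = {np.mean(scrambled):.3f}")
```

A model that captures a genuine relationship keeps a high r2 only with the true response. The relatively high scrambled r2 reported above for the ANN (0.561) illustrates how readily a flexible network fits even a shuffled response.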
Conclusion
Two important consequences emerge from the present
report.
Firstly, taking into account the complex nature of the
modeled biological phenomena, on the one hand, and the
large number of compounds analyzed, on the other hand,
our results clearly indicate that the molar refraction is of
prime importance for the toxicity of the organophosphorus
insecticide derivatives under study. In addition, the
approach used for the contributions and classification of
descriptors in MLR and ANN may be of help in QSAR
interpretations.
Secondly, these results revealed good stability of the
studied structure-toxicity relationships and confirm that
toxicity depends, to a great extent, on the structural
features of the insecticide.
References
1. Young AL (1987) Pesticides minimising the risk. In: Ragsdale
NN, Kuhr RJ (eds) ACS symposium series 336. American
Chemical Society, Washington, pp 10–40
2. Devillers J, Karcher W (1990) Environmental chemistry and
toxicology. In: Karcher W, Devillers J (eds) Kluwer Academic
Publishers, Dordrecht, pp 181–195
3. Bazoui H, Zahouily M, Sebti S, Boulajaaj S, Zakarya D (2002)
Structure-toxicity relationships study of a series of
organophosphorus insecticides. J Mol Model DOI 10.1007/s00894-001-0054-9
4. Livingstone DJ (1989) Pestic Sci 27:287–304
5. Nendza M (1991) Chemosphere 22:613–623
6. Vighi M, Garlanda MM, Calamari D (1991) Sci Total Environ
109/110:605–622
7. Büchel KH (1983) Chemistry of pesticides. John Wiley &
Sons, pp 48–124
8. Meister RT, Fitzgerald GT, Zilenziger A (1987) Farm chemi-
cal handbook. Meister Publishing Co, pp 42–208
9. Thomson TW (1972) Agricultural chemicals book I. Insecti-
cides, pp 160–266
10. Nys GG, Rekker RF (1974) Eur J Med Chem Ther 4:361–375
11. Pauling L (1960) The nature of the chemical bond, 3rd edn.
Cornell University Press, Ithaca NY, p 85
12. Yokoyama T, Taft RW, Kamlet MJ (1976) J Am Chem Soc
98:3233–3235
13. Weast RC (1988) Handbook of chemistry and physics, 1st edn.
CRC Press, p E 318
14. Bondi A (1964) J Phys Chem 68:441–451
15. Zakarya D, Rayadh A, Samih M, Lakhlifi T (1994) Tetrahe-
dron Lett 35:2345–2348
16. Randic M (1984) J Chem Inf Comput Sci 24:164–175
17. Tetko IV, Villa AEP, Livingstone DJ (1996) J Chem Inf Com-
put Sci 36:794–803
18. Wold S (1991) Quant Struct Act Relat 10:191–193
19. Data pro Qnet 2000 for Windows V2 K build 721 neural net-
work modelling. Vesta Services Inc, Winnetka, IL 60093, USA
20. Rumelhart DE, Hinton GE, Williams RJ (1986) Nature 323:
533–536
21. So S, Richards WG (1992) J Med Chem 35:3201–3207
22. Defernez M, Kemsley EK (1999) Analyst 124:1675–1681
23. Chastrette M, Zakarya D, Peyraud JF (1994) Eur J Med Chem
29:343–348
24. Cherquaoui D, Esseffar M, Villemin D, Cence JM, Chastrette
M, Zakarya D (1998) New J Chem 22:839–843
Table 4 Evaluating the impact of each descriptor in ANN and
MLR
Removed descriptor   C% (MLR)a   C% (ANN)a   r2 (MLR)b   s (MLR)b   r2 (ANN)b   s (ANN)b
S(R2)                34          30          0.3091      0.657      0.6708      0.444
HBA(R2)              23          28          0.6906      0.439      0.7242      0.407
MR(R1)               43          42          0.2116      0.810      0.3697      0.615
a Contribution (C%) of the descriptor given by the second method
described in the text.
b Given by the first method described in the text.
... 24 Subsequent QSTR studies for predicting different types of pesticides (amide herbicides and organophosphorus insecticides, respectively) were reported with acceptable statistical parameters. [25][26][27] In a recent study, Wang et al. have developed a set of QSTR models for estimating the acute toxicity of organophosphate pesticides against rats and mice via multiple administration routes. 28 Previous QSAR/QSTR studies on the ecotoxicity of pesticides still have some limitations. ...
... Numerous models were based on a relatively small dataset, and there has been a lack of chemical structural diversity, which limited their applications. 21,[25][26][27] Although some models were derived from a large dataset, they used uninterpretable or non-transparent algorithms or descriptors, 17,18 and thus, are not transparent and transferable to potential users and regulators. 29 Another limitation is the research object, which is generally limited to only aquatic species or terrestrial species previously. ...
Article
As farming activities increase consistently, pesticides, a class of toxic and harmful chemicals, are widely present in the environment and thus pose a potential risk on ecosystem and human health...
... The R 2 value for the test set is 0.33, which means that these models are characterized by low power external prediction. A very marked improvement in R 2 coefficient was obtained following the QSAR models developed with 44, 54, 67, 30 and 62 pesticides by Zakaria et al. [27], Eldred and Jurs [28], Zahouily et al. [29], Guo et al. [30] and Garcıa-Domenech et al. [31], respectively. Recent studies devoted to pesticides [32,33] have proposed QSAR models with values of 0.93 (27 herbicides) and 0.96 (62 herbicides) for the R 2 coefficient. ...
... Structure-toxicity relationships were studied for a set of 47 insecticides with three-layer perceptron and use of a backpropagation algorithm [29]. A model with three descriptors showed good statistics in the artificial neural network model with a configuration of 3/5/1 (r = 0.966, RMS = 0.200 and Q 2 = 0.647). ...
Chapter
Thousands of environmental pollutants including pesticides , issued from human activities, are accumulated in the environment making a source of danger for the whole ecosystem. Also, the risk assessment process has become a vital and necessary discipline in the legislation to ensure that these pollutants pose no risk or negligible risk to human health, wildlife and the whole ecosystem. The risk assessment carried out for the three natural compartments, namely the terrestrial, the aquatic environment and air, is usually based on experimental studies whose cost is especially high in terms of money, time and laboratory animals. Thus, regulatory agencies are turning to the search for alternative methods less expensive, reliable and fast, which may have a power to predict the potential risks of chemical pollutants. One such toxicological predictive approach is obtained by the development of quantitative models of structure-activity relationships (QSAR). They provide the means for estimating the toxicity of a variety of chemicals in the absence of experimental data on toxicity. In this chapter, a review of publications dedicated to pollution by pesticides and their effects on the entire ecosystem is described. The general principles of the development and validation of QSAR models are also described. Then a critical review of QSAR models published in the literature to date for the prediction of the toxicity of pesticides is also covered.
... Given the large number of 63 descriptors used to code each molecule, we subjected our data to Stepwise stepwise selection [14,15,16], (2) The statistical quality of the equation is very good, it explains 73% of the total variance, and it's higher than the other models described in the literature if we take into account the number of descriptors used. It explains up to 73% of the total variance with a standard error "s" much lower than the average error made on the observed values of Log (LD50) which is of the order of 0.741 for an interval ranging from 0.66 to 3.45. ...
Article
Full-text available
Structure-Toxicity Relationships have been studied for a set of 42 organophosphorous pesticides (OPs) through multiple linear regression (MLR) and artificial neural networks (ANN). A model with three descriptors, including: total lipophilicity [log (P)], widths radicals R1 [(LR1)] and R2 [(LR2)] has achieved good results in phase Training and phase prediction of toxicity [log LD50 (lethal dose 50, Oral rat)]. The linear model (MLR: n=40, r²=0.86, s=40 and q2 = 0.66) and non-linear model with a configuration [3-6-1] (ANN: r²=0.95, s=0.73 and q2 = 0.17) have proved very successful and complementary. The selected descriptors indicate the importance of lipophilicity and widths radicals R1 and R2 in the contribution of the toxicity of pesticides derived from OPs used in this study. This information is relevant for the design of a new model of non-toxic pesticides OPs.
... For the dataset, although the number of compounds in our model is small, it is the most comprehensive data on OPs with rat acute oral toxicity data at present (Zahouily et al., 2002;Devillers, 2004;Guo et al., 2006;García-Domenech et al., 2007). Additionally, the previous scaffold analysis results demonstrate a high degree of structural diversity in our OPs dataset. ...
Article
Organophosphates (OPs) are highly toxic compounds, with widespread application in agricultural and chemical industries, whose introduction into the environment poses serious hazards to humans and ecological systems. To assess and ultimately mitigate these hazards, this study predicted the acute toxicity of OPs according to their chemical structure and administration route. The acute toxicity data of 161 OPs in two species via six different administration routes were manually collected and used to develop a series of quantitative structure–toxicity relationship (QSTR) models with robust and practical predictive abilities. The random forest algorithm was used to develop the models, employing both quantum chemical and two-dimensional descriptors according to OECD guidelines. Correlation results and feature similarities indicated that whereas acute toxicity data from rats and mice via the same administration route were combinable for modeling, data from different routes were not. Six QSTR models for each route in a single species and two QSTR models for a single route in the two species were constructed, achieving practical predictive performance. Despite significant variances in their datasets, the prediction models could predict the acute toxicity of novel or unknown OPs, realize rapid assessment, and provide guidance for regulatory decisions to reduce the hazards of OPs.
... These findings highlight different mode of action and the nature of these pesticides. It has been well documented that the toxicity of pesticides is varied due to their chemical structures and their mode of actions (Cao et al., 2018;Hamadache et al., 2016;Kaushik and Kaushik, 2007;Mesnage et al., 2018;Zahouily et al., 2002). ...
Article
Pesticides exposure can have harmful effects on human health. The liver is the most common organ of pesticides toxicity due to its major metabolic activity. The molecular mechanism of pesticides effect is complex and is controlled by gene regulatory networks. All components of regulatory networks are controlled by transcription factors and other regulatory elements. Therefore, identification of key regulators through system biology approaches and high-throughput techniques can help to provide comprehensive insights into molecular mechanisms of the pesticide effect. In the current study, a microarray data-set was used to potentially identify molecular mechanisms that regulate gene expression profile of rat hepatocyte cell lines in response to pesticides exposure. Results showed that the number of differentially expressed genes (DEGs) and differentially expressed transcription factors (DE-TFs) were dramatically different among pesticides tested. Results also revealed 205 common DEGs and 11 DE-TFs among pesticides tested. Additionally, we found that six DE-TFs (CREB1, CTNNB1, PPARG, SP1, SRF and STAT3) had the highest number of interactions with other DEGs and acted as the key regulatory genes. The results of this study revealed regulator genes that have the key functions in response to pesticides toxicity in rat liver, which can provide the basis for future studies. Furthermore, these regulatory genes can be used as toxicity biomarkers to improve diagnosis and prognosis.
... Given the large number of 63 descriptors used to code each molecule, we subjected our data to Stepwise stepwise selection [14,15,16], (2) The statistical quality of the equation is very good, it explains 73% of the total variance, and it's higher than the other models described in the literature if we take into account the number of descriptors used. It explains up to 73% of the total variance with a standard error "s" much lower than the average error made on the observed values of Log (LD50) which is of the order of 0.741 for an interval ranging from 0.66 to 3.45. ...
Article
Full-text available
p>Structure-Toxicity Relationships have been studied for a set of 42 organophosphorous pesticides (OPs) through multiple linear regression (MLR) and artificial neural networks (ANN). A model with three descriptors, including: total lipophilicity [log (P)], widths radicals R<sub>1</sub> [(LR<sub>1</sub>)] and R<sub>2</sub> [(LR<sub>2</sub>)] has achieved good results in phase Training and phase prediction of toxicity [log LD50 (lethal dose 50, Oral rat)]. The linear model (MLR: n=40, r²=0.86, s=40 and q<sup>2</sup> = 0.66) and non-linear model with a configuration [3-6-1] (ANN: r²=0.95, s=0.73 and q<sup>2</sup> = 0.17) have proved very successful and complementary. The selected descriptors indicate the importance of lipophilicity and widths radicals R<sub>1</sub> and R<sub>2</sub> in the contribution of the toxicity of pesticides derived from OPs used in this study. This information is relevant for the design of a new model of non-toxic pesticides OPs.</p
... The majority models were based on non linear techniques like (MLP-ANN, kNN, RA etc) of different size datasets and very few MLR models are available mainly for small dataset. [25][26][27][28][29][30][31] The problems of lack of validation of models (local models) as well as limited applicability domain (global models) were already being highlighted by many researchers. [16,32] Martin et al had given emphasis on distribution of dataset by their mechanism for further model development as well as to improve prediction power on validation set. ...
Article
The toxic potentials of carbamates to human and non target organisms are of public concern in relation to society and ecosystem for their unregulated and indiscriminate use. No computational study was found on rat and mouse oral toxicity for carbamate pesticides. In this context, carbamate pesticides were collected from ChemIDplus databases for the modeling study. A series of local QSTR model for both rat and mouse oral toxicity of carbamate derivatives were developed according to OECD principle from 2D descriptors by using Genetic Algorithm (GA) as feature selection chemometric tools using QSARINS software. All the models indicate the importance of auto correlation descriptors related to charge, I‐State, atom type E state for fragment −O− in relation to acute mammalian toxicity. Reliability of predictions of the models was verified by applicability domain (AD) and prediction reliability index. Finally developed models were applied to unknown carbamate pesticides to evaluate their predictions and AD. The toxic nature of the prioritized compounds with structural alerts were commented in a consensus way. Additional toxicity‐toxicity relationship studies (QTTR) between these two responses with similar findings promoted further application of QTTR models in absence of one response. These findings may help the scientific community in prioritizing potentially hazardous pesticides of carbamate and related classes.
... The flow diagram of the GA-MLR model is presented in Figure 4. The statistical quality of the multiple regression equations was examined by different parameters like correlation coefficient r, standard error of regression s and Fischer ratio F. All accepted the MLR equations have regression coefficients and F ratios significant at 95% and 99% levels, respectively (Geiss and Frazier, 2001;Zahouily et al., 2002;Zahouily et al., 2006). The generated QSAR equations were validated by leave-one out, LOO, statistics and cross-validation q 2 (Asim and Roy, 2009;Lui et al., 2008;Cumming et al., 2005). ...
Article
Full-text available
Quantitative Structure Activity Relationships (QSAR) were studied for a series of 54 1-(3, 3-diphenylpropyl)-piperidinyl amides and ureas derivatives by means of Multiple Linear Regression (MLR), Genetic Algorithm (GA) and Artificial Neural Network (ANN) techniques. The values of pIC50 (dose of compound required to reduce the proliferation of normal uninfected cells by 50%) of the studied compounds were correlated with the descriptors or variables encoding the chemical structures. An approach that combines GA and MLR (GA-MLR) was used to select the pertinent descriptors to explain the activity pIC50. The descriptors revealed by GA-MLR were used to characterise the non-linear aspect in the activity parameter. The results obtained from this study indicate that the activity pIC50 is strongly dependent on the highest occupied molecular orbital, molecular weight, molecular volume, molar refractivity and LogP parameters.
Article
In this study, Quantitative Structure-Activity/property relationships (QSAR/QSPR) by means of multiple linear regressions (MLR) was performed to investigate the relationship between the 48 compounds of Temephos (Tem) drivatives and their bioactivities against acetylcholinesterase (AChE) of Tribolium castaneum. Two compounds Propylene-N, N′-bis (O, O′-di Ethyl Phosphorothioat (12) and Ethylene N, ′N- di Methyl bis (O, O′- di Phenyl Phosphoramidate (24) had the most mortality on the T. castaneum. The compound 24 (IC50=34.54 ppm) is a good alternative to Tem. Also, QSAR calculations indicated that the electrostatic characteristics of the most effective insecticide are applied. In docking data, Tem derivatives with the backbone of P (O)-NH−P(O), P(O)-NH-NH−P(O) and P(O)-X−P(O) are located in the active site gorge of both AChE and butyrylcholinesterase (BChE) so as to maximize the favorable contacts. These compounds relate to enzymes by non-covalent interactions such as hydrogen bonding, electrostatic and hydrophobic (as Trp82 and Trp286). Also, these cannot reach the end of the hole because of compounds are locked in the middle between the peripheral and acyl pocket site in AChE and BChE enzymes. The results of MLR and GA-QSAR/QSPR model of human ChE showed that topological parameters affect the inhibitory potencies of the compounds on both of logK and p(IC50). logK and p(IC50) results have a good linear relationship.
Artificial neural networks (ANNs) can be utilized to generate predictive models of quantitative structure-activity relationships (QSAR) between a set of molecular descriptors and activity. In the present work, QSAR analysis for a set of 95 1-[(2-hydroxyethoxy)methyl]-6-(phenylthio)thymine (HEPT) derivatives has been investigated by means of a three-layered neural network (NN). It has been shown that the NN can be a useful tool for QSAR analysis compared with the models given in the literature. The QSAR models obtained with the NN show not only good statistical significance in fitting but also high predictive ability (0.916 < r < 0.968 and q2 = 0.8779). The relevant factors controlling the anti-HIV-1 activity of HEPT derivatives have been identified. The results are along the same lines as those of our previous studies on HEPT derivatives and indicate the importance of the hydrophobic parameter in modelling the QSAR for HEPT derivatives.
Quantitative structure-activity relationships (QSARs) are analyzed for evaluating the side effects of phenylurea compounds on mammals. Available QSARs derived from substructures or solely from partition properties significantly underestimate their toxicity (oral rat LD50). Predictive QSARs are presented using partitioning and electronic descriptors derived from quantum mechanical calculations. The combination of log POW and the ionization potential results in satisfactory estimates of phenylurea rat LD50 values for hazard assessment purposes. None of the phenylureas was significantly underestimated in toxicity. Using log POW and hardness, a model was derived that is very well suited for screening experimental data in order to recognize outliers, for which either spurious test results or a different mode of action have to be assumed.
Models of the relationships between structure and musk odour of tetralin and indan compounds were elaborated with a multilayer neural network using the back-propagation algorithm. The neural network was used to classify the compounds studied into two categories (musk or non-musk). The cross-validation procedure was used to assess the predictive power of the network. Each molecule was described by eight global parameters: five steric and three electronic descriptors. The neural network's results compared favourably with those given by the k-Nearest Neighbours and the Bayesian methods, both in the classification and prediction tests. The contribution of each descriptor to the structure-odour relationships was evaluated. Three out of the eight descriptors were thus found to be the most relevant in the molecular description for the prediction of musk odour. This research points out that neural networks are likely to become a useful technique for structure-odour relationships.
Complex data analysis is becoming more easily accessible to analytical chemists, including natural computation methods such as artificial neural networks (ANNs). Unfortunately, in many of these methods, inappropriate choices of model parameters can lead to overfitting. This study concerns overfitting issues in the use of ANNs to classify complex, high-dimensional data (where the number of variables far exceeds the number of specimens). We examine whether a parameter rho, equal to the ratio of the number of observations in the training set to the number of connections in the network, can be used as an indicator to forecast overfitting. Networks possessing different rho values were trained using as inputs either raw data or scores obtained from principal component analysis (PCA). A primary finding was that different data sets behave very differently. For data sets with either abundant or scant information related to the proposed group structure, overfitting was little influenced by rho, whereas for intermediate cases some dependence was found, although it was not possible to specify values of rho which prevented overfitting altogether. The use of a tuning set, to control termination of training and guard against overtraining, did not necessarily prevent overfitting from taking place. However, for data containing scant group-related information, the use of a tuning set reduced the likelihood and magnitude of overfitting, although not eliminating it entirely. For other data sets, little difference in the nature of overfitting arose from the two modes of termination. Small data sets (in terms of number of specimens) were more likely to produce overfit ANNs, as were input layers comprising large numbers of PC scores. Hence, for high-dimensional data, the use of a limited number of PC scores as inputs, a tuning set to prevent overtraining and a test set to detect and guard against overfitting are recommended.
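The ratio rho described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `count_connections` and `rho` are hypothetical, the network is assumed to be fully connected and feed-forward, and whether bias terms count as connections is left as an option because the abstract does not say. The layer sizes and the training-set size in the usage lines are illustrative values, not figures taken from that study.

```python
def count_connections(layer_sizes, include_biases=False):
    """Count the connections of a fully connected feed-forward network.

    layer_sizes lists the number of units per layer, input first.
    Counting bias terms is optional (an assumption, see lead-in).
    """
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    biases = sum(layer_sizes[1:]) if include_biases else 0
    return weights + biases


def rho(n_training, layer_sizes, include_biases=False):
    """Ratio of training observations to network connections."""
    return n_training / count_connections(layer_sizes, include_biases)


if __name__ == "__main__":
    layers = [3, 5, 1]      # illustrative three-layer configuration
    n_training = 47         # illustrative number of training observations
    print(count_connections(layers))                       # 3*5 + 5*1 = 20
    print(count_connections(layers, include_biases=True))  # 20 + 5 + 1 = 26
    print(rho(n_training, layers))                         # 47 / 20 = 2.35
```

A small rho means that the network has many adjustable connections per training observation. The abstract reports that no value of rho prevented overfitting altogether, so a number like this is at most a rough indicator.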
Commentary on errors in an earlier article on the nature of the chemical bond. Keywords (Audience): First-Year Undergraduate / General
Structure-musk odor relationships were established by means of a 3-layer neural network (NN) using the back-propagation algorithm. To test the reliability of the NN approach, structure-odor relationships were established for a set of 53 tetralins and for subsets of 45 and 41 tetralins, and tested using a set of 15 indans and the remaining tetralins (8 or 12 compounds) as test sets. Each molecule of the training set was described by 7 variables coding substituents of 6 free sites of the tetralin or indan rings (6 steric hindrance descriptors and one electronegativity descriptor). Odor was coded by a binary variable. Training the NN with 53 tetralins gave 100% correct classification and 100% good prediction for the test with indans in all the trials. With the subsets of 45 and 41 tetralins, the classification ability of the NN was in all cases higher than 96.4% and its prediction ability higher than 92.6%. The contributions of the descriptor variables to the classification were evaluated according to different methods. The obtained results confirm the well-known effects of the steric hindrance of the functional group and of substituents in the ortho position.
Principal Component Analysis (PCA) was carried out on the basis of property descriptors for a set of α-halogenated ketones and aldehydes. The obtained PCA model showed that the chemical behaviour of the compounds with 1-ethoxy-3-trimethylsilylprop-1-yne is related to the steric effect of the structural environment of the C=O group. In addition to the PCA model, the steric effect was demonstrated quantitatively with regression analysis models.
Enhanced solvatochromic shifts in hydrogen-bond acceptor (HBA) solvents for 2-nitroaniline, 2-nitro-p-toluidine, and 2-nitro-p-anisidine relative to their N,N-dimethyl derivatives show good linear correlation with the β-scale of solvent HBA basicities. Reciprocally, the new experimental results are used to expand the data base which supports the β-scale. Solvatochromic comparison between N-methyl- and N,N-dimethyl-2-nitro-p-toluidine shows hydrogen bonding in the former compound to be intramolecular in all solvents studied.
Intermolecular van der Waals radii of the nonmetallic elements have been assembled into a list of "recommended" values for volume calculations. These values have been arrived at by selecting from the most reliable X-ray diffraction data those which could be reconciled with crystal density at 0 K (to give reasonable packing density), gas kinetic collision cross section, critical density, and liquid state properties. A qualitative understanding of the nature of van der Waals radii is provided by correlation with the de Broglie wavelength of the outermost valence electron. Tentative values for the van der Waals radii of metallic elements (in metal organic compounds) are proposed. The paper concludes with a list of increments for the volume of molecules impenetrable to thermal collision, the so-called van der Waals volume, and of the corresponding increments in area per molecule.