Accurate calculations of the hydration free energies of druglike molecules
using the reference interaction site model
David S. Palmer, Volodymyr P. Sergiievskyi, Frank Jensen, and Maxim V. Fedorov a)
Max Planck Institute for the Mathematics in the Sciences, Inselstrasse 22, DE-04103 Leipzig,
Germany and Department of Chemistry, Aarhus University, Langelandsgade 140, 8000 Aarhus C, Denmark
(Received 30 April 2010; accepted 8 June 2010; published online 22 July 2010)
We report on the results of testing the reference interaction site model (RISM) for the estimation of
the hydration free energy of druglike molecules. The optimum model was selected after testing of
different RISM free energy expressions combined with different quantum mechanics and empirical
force-field methods of structure optimization and atomic partial charge calculation. The final model
gave a systematic error with a standard deviation of 2.6 kcal/mol for a test set of 31 molecules
selected from the SAMPL1 blind challenge set [J. P. Guthrie, J. Phys. Chem. B 113, 4501 (2009)].
After parametrization of this model to include terms for the excluded volume and the number of
atoms of different types in the molecule, the root mean squared error for a test set of 19 molecules
was less than 1.2 kcal/mol. © 2010 American Institute of Physics. [doi:10.1063/1.3458798]
I. INTRODUCTION
Accurate calculation of the hydration free energies of organic molecules is a long-standing
challenge in computational chemistry and is important in many aspects of research in the
pharmaceutical and agrochemical industries. For example, many of the pharmacokinetic properties
of potential drug molecules are defined by their in vivo solvation and acid-base behavior, which
can be estimated from their hydration free energies [1–7].
Commonly used methods to calculate hydration free energy may be categorized as either explicit
or implicit solvent models. In the first approach, each solvent molecule is included explicitly
and molecular simulation methods are used to sample their conformational freedom [1,2,8–13]. In
the second approach, the effect of solvent on solute is included implicitly by solving either the
Poisson–Boltzmann (PB) or the generalized-Born (GB) equation [14–18]. While explicit solvent
methods are more scientifically rigorous, implicit models are often preferred because they are
less computationally expensive. Both methods have recently been subjected to a blind test for the
calculation of hydration free energies of druglike molecules (SAMPL1 test) [19]. The best
predictions were in the range $\mathrm{RMSE}_{\mathrm{pred}} = 2.5$–$3.5$ kcal mol$^{-1}$, which
equates to an error of ~2 log units in the related pharmacokinetic property (estimated from
$\Delta G_{\mathrm{solv}} = -RT \ln K$) [20–22]. Clearly, the current methods to calculate
hydration free energy are not accurate enough for modern pharmaceutical research. Furthermore,
because these methods had been available for some time before the first benchmark on druglike
molecules was published, they have often been used blindly beyond their domain of applicability.
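The conversion between a free energy error and a "log unit" error quoted above follows directly from $\Delta G_{\mathrm{solv}} = -RT \ln K$: an error $\delta$ in $\Delta G$ corresponds to an error of $\delta / (RT \ln 10)$ in $\log_{10} K$. The short sketch below illustrates this arithmetic. The function name is hypothetical, and the temperature of 300 K is an assumption borrowed from the RISM calculations in this paper rather than a value stated for the SAMPL1 comparison.

```python
import math

# Gas constant in kcal/(mol K)
R_KCAL = 1.987204e-3


def free_energy_error_to_log_units(error_kcal_per_mol, temperature_k=300.0):
    """Convert a free energy error (kcal/mol) into the equivalent error in log10(K).

    Uses Delta G = -RT ln K, so d(log10 K) = d(Delta G) / (RT ln 10).
    The default temperature is an assumption made for illustration.
    """
    return error_kcal_per_mol / (R_KCAL * temperature_k * math.log(10.0))


if __name__ == "__main__":
    for err in (2.5, 3.5):
        log_units = free_energy_error_to_log_units(err)
        print(f"{err} kcal/mol corresponds to about {log_units:.2f} log units")
```

Under these assumptions, the quoted RMSE range maps to roughly two log units, in line with the estimate in the text.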
Integral equation theory is an alternative framework for the calculation of hydration free
energies [23–25]. Unlike PB or GB methods, it retains information about the solvent structure (in
terms of density correlation functions), but estimates the solute chemical potential without long
molecular dynamics (MD) or Monte Carlo (MC) simulations. At present, there are several approaches
based on integral equations. The molecular Ornstein–Zernike theory is used to calculate the
three-dimensional (3D) hydration structure in molecular liquids [26,27]. The site-site
Ornstein–Zernike (SSOZ) integral equation is used to calculate the properties of complex
solute-solvent systems in the reference interaction site model (RISM) formalism developed by
Chandler and others [28–30]. The theory has been applied successfully to calculate the structural
and thermodynamic properties of various chemical and biological systems [31–40].
Despite a recent resurgence in interest in biomedical applications of the integral equation
theory [41–46], the methods were not represented in the recent SAMPL1 blind test and have not been
adequately tested for druglike molecules. In this work, we test several previously used RISM
methods combined with molecular geometries and partial charges calculated using different quantum
mechanics (QM) and empirical force-field (FF) methods for a subset of the SAMPL1 dataset of
druglike molecules.
II. THEORY
A. RISM
The RISM, initially introduced by Chandler and Andersen [28], permits the description of the
thermodynamics of infinitely dilute solutions by a set of integral equations. A complete
description of the RISM may be found in Ref. 25. Here we give only the basic definitions that are
needed to calculate hydration free energies. In the RISM approach, both the solute and the solvent
molecules are treated as sets of sites with spherically symmetric properties. In the simplest
case, the sites are just the atoms of the molecules. In this paper, we use the so-called
one-dimensional (1D) RISM approach [25], where solute and solvent interactions are described by
spherically symmetric site-to-site functions. Using this approach, one operates only with the
radial parts of these functions and, therefore, all the numerical tasks are one dimensional (which
leads to a significant reduction in computational cost).
a) Author to whom correspondence should be addressed. Tel.: +49 (0) 341 9959 756. Electronic mail: fedorov@mis.mpg.de.
THE JOURNAL OF CHEMICAL PHYSICS 133, 044104 (2010)
0021-9606/2010/133(4)/044104/11/$30.00 © 2010 American Institute of Physics
Three types of site-site correlation functions are considered in the RISM: intramolecular
correlation functions, total correlation functions, and direct correlation functions.
Intramolecular correlation functions describe the structure of the molecule. For two sites, $s$
and $s'$, of one molecule, the intramolecular correlation function is
$$\omega_{ss'}(r) = \frac{\delta(r - r_{ss'})}{4\pi r_{ss'}^{2}}, \tag{1}$$
where $r_{ss'}$ is the distance between the sites and $\delta(r - r_{ss'})$ is the Dirac delta
function. Total correlation functions $h_{s\alpha}(r)$ and direct correlation functions
$c_{s\alpha}(r)$ are defined for each pair of solute and solvent sites ($s$ and $\alpha$,
respectively). The total correlation functions can be expressed as
$h_{s\alpha}(r) = g_{s\alpha}(r) - 1$, where $g_{s\alpha}(r)$ is the radial distribution function
of solvent sites around the solute sites. Bulk solvent total correlation functions
$h_{\xi\alpha}^{\mathrm{bulk}}(r)$ are also considered and represent the distribution of sites
$\xi$ of solvent molecules around the site $\alpha$ of the selected solvent molecule. Direct
correlation functions $c_{s\alpha}(r)$ are calculated using the set of RISM equations for the case
of infinitely dilute solution [25],
$$h_{s\alpha}(r) = \sum_{s'\xi} \left\langle \omega_{ss'} * c_{s'\xi} * \left[ \omega_{\xi\alpha}^{\mathrm{bulk}} + \rho\, h_{\xi\alpha}^{\mathrm{bulk}} \right] \right\rangle (r), \tag{2}$$
$$s = 1, \ldots, N_{\mathrm{solute}}, \quad \alpha = 1, \ldots, N_{\mathrm{solvent}}, \quad r \in [0; \infty).$$
Here $\langle x * y \rangle(r)$ is the radial part of the spherically symmetric three-dimensional
convolution $\langle x * y \rangle(r) = \int_{\mathbb{R}^3} x(\mathbf{r} - \mathbf{r}')\, y(\mathbf{r}')\, d\mathbf{r}'$,
and $\rho$ is the number density of the bulk solvent. To complete the set of RISM equations, one
needs to use a closure relationship, which has the general form
$$c_{s\alpha}(r) = e^{\Xi_{s\alpha}(r) - B_{s\alpha}(r)} - h_{s\alpha}(r) + c_{s\alpha}(r) - 1, \tag{3}$$
where $\Xi_{s\alpha}(r) = -\beta u_{s\alpha}(r) + h_{s\alpha}(r) - c_{s\alpha}(r)$,
$u_{s\alpha}(r)$ is the atom-atom potential, $B_{s\alpha}(r)$ is a so-called bridge function
[25,47], $\beta = 1/k_B T$, $k_B$ is the Boltzmann constant, and $T$ is the temperature (in our
calculations we used $T = 300$ K). The case $B(r) \equiv 0$ corresponds to the frequently used
hypernetted chain closure [25,47]. However, the RISM equations with hypernetted chain closure did
not converge for many molecules in the investigated set (the poor convergence of RISM with
$B(r) \equiv 0$ has been reported previously [25,31,48]). Therefore, to improve convergence of our
algorithm, we used the partially linearized hypernetted chain closure (PLHNC) [49,50],
$$c_{s\alpha}(r) = \begin{cases} e^{\Xi_{s\alpha}(r)} - h_{s\alpha}(r) + c_{s\alpha}(r) - 1, & \Xi_{s\alpha}(r) < 0 \\ -\beta u_{s\alpha}(r), & \Xi_{s\alpha}(r) > 0. \end{cases} \tag{4}$$
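As an illustration of how the PLHNC closure acts inside an iterative solver, the following minimal sketch evaluates it on a radial grid. It is not the RISM-MOL implementation; it assumes that the iteration variable is the indirect correlation function $\gamma = h - c$, so that $\Xi = -\beta u + \gamma$. The exponential branch then gives $h = e^{\Xi} - 1$ and the linearized branch gives $h = \Xi$, that is, $c = -\beta u$. The function and argument names are hypothetical, and the point $\Xi = 0$ (where both branches give $h = 0$) is assigned to the exponential branch.

```python
import numpy as np


def plhnc_closure(beta_u, gamma):
    """Evaluate the PLHNC closure for arrays of beta*u and gamma = h - c.

    Returns (h, c). For Xi <= 0 the HNC form h = exp(Xi) - 1 is used;
    for Xi > 0 the linearized form h = Xi is used, which gives c = -beta*u.
    """
    beta_u = np.asarray(beta_u, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    xi = -beta_u + gamma
    # np.minimum keeps the exponential from overflowing in the branch
    # that np.where discards.
    h = np.where(xi > 0.0, xi, np.expm1(np.minimum(xi, 0.0)))
    c = h - gamma
    return h, c
```

Linearizing the closure where $\Xi > 0$ avoids the exponential growth of $h$ in regions of strong attraction, which is the usual motivation given for its better convergence compared with the plain hypernetted chain closure.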
In our calculations, the intramolecular correlation functions $\omega_{ss'}(r)$ were found from
Eq. (1). We used the modified SPC/E water model (MSPC/E) proposed by Lue and Blankshtein in
Ref. 51. The MSPC/E model differs from the original SPC/E water model [52] by the additional
Lennard-Jones (LJ) potential for the water hydrogen, which was modified to prevent possible
divergence of the algorithm [38,53–55]. The total correlation functions of the bulk solvent
$h_{\xi\alpha}^{\mathrm{bulk}}(r)$ were obtained from the previous work of some of us [56], where
these functions were calculated by the RISM equations for solvent-solvent correlations [25] and
the wavelet-based algorithm for integral equations [56–59].
For all solutes, we used the LJ parameters from the OPLS-2005 force-field [60,61]. We obtained the
solute partial charges by different QM methods (see below).
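In practice, the convolutions in Eq. (2) are usually evaluated in reciprocal space, where the delta-shell function of Eq. (1) has the closed form $\hat{\omega}_{ss'}(k) = \sin(k r_{ss'}) / (k r_{ss'})$. The sketch below builds this matrix from Cartesian site coordinates. It is an illustration only: the function name and array layout are assumptions, and the source does not state that RISM-MOL constructs the matrix in this way.

```python
import numpy as np


def intramolecular_matrix_k(coords, k):
    """Fourier-space intramolecular correlation matrix for one molecule.

    coords: array of shape (n_sites, 3) with site positions (e.g. in angstrom).
    k:      array of shape (n_k,) with wave numbers (inverse length units).
    Returns an array of shape (n_k, n_sites, n_sites) whose elements are
    sin(k * r_ss') / (k * r_ss'), with the limit 1 for k * r_ss' = 0.
    """
    coords = np.asarray(coords, dtype=float)
    k = np.asarray(k, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)  # (n_sites, n_sites)
    # np.sinc(x) is sin(pi x) / (pi x), so the argument is divided by pi.
    return np.sinc(k[:, None, None] * dist[None, :, :] / np.pi)
```

The diagonal elements are equal to one for every $k$, which reflects the self-term $r_{ss} = 0$ of Eq. (1).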
The set of RISM equations (2), together with the closure relation (4), allows us to find the
functions $h_{s\alpha}(r)$ and $c_{s\alpha}(r)$, which are used to calculate hydration free
energies. There are no known methods to solve the set of RISM equations analytically in the
general case. Thus, in most cases the RISM equations are solved numerically. In Ref. 62, it was
shown that RISM-like equations for monatomic particles can be effectively solved using a multigrid
technique [63]. In the current work, we use the RISM-MOL solver, which is the MATLAB realization
of the multigrid algorithm for solving RISM equations [62]. The RISM-MOL solver is one of the
recent developments of the Computational Physical Chemistry and Biophysics Group of the
Max-Planck-Institute for Mathematics in the Sciences. Since this program has not previously been
reported in the literature, we give a short description of it in the Appendix of this article.
Within the RISM theory, there are several expressions which allow one to obtain values of the
hydration free energy from the total and direct correlation functions $h_{s\alpha}(r)$ and
$c_{s\alpha}(r)$. In our work, we compare the accuracy of four of the most popular free energy
expressions [55,64–66]. The first expression is the hypernetted-chain (HNC) approximation [64], in
which the formula for the hydration free energy is
$$\Delta G_{\mathrm{HNC}} = 2\pi\rho kT \sum_{s\alpha} \int_0^{\infty} \left[ -2 c_{s\alpha}(r) - h_{s\alpha}(r) \left( c_{s\alpha}(r) - h_{s\alpha}(r) \right) \right] r^2\, dr. \tag{5}$$
The hypernetted-chain method with repulsive bridge correction (HNCB) proposed by Kovalenko and
Hirata in Ref. 55 has a modified form of this hydration free energy expression. Here we adopted
the HNCB expression from the previous work [55] for the case of PLHNC closure.
The hydration free energy expression for the HNCB is [55]
$$\Delta G_{\mathrm{HNCB}} = \Delta G_{\mathrm{HNC}} + 4\pi\rho kT \sum_{s\alpha} \int_0^{\infty} \left( h_{s\alpha}(r) + 1 \right) \left( e^{-B_{s\alpha}^{R}(r)} - 1 \right) r^2\, dr. \tag{6}$$
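Here $\{B_{s\alpha}^{R}(r)\}$ are repulsive bridge correction functions, defined for each pair of
solute $s$ and solvent $\alpha$ atoms by the expression
$$\exp\left( -B_{s\alpha}^{R}(r) \right) = \prod_{\xi \neq \alpha} \left\langle \omega_{\alpha\xi}^{\mathrm{bulk}} * \exp\left( -\beta \epsilon_{s\xi} \left( \frac{\sigma_{s\xi}}{r} \right)^{12} \right) \right\rangle, \tag{7}$$
where $\omega_{\alpha\xi}^{\mathrm{bulk}}(r)$ are the solvent intramolecular correlation
functions, and $\sigma_{s\xi}$ and $\epsilon_{s\xi}$ are the site-site parameters of the pairwise
Lennard-Jones potential.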
The third expression is the Gaussian fluctuation (GF) approximation [65–67], in which the free
energy is given as
$$\Delta G_{\mathrm{GF}} = 2\pi\rho kT \sum_{s\alpha} \int_0^{\infty} \left( -2 c_{s\alpha}(r) - c_{s\alpha}(r) h_{s\alpha}(r) \right) r^2\, dr. \tag{8}$$
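Expressions (5) and (8) are simple radial quadratures once $h_{s\alpha}(r)$ and $c_{s\alpha}(r)$ are known on a grid. The sketch below evaluates both with the trapezoidal rule. It is a hedged illustration, not the code used in this work: the array layout `(n_solute, n_solvent, n_r)`, the function names, and the choice of quadrature are all assumptions.

```python
import numpy as np


def _radial_integral(integrand, r):
    """Trapezoidal estimate of the integral of integrand(r) * r**2 along the last axis."""
    y = integrand * r**2
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * np.diff(r), axis=-1)


def delta_g_hnc(h, c, r, rho, kT):
    """HNC hydration free energy, Eq. (5); h and c have shape (n_solute, n_solvent, n_r)."""
    integrand = -2.0 * c - h * (c - h)
    return 2.0 * np.pi * rho * kT * np.sum(_radial_integral(integrand, r))


def delta_g_gf(h, c, r, rho, kT):
    """Gaussian fluctuation hydration free energy, Eq. (8)."""
    integrand = -2.0 * c - c * h
    return 2.0 * np.pi * rho * kT * np.sum(_radial_integral(integrand, r))
```

The two integrands differ only by the term $h_{s\alpha}^2(r)$, so $\Delta G_{\mathrm{HNC}} - \Delta G_{\mathrm{GF}} = 2\pi\rho kT \sum_{s\alpha} \int_0^\infty h_{s\alpha}^2(r)\, r^2\, dr \geq 0$; the GF value is therefore never larger than the HNC value for the same correlation functions.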
The final hydration free energy equation we consider is the partial wave (PW) expression [65,68].
It has previously been demonstrated to be one of the most accurate methods for calculating
hydration free energies of simple organic molecules within the RISM framework [38,65,66]. The PW
expression for hydration free energy is
$$\Delta G_{\mathrm{PW}} = \Delta G_{\mathrm{GF}} + 2\pi\rho kT \sum_{s\alpha} \int_0^{\infty} \tilde{h}_{s\alpha}(r)\, h_{s\alpha}(r)\, r^2\, dr, \tag{9}$$
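where $\tilde{h}_{s\alpha}(r) = \sum_{s'\xi} \langle \tilde{\omega}_{ss'} * h_{s'\xi} * \tilde{\omega}_{\xi\alpha}^{\mathrm{bulk}} \rangle$;
$\tilde{\omega}_{ss'}$ and $\tilde{\omega}_{\xi\alpha}^{\mathrm{bulk}}$ are the elements of
matrices that are inverse to the matrices $W = (\omega_{ss'})$ and
$W^{\mathrm{bulk}} = (\omega_{\xi\alpha}^{\mathrm{bulk}})$, which are built from the solute and
solvent intramolecular correlation functions $\omega_{ss'}$ and
$\omega_{\xi\alpha}^{\mathrm{bulk}}$, respectively.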
We note that there exists a more sophisticated version of the RISM, the 3D RISM, which operates
with three-dimensional solute-solvent correlation functions [69–72]. This theory is based on
three-dimensional analogs of Eqs. (2) and (4) (see, e.g., Refs. 25 and 72 for details). We do not
use this approach in this work for two reasons. First, the use of 3D convolution in the 3D RISM
calculations makes them very computationally expensive even when advanced numerical approaches
and parallel programming are used to speed up the calculations [72]. The computational cost makes
3D RISM difficult to use for screening of large series of compounds and testing different
theoretical models (e.g., in this work we performed about $10^3$ different RISM calculations to
choose the best parameters for hydration free energy prediction). Second, the large computational
cost associated with the 3D convolution operations in the 3D RISM approach limits the finite grid
spacing to ~0.5 Å and the potential cutoff to 8–10 Å [72,73], which complicates taking integrals
in the 3D analogs of the hydration free energy expressions (5) and (8) (Ref. 73) and affects the
numerical accuracy of these calculations. Nevertheless, we consider the 3D RISM to be a promising
approach and plan to extend our model to 3D RISM in the near future, although significant
developments of the numerical part of the model are required to make the 3D RISM more feasible
for practical applications. A detailed investigation of the performance of hydration free energy
calculations using 1D RISM and 3D RISM approaches will be the subject of future work in our group.
III. MATERIALS AND METHODS
A. Dataset
The methods discussed above were tested on hydration free energy data for 31 organic molecules
taken from the SAMPL1 dataset (Table I) [19]. The molecules in this dataset present a stringent
test of methods to calculate hydration free energy because they are significantly larger and more
complex (i.e., they contain more functional groups) than those previously considered as benchmarks
for hydration free energy calculations.
In Table I, the experimental hydration free energies (in kcal/mol) are given for all 31 molecules
as reported in the original SAMPL1 publication. The data are tabulated as
$\Delta G_{\mathrm{hydr}} = -RT \ln(c_{\mathrm{aq}} / c_{\mathrm{gas}})$, with concentrations in
mol/l, which corresponds to the choice of standard states proposed by Ben-Naim. The experimental
data are given as grand mean averages taken from multiple sources from the published literature.
Details of the methods used to compile the dataset are given in Ref. 19 and will not be
recapitulated here. For 21 molecules, the experimental hydration data were reported with estimated
experimental uncertainties, which range from 0.1 to 0.44 kcal/mol with a median value of
0.1 kcal/mol. The hydration free energies of the remaining 10 molecules were calculated from
solubility and vapor pressures; if these data were reported without experimental uncertainties,
they were arbitrarily assigned values of 1 log unit. As such, the reported experimental
uncertainties for these molecules are pessimistic estimates, as has previously been noted by other
authors [19].
In the present study, we work with only 31 of the 63 molecules in the full SAMPL1 dataset in order
to make it computationally feasible to test a large number of different free energy expressions
and molecular geometry and partial charge methods. The 31 molecules were selected at random from
the full dataset. The selected molecules contain between 8 and 27 heavy atoms each and have
molecular weights ranging from 119 to 426 atomic mass units.

TABLE I. Hydration free energy data and SAMPL1 identification code for the 31 molecules used in
the current work.

ID         Molecule                     ΔG_hydr (kcal/mol)   Error (kcal/mol)
Cup08002   1,2-dinitroxypropane          -4.95               0.1
Cup08004   2-butyl nitrate               -1.82               0.1
Cup08005   Isobutyl nitrate              -1.88               0.1
Cup08007   Alachlor                      -8.21               0.29
Cup08009   Ametryn                       -7.65               0.45
Cup08016   Carbofuran                    -9.61               0.3
Cup08020   Chlorimuronethyl             -14.01               1.93
Cup08021   Chloropicrin                  -1.45               0.1
Cup08024   Diazinon                      -6.48               0.13
Cup08025   Dicamba                       -9.86               1.93
Cup08026   Dichlobenil                   -4.71               1.93
Cup08028   Dinoseb                       -6.23               1.93
Cup08029   Endosulfan alpha              -4.23               0.26
Cup08030   Endrin                        -4.82               0.1
Cup08033   Heptachlor                    -2.55               0.1
Cup08034   Isophorone                    -5.18               1.37
Cup08035   Lindane                       -5.44               0.1
Cup08038   Methyparathion                -7.19               0.1
Cup08041   Nitroxyacetone                -5.99               0.1
Cup08043   Parathion                     -6.74               0.1
Cup08044   Pebulate                      -3.64               1.93
Cup08045   Phorate                       -4.37               0.1
Cup08048   Propanil                      -7.78               1.93
Cup08050   Simazine                     -10.22               0.1
Cup08052   Terbacil                     -11.14               1.93
Cup08053   Terbutryn                     -6.68               0.42
Cup08057   Vernolate                     -4.13               1.36
Cup08058   4-amino-4-nitroazobenzene    -11.24               0.44
Cup08018   Chlordane                     -3.44               0.1
Cup08047   Prometryn                     -8.43               0.1
Cup08032   Fenuron                       -9.13               1.93
B. Geometry optimization and atomic partial charge calculation
Estimates of hydration free energies obtained by the RISM are sensitive to the input molecular
geometry and to the calculated atomic partial charges. In this work, we tested a variety of
different classical (force-field) and quantum mechanical methods for their calculation.
Molecular structures were obtained for each molecule in the SAMPL1 test set from the supporting
information of Ref. 19. As a preliminary preparation of these structures, a low-mode
conformational search was carried out for each molecule in both gas and aqueous phases using the
OPLS-2005 force-field [60,61] in MACROMODEL v.9.1 [74], where aqueous solvent was simulated using
the generalized-Born surface area approximation [75]. The global minimum energy conformers were
used as input to each of the following geometry optimization and partial charge calculations.
First, molecular geometries were optimized using the B3LYP hybrid density functional and 6-31G**
basis set with diffuse orbitals for heavy atoms and hydrogen [76] in vacuum and in aqueous solvent
simulated separately by two different hydration models: the polarizable continuum model (PCM)
[77–79] and the conductorlike continuum model (CPCM) [79,80], implemented in GAUSSIAN 03 [81]. All
electronic structure calculations were carried out in GAUSSIAN 03 RevE.01 [81], unless otherwise
stated. For each of the optimized molecular geometries, atomic partial charges were estimated by
seven different methods: CHELP (Charges from Electrostatic Potentials) [82], CHELPG (grid-based
CHELP) [83], ESP (Merz–Kollman Electrostatic Potential charges) [84,85], CHELP-DIPOLE,
CHELPG-DIPOLE, ESP-DIPOLE, and natural population analysis (NPA) [86]. The CHELP, CHELPG, and ESP
methods calculate atomic partial charges that reproduce the electrostatic potential on grid points
outside the van der Waals surface of the molecule. The suffix "-DIPOLE" indicates that the atomic
partial charges are also constrained to reproduce the molecular dipole. In NPA, atomic partial
charges are obtained by decomposing the molecular wave function into atomic contributions. Since
each atomic partial charge calculation was repeated for three different solvation models (vacuum,
PCM, CPCM), we have 3 × 7 = 21 geometry and atomic partial charge sets calculated using density
functional theory (DFT).
Second, the molecular geometries were optimized using Hartree–Fock (HF) theory and the 6-31G**
basis set in vacuum. As for the DFT calculations, seven different partial charge estimations were
carried out.
Third, AM1-BCC and AM1-Mulliken atomic partial charges were calculated using MOPAC [87,88] and
ANTECHAMBER [89,90]. AM1-BCC charges are evaluated by applying an empirical bond charge correction
(BCC) scheme to AM1-Mulliken charges. Here we use the BCC parameters derived by Jakalian et al.
[91], which were fitted by these authors to make the AM1-BCC charges match the electrostatic
potential at the HF/6-31G* level.
Finally, the geometries and partial charges calculated by the OPLS2005 force-field in both vacuum
and aqueous solvent during the low-mode conformational search were used as an additional set of
parameters. In total, we have 21 + 7 + 2 + 2 = 32 different pairs of molecular geometries and
atomic partial charges for each molecule. For each of these sets, the hydration free energy was
calculated using four different RISM free energy expressions (HNC, HNCB, GF, and PW). In total,
this gives 32 × 4 = 128 different combinations of free energy calculation methods. To identify the
selected methods, we list the slash-separated names of the QM method, hydration model, partial
charge method, and RISM expression, for example, B3LYP/PCM/CHELPG-DIPOLE/PW.
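The bookkeeping behind the 128 combinations can be sketched by enumerating the labels explicitly. The listing below follows the counts given above (21 DFT sets, 7 HF sets, 2 AM1 sets, 2 force-field sets, 4 RISM expressions). The exact label strings used for the AM1 and force-field sets (for example "AM1/gas/..." and "FF/GBSA/...") are assumptions made for illustration; the paper specifies the naming scheme but not every label.

```python
from itertools import product

CHARGE_METHODS = [
    "CHELP", "CHELPG", "ESP",
    "CHELP-DIPOLE", "CHELPG-DIPOLE", "ESP-DIPOLE", "NPA",
]
RISM_EXPRESSIONS = ["HNC", "HNCB", "GF", "PW"]


def structure_charge_sets():
    """Return the 32 (method, hydration model, charge scheme) triples."""
    # 3 solvation models x 7 charge schemes at the B3LYP level
    sets = [("B3LYP", solv, q)
            for solv, q in product(["gas", "PCM", "CPCM"], CHARGE_METHODS)]
    # 7 charge schemes at the HF level, vacuum only
    sets += [("HF", "gas", q) for q in CHARGE_METHODS]
    # Two semiempirical charge sets (labels are hypothetical)
    sets += [("AM1", "gas", q) for q in ["AM1-BCC", "AM1-Mulliken"]]
    # Force-field geometries and charges in vacuum and in GBSA water
    # (the "GBSA" label is hypothetical)
    sets += [("FF", solv, "OPLS2005") for solv in ["gas", "GBSA"]]
    return sets


def method_labels():
    """Return the slash-separated labels for all method combinations."""
    return ["/".join(s + (expr,))
            for s, expr in product(structure_charge_sets(), RISM_EXPRESSIONS)]


if __name__ == "__main__":
    print(len(structure_charge_sets()), len(method_labels()))
```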
For each combination of methods, the values of the hy-
dration free energies of the 31 molecules from Table I were
calculated. The best of these models were then parametrized
to improve predictions of the hydration free energy using
separate training and independent test sets.
C. Statistical modeling
1. Error calculation
For all molecules of the dataset (see Table I), hydration free energy values were calculated using
different structure optimization methods, partial charge models, and RISM free energy formulas
(HNC, HNCB, GF, and PW). To compare calculated and experimental results, the root mean squared
deviation (RMSD) was evaluated,
$$\mathrm{RMSD}(\Delta G, \Delta G_{\mathrm{expt}}) = \sqrt{\frac{1}{N} \sum_{i} \left( \Delta G^{(i)} - \Delta G_{\mathrm{expt}}^{(i)} \right)^2}, \tag{10}$$
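where index $i$ runs through the set of $N$ selected molecules, and $\Delta G^{(i)}$ and
$\Delta G_{\mathrm{expt}}^{(i)}$ are the calculated and the experimental hydration free energy
values of molecule $i$, respectively. The total deviation can be split into two parts: the mean
displacement (M) and the standard deviation (SD), which are calculated by the formulas
$$M(\Delta G - \Delta G_{\mathrm{expt}}) = \frac{1}{N} \sum_{i \in S} \left( \Delta G^{(i)} - \Delta G_{\mathrm{expt}}^{(i)} \right), \tag{11}$$
$$\mathrm{SD}(\Delta G - \Delta G_{\mathrm{expt}}) = \sqrt{\frac{1}{N} \sum_{i \in S} \left( \Delta G^{(i)} - \Delta G_{\mathrm{expt}}^{(i)} - M(\Delta G - \Delta G_{\mathrm{expt}}) \right)^2}. \tag{12}$$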
The mean displacement gives the systematic error, which can be corrected by a simple constant
term. The standard deviation gives the random error that is not explained by the model. One can
see the connection between these three formulas,
$$\mathrm{RMSD}(\Delta G, \Delta G_{\mathrm{expt}})^2 = M(\Delta G - \Delta G_{\mathrm{expt}})^2 + \mathrm{SD}(\Delta G - \Delta G_{\mathrm{expt}})^2. \tag{13}$$
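The three error measures and the identity in Eq. (13) can be checked with a few lines of code. The sketch below is a minimal illustration with a hypothetical function name; note that the standard deviation uses the $1/N$ normalization of Eq. (12), not the $1/(N-1)$ sample estimate.

```python
import numpy as np


def error_statistics(dg_calc, dg_expt):
    """Return (RMSD, mean displacement M, standard deviation SD), Eqs. (10)-(12)."""
    diff = np.asarray(dg_calc, dtype=float) - np.asarray(dg_expt, dtype=float)
    rmsd = np.sqrt(np.mean(diff**2))
    mean_disp = np.mean(diff)
    # Population standard deviation (1/N), as in Eq. (12)
    sd = np.sqrt(np.mean((diff - mean_disp) ** 2))
    return rmsd, mean_disp, sd


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from this work
    rmsd, m, sd = error_statistics([1.0, 2.0, 4.0], [0.0, 0.0, 0.0])
    print(rmsd**2, m**2 + sd**2)  # the two values agree, as stated by Eq. (13)
```

A large RMSD with a small SD therefore signals an error that is mostly a constant shift, which is the situation exploited by the empirical corrections introduced in the next subsection.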
2. Fitting formula
In Ref. 38, it was shown that when excluded volume-based correction terms are included in the
RISM/PW formula, the accuracy of the calculated hydration free energies for simple nonpolar
organic solutes improves considerably. This result suggests that excluded volume corrections
should also be useful for improving the prediction of the RISM for the SAMPL1 molecule set. In
this case, we calculate the excluded volume of the solute in infinitely dilute aqueous solution as
a limiting case of the partial molar volume formula [92] when the solute density tends to zero,
$$V_{\mathrm{ex}} = \frac{1}{\rho} + \frac{4\pi}{N_{\mathrm{solute}}} \sum_{s} \int_0^{\infty} \left( h_{OO}^{\mathrm{bulk}}(r) - h_{so}(r) \right) r^2\, dr. \tag{14}$$
Here $h_{OO}^{\mathrm{bulk}}(r)$ is the total oxygen-to-oxygen correlation function of bulk water
and $h_{so}(r)$ is the total correlation function between the solute site $s$ and the water
oxygen.
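Equation (14) is again a radial quadrature over the correlation functions. The following minimal sketch evaluates it with the trapezoidal rule; the function name, argument layout, and quadrature are assumptions made for illustration rather than a description of the code used in this work.

```python
import numpy as np


def excluded_volume(h_oo_bulk, h_so, r, rho):
    """Excluded volume of Eq. (14).

    h_oo_bulk: array (n_r,), bulk water oxygen-oxygen total correlation function.
    h_so:      array (n_solute, n_r), solute site to water oxygen total
               correlation functions (a single site may be passed as (n_r,)).
    r:         array (n_r,), radial grid.
    rho:       number density of the bulk solvent (inverse volume units).
    """
    r = np.asarray(r, dtype=float)
    h_so = np.atleast_2d(np.asarray(h_so, dtype=float))
    integrand = (np.asarray(h_oo_bulk, dtype=float)[None, :] - h_so) * r**2
    # Trapezoidal rule along the radial axis, one integral per solute site
    integrals = np.sum(0.5 * (integrand[:, 1:] + integrand[:, :-1]) * np.diff(r), axis=1)
    n_solute = h_so.shape[0]
    return 1.0 / rho + 4.0 * np.pi / n_solute * np.sum(integrals)
```

As a sanity check on the formula, a solute whose sites all correlate with water oxygen exactly as a bulk water oxygen does yields $V_{\mathrm{ex}} = 1/\rho$, the volume per solvent molecule.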
In Ref. 38, it is discussed that the RISM formulas may systematically overestimate the hydration
free energy of small organic compounds that contain certain types of functional groups, e.g.,
charged groups or hydroxyl groups. The authors introduced group contribution terms to correct for
these systematic errors. In a similar manner, additional functional group corrections might be
required for calculations of the larger molecules considered here. Due to the structures of the
molecules from the SAMPL1 set, however, there is no single obvious way to separate them by
functional groups. Therefore, to be consistent, we used atom type rather than functional group
corrections. The 31 molecules given in Table I contain hydrogen, carbon, nitrogen, oxygen,
chlorine, phosphorus, and sulfur atoms. Thus, the fitting formula is
$$\Delta G_{\mathrm{corr}}(b) = \Delta G_{\mathrm{RISM}} + b_V V_{\mathrm{ex}} + \sum_{j} b_j n_j, \tag{15}$$
where $j$ runs over all atom types, $j \in \{H, C, N, O, Cl, P, S\}$, $n_j$ is the number of atoms
of type $j$ in the molecule, and $b = \{b_V, b_H, b_C, b_N, b_O, b_{Cl}, b_P, b_S\}$ are the
coefficients to be fitted on the training molecule set. To parametrize the empirical model, we
partitioned the 31-molecule SAMPL1 subset into separate training and independent test sets. As a
training set, we chose 12 molecules, which are listed in Table II. As one can see, the minimum
fitting condition is satisfied: for each atom type there is at least one molecule in the training
set that contains atoms of this type. The test set comprised the remaining 19 molecules given in
Table I, which are not in the 12-molecule training set given in Table II. The coefficients
$b = \{b_V, b_H, b_C, b_N, b_O, b_{Cl}, b_P, b_S\}$ in formula (15) were fitted to minimize the
root mean squared deviation $\mathrm{RMSD}(\Delta G_{\mathrm{expt}}, \Delta G_{\mathrm{corr}}(b))$
on the training set molecules.
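Because Eq. (15) is linear in the coefficients $b$, minimizing the RMSD on the training set is an ordinary linear least-squares problem for the residual $\Delta G_{\mathrm{expt}} - \Delta G_{\mathrm{RISM}}$. The sketch below shows one way to set this up with NumPy. It is an illustration under stated assumptions: the function names are hypothetical, and the source does not say which fitting routine was actually used.

```python
import numpy as np

ATOM_TYPES = ["H", "C", "N", "O", "Cl", "P", "S"]


def design_matrix(v_ex, atom_counts):
    """Columns: excluded volume, then the atom counts n_j in the order of ATOM_TYPES."""
    v_ex = np.asarray(v_ex, dtype=float)
    atom_counts = np.asarray(atom_counts, dtype=float)
    return np.column_stack([v_ex, atom_counts])


def fit_corrections(dg_rism, dg_expt, v_ex, atom_counts):
    """Least-squares estimate of b = (b_V, b_H, b_C, b_N, b_O, b_Cl, b_P, b_S)."""
    X = design_matrix(v_ex, atom_counts)
    y = np.asarray(dg_expt, dtype=float) - np.asarray(dg_rism, dtype=float)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b


def apply_corrections(dg_rism, v_ex, atom_counts, b):
    """Corrected hydration free energies according to Eq. (15)."""
    return np.asarray(dg_rism, dtype=float) + design_matrix(v_ex, atom_counts) @ b
```

With 12 training molecules and 8 coefficients the fit has few degrees of freedom, which is one reason the cross-validation checks described in the next subsection are needed.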
3. Validation of the fitting results
Because we have relatively small test and training sets, the small error on the test set by itself
was not enough to validate the formula. An additional validation procedure was needed. First, a
standard analysis of the variance (t-test and F-test) [93] was performed to make sure that both
experimental and corrected calculated results have the same mean values and standard deviations.
Second, the coefficients of determination ($R^2$) were calculated to check the strength of
correlation between the corrected calculated and the experimental results. To check that the
fitting coefficients are not dependent on the choice of the training set, three additional tests
were performed: (i) leave-one-out cross-validation, (ii) leave-five-out cross-validation, and
(iii) comparison of the coefficients obtained by fitting to the training set and to the full set.
In the leave-one-out test, we perform a series of fittings using training sets that consist of the
initial 31-molecule test set from Table I with one molecule extracted. For all possible choices of
the extracted molecule, we have 31 different sets of fitting coefficients,
$$b^{(k)} = \{ b_V^{(k)}, b_H^{(k)}, b_C^{(k)}, b_N^{(k)}, b_O^{(k)}, b_{Cl}^{(k)}, b_P^{(k)}, b_S^{(k)} \}, \quad k = 1, \ldots, 31. \tag{16}$$
We compute the relative standard deviation of each type of fitting coefficient,
$$\delta b_j = \frac{\mathrm{SD}(\{b_j^{(k)}\})}{\left| M(\{b_j^{(k)}\}) \right|} \times 100\%, \tag{17}$$
where $j$ is the type of the coefficient: $j \in \{V, H, C, N, O, Cl, P, S\}$. The values
$\delta b_j$ show the sensitivity of the coefficient $b_j$ to the choice of training set. Low
$\delta b_j$ values indicate that coefficient $b_j$ is not arbitrary and we can trust its value,
while high $\delta b_j$ values indicate physically unreliable coefficients. The leave-five-out
test is similar to leave-one-out, but the training sets are constructed by excluding 5 molecules
from the initial 31-molecule test set given in Table I.
TABLE II. The number of times each atom type occurs in each molecule of the training set.

ID         n_H   n_C   n_N   n_O   n_Cl   n_P   n_S
Cup08002     6     3     2     6     0      0     0
Cup08009    17     9     5     0     0      0     1
Cup08021     0     1     1     2     3      0     0
Cup08024    21    12     2     3     0      1     1
Cup08026     3     7     1     0     2      0     0
Cup08029     6     9     0     3     6      0     1
Cup08032    12     9     2     1     0      0     0
Cup08034    14     9     0     1     0      0     0
Cup08035     6     6     0     0     6      0     0
Cup08038    10     8     1     5     0      1     1
Cup08044    21    10     1     1     0      0     1
Cup08057    21    10     1     1     0      0     1
Because the number of possible choices of 5 molecules among 31 is quite large
($C_{31}^{5} = 169\,911$), we randomly chose 1000 such extractions and calculated the values
$\delta b_j$ for them.
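The leave-one-out and leave-five-out tests differ only in how many molecules are removed and in whether all subsets or a random sample of them are used. The sketch below implements both with a single function and evaluates $\delta b_j$ from Eq. (17). It assumes that the fit is a linear least-squares problem with a design matrix `X` and target vector `y` (hypothetical names), which follows from the linear form of Eq. (15) but is not a description of the authors' code.

```python
import math
from itertools import combinations

import numpy as np


def relative_sd(coeff_sets):
    """delta b_j of Eq. (17): SD / |M| * 100% over the fitted coefficient sets."""
    coeff_sets = np.asarray(coeff_sets, dtype=float)
    mean = coeff_sets.mean(axis=0)
    sd = np.sqrt(((coeff_sets - mean) ** 2).mean(axis=0))  # 1/N normalization
    return 100.0 * sd / np.abs(mean)


def leave_k_out(X, y, k, n_samples=None, seed=0):
    """Refit after removing k rows; return delta b_j (in percent) per coefficient.

    If n_samples is None, all k-subsets are enumerated (practical for k = 1).
    Otherwise n_samples random k-subsets are drawn, as in the leave-five-out test.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    if n_samples is None:
        left_out = list(combinations(range(n), k))
    else:
        rng = np.random.default_rng(seed)
        left_out = [rng.choice(n, size=k, replace=False) for _ in range(n_samples)]
    coeffs = []
    for out in left_out:
        keep = np.setdiff1d(np.arange(n), out)
        b, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        coeffs.append(b)
    return relative_sd(coeffs)


if __name__ == "__main__":
    # Number of ways to choose 5 molecules out of 31
    print(math.comb(31, 5))
```

Random sampling keeps the cost at 1000 fits instead of the 169 911 needed for an exhaustive leave-five-out enumeration.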
In addition, the training set fitting coefficients were compared to the full set fitting
coefficients. In this case, we perform two different fittings. Using the full 31-molecule test set
from Table I for training, we obtain the fitting coefficients $b_j^{(\mathrm{full})}$,
$j \in \{V, H, C, N, O, Cl, P, S\}$. Using the training set from Table II, we obtain another set
of fitting coefficients $b_j^{(\mathrm{train})}$ and calculate the $\delta b_j^{*}$ values by the
formula
$$\delta b_j^{*} = \frac{\left| b_j^{(\mathrm{full})} - b_j^{(\mathrm{train})} \right|}{\left| b_j^{(\mathrm{full})} \right|} \times 100\%. \tag{18}$$
IV. RESULTS AND DISCUSSION
A. Analysis of calculated data
1. Models without empirical corrections
The hydration free energy values were calculated for the
31-molecule test set from Table I using 128 combinations of
RISM and structure calculation methods. One can find the
results of the calculations in Ref. 94. The comparison with
experiment shows quite high RMSD values for all methods.
The smallest error is about 5.6 kcal/mol.
95
However, if we
look at the differences between the calculated and experi-
mental results, we can see that they are not random. For
many combinations of QM/RISM methods, differences are
distributed around a mean value and the standard deviation is
reasonably small 共see Fig. 1兲. The smallest standard devia-
tion of the differences is achieved using the B3LYP/gas/
CHELPG-DIPOLE/PW methods and is about 2.6 kcal/mol,
which is comparable to the results of the SAMPL1 hydration
free energy predictions found in literature.
20–22
We see that
although the RISM predictions contain large systematic er-
rors, the free energies calculated using the RISM are well
correlated with the experimental values. To support this
point, we calculated the correlation coefficients for the ex-
perimental and calculated values for each combination of
methods. In Table III, correlation coefficients are listed for
the methods that give the smallest standard deviation of the
differences between the calculated and the experimental hy-
dration free energy values. Results of these methods are well
correlated with the experimental data 共for most of them cor-
relation coefficients are larger than 0.7兲. RMSD values, stan-
dard deviations, and correlation coefficients for all methods
−15 −10 −5
0
0
5
10
15
Δ G
ex
p
(kcal/mol)
Δ G
calc
−Δ G
exp
(kcal/mol)
RMSD=12.5
Mean =12.2
SD=2.6
B3LYP/gas/ChelpG−dipole/PW
FIG. 1. Systematic and random errors between the hydration free energies
calculated by the B3LYP/gas/CHELPG-DIPOLE/PW method and the ex-
perimental results.
TABLE III. 共a兲 RISM results with the smallest standard deviations of differences between experimental and
calculated hydration free energies. 共b兲 RISM results with the largest correlation coefficients.
QM level Solvation model Partial charges Formula
Standard deviation
共kcal/mol兲 Correlation coefficient
共a兲 Ten results with the smallest standard deviation
B3LYP Gas CHELPG-DIPOLE PW 2.599 0.749
B3LYP Gas CHELP-DIPOLE PW 2.642 0.685
B3LYP Gas CHELPG PW 2.647 0.744
B3LYP Gas CHELP PW 2.672 0.677
HF Gas CHELPG PW 3.132 0.769
HF Gas CHELPG-DIPOLE PW 3.187 0.766
FF Gas OPLS2005 PW 3.331 0.706
HF Gas CHELP-DIPOLE PW 3.391 0.688
HF Gas CHELP PW 3.459 0.679
B3LYP PCM CHELP PW 3.558 0.820
共b兲 Ten results with the highest correlation coefficients
B3LYP CPCM CHELPG-DIPOLE PW 3.595 0.869
B3LYP CPCM CHELPG PW 3.582 0.868
B3LYP PCM CHELPG-DIPOLE PW 3.581 0.868
B3LYP PCM CHELPG PW 3.568 0.868
B3LYP CPCM CHELPG-DIPOLE GF 7.370 0.830
B3LYP CPCM CHELPG GF 7.349 0.830
B3LYP PCM CHELPG-DIPOLE GF 7.347 0.829
B3LYP CPCM CHELP-DIPOLE PW 3.689 0.824
B3LYP PCM CHELP-DIPOLE PW 3.575 0.823
B3LYP CPCM CHELP PW 3.701 0.821
044104-6 Palmer et al. J. Chem. Phys. 133, 044104 共2010兲
are given in Ref. 94. Using only these preliminary results, we can already select the most and
least suitable methods. Good correlations with experiment are observed with both the HF and
B3LYP methods combined with CHELP, CHELPG, CHELP-DIPOLE, or CHELPG-DIPOLE charges. The
standard deviation for the OPLS2005 charges with the PW expression is about 3.3 kcal/mol, a
promising result for this level of theory. The smallest standard deviations between the
calculated and the experimental results are obtained with the PW RISM formula; the GF formula
gives intermediate results (the lowest standard deviation of error is about 5.3 kcal/mol),
while the HNC and HNCB free energy formulas give quite large deviations from experiment
(standard deviations of errors are larger than 8.8 kcal/mol). The methods for which we have
reported small standard deviations of the errors might be expected to be amenable to
parametrization (using, e.g., molecular volume and atom type variables).
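The statistics used in this comparison (the mean error as the systematic part, the standard deviation of the errors as the random part, and the correlation coefficient) can be computed from paired lists of calculated and experimental values. The following is a minimal sketch only: the function name `error_statistics` and the toy numbers are illustrative assumptions, and the sample (n − 1) form of the standard deviation is assumed because this section does not state which form was used.

```python
import math


def error_statistics(calc, expt):
    """Return (mean error, standard deviation of errors, Pearson r)."""
    if len(calc) != len(expt) or len(calc) < 2:
        raise ValueError("need two equally long lists with at least two values")
    n = len(calc)
    errors = [c - e for c, e in zip(calc, expt)]
    mean_err = sum(errors) / n  # systematic part of the error
    # Assumption: sample standard deviation (n - 1 in the denominator).
    std_err = math.sqrt(sum((d - mean_err) ** 2 for d in errors) / (n - 1))
    mean_c, mean_e = sum(calc) / n, sum(expt) / n
    cov = sum((c - mean_c) * (e - mean_e) for c, e in zip(calc, expt))
    var_c = sum((c - mean_c) ** 2 for c in calc)
    var_e = sum((e - mean_e) ** 2 for e in expt)
    return mean_err, std_err, cov / math.sqrt(var_c * var_e)


# Hypothetical values (kcal/mol): every calculated value is offset by +3.
expt = [-1.0, -4.0, -7.5, -10.0]
calc = [e + 3.0 for e in expt]
mean_err, std_err, r = error_statistics(calc, expt)
print(mean_err, std_err, r)
```

In this toy case the whole error is systematic (a constant offset), so the standard deviation of the errors vanishes and the correlation is perfect, which is the situation in which a simple empirical correction is most likely to help.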
2. Models with empirical corrections
For each combination of methods, the coefficients
$b = \{b_V, b_H, b_C, b_N, b_O, b_{Cl}, b_P, b_S\}$ in formula (15) were fitted
using the training set molecules from Table II. Each fitted formula was assessed using the test
set (comprising the remaining 19 molecules from the 31-molecule test set). The ten results with
the smallest RMSD on the test set are listed in Table IV. Fitting results for the other methods
are given in Ref. 94. Comparing Table III (smallest standard deviations) with Table IV (best
fitting results), we see the same set of structure optimization and partial charge methods. It
is interesting to note that although the GF formula gives much larger standard deviations than
the PW formula, after parametrization it produces results that are almost as good as those of
the PW formula. We also note that OPLS2005 force-field calculations combined with the PW
formula give good results after fitting (RMSD of about 2 kcal/mol) (Ref. 94).
The best combination of methods is HF/gas/CHELPG/PW. The calculated values of the RISM/PW
hydration free energies and the calculated excluded volumes are given in Ref. 94. After
fitting, the RMSD value for the 19-molecule test set is less than 1.2 kcal/mol. Differences
between the calculated and the experimental hydration free energies for this method are
presented in Fig. 2.
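The fitting step described above is a linear least-squares problem. The sketch below assumes that formula (15) has the additive form ΔG_corr = ΔG_RISM + b_V V + Σ_i b_i N_i with no constant term; that form is not restated in this section, so it should be read as an assumption. The function names, the reduced descriptor set (V, N_C, N_O), and all numbers are hypothetical.

```python
def solve_linear(a, y):
    """Solve the square system a x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    m = [row[:] + [y[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("descriptor matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_correction(descriptors, dg_rism, dg_expt):
    """Least-squares coefficients b for dg_expt ~ dg_rism + descriptors . b."""
    k = len(descriptors[0])
    residual = [e - r for e, r in zip(dg_expt, dg_rism)]
    # Normal equations: (D^T D) b = D^T residual
    dtd = [[sum(row[i] * row[j] for row in descriptors) for j in range(k)]
           for i in range(k)]
    dtr = [sum(row[i] * t for row, t in zip(descriptors, residual))
           for i in range(k)]
    return solve_linear(dtd, dtr)


def apply_correction(descriptors, dg_rism, b):
    """Corrected free energies for the assumed additive form."""
    return [g + sum(bi * di for bi, di in zip(b, row))
            for g, row in zip(dg_rism, descriptors)]


# Hypothetical training data: each row is (V, N_C, N_O) for one molecule.
descriptors = [[100.0, 2, 1], [150.0, 4, 0], [120.0, 3, 2],
               [200.0, 6, 1], [90.0, 1, 3]]
b_true = [-0.2, 1.4, 1.6]                # made-up coefficients
dg_rism = [12.0, 20.5, 14.0, 27.0, 9.5]  # made-up uncorrected values
dg_expt = apply_correction(descriptors, dg_rism, b_true)

b_fit = fit_correction(descriptors, dg_rism, dg_expt)
print([round(b, 6) for b in b_fit])
```

Because the synthetic "experimental" values are generated from known coefficients, the fit should recover them; with real data the same routine would be applied to the training set and then assessed on the held-out test set, as described in the text.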
In Table V, values of the fitting coefficients
$b = \{b_V, b_H, b_C, b_N, b_O, b_{Cl}, b_P, b_S\}$ are presented for the
HF/gas/CHELPG/PW method. To validate these coefficients, leave-one-out and leave-five-out
cross-validations were carried out, along with a comparison between the coefficients obtained
by fitting against either the training set or the full dataset. The deviations between the
different training sets are quite small. The highest deviations are about 11% for sulfur and
phosphorus (these are the rarest elements in the 31-molecule test set). The small relative
deviations mean that the formula changes little if a different training set is used, i.e., the
fitting coefficients are stable.
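The leave-one-out stability check can be sketched as follows. This is an illustration under stated assumptions: the deviation is taken here to be the largest relative change of each coefficient over all left-out molecules, which may differ from the definition of δb_j used for Table V, and a toy one-coefficient model (`fit_slope`, hypothetical) stands in for the full fit.

```python
def loo_relative_deviation(fit, rows):
    """Largest relative change (%) of each coefficient when one row is left out.

    fit  -- callable mapping a list of rows to a list of coefficients
    rows -- training data, one entry per molecule
    """
    full = fit(rows)
    worst = [0.0] * len(full)
    for i in range(len(rows)):
        reduced = fit(rows[:i] + rows[i + 1:])
        for j, (b_red, b_full) in enumerate(zip(reduced, full)):
            change = 100.0 * abs(b_red - b_full) / abs(b_full)
            worst[j] = max(worst[j], change)
    return worst


def fit_slope(rows):
    """Toy one-coefficient model y = b x, fitted by least squares."""
    sxy = sum(x * y for x, y in rows)
    sxx = sum(x * x for x, y in rows)
    return [sxy / sxx]


# Hypothetical (x, y) pairs scattered around y = 2 x.
rows = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
deviation = loo_relative_deviation(fit_slope, rows)
print(deviation)
```

Any fitting routine that maps training rows to a coefficient list can be passed as `fit`; a leave-five-out variant would loop over subsets of five rows instead of single rows.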
B. Comparison with other methods
Hydration free energies predicted by other methods for the 63-molecule SAMPL1 set are given in
Refs. 20–22. The trend in these results is that continuum models (which include some fitted
parameters) give RMS errors around 2.5 kcal/mol on the SAMPL1 set, while slightly higher RMSD
errors are reported for explicit solvent approaches. To provide a direct comparison to our
results, we have used the data given in Refs. 20–22 to recalculate the RMSD obtained by these
methods for the 19 molecules of our test set only (Table VI). For the HF/gas/CHELPG/PW method,
TABLE IV. The ten fitting results with smallest RMSDs for the 19-molecule test set.
QM level  Solvation model  Partial charges  Formula  RMSD (kcal/mol)  R^2    F-test  t-test
HF        Gas              CHELPG           PW       1.138            0.897  Passed  Passed
HF        Gas              CHELPG-DIPOLE    PW       1.161            0.894  Passed  Passed
B3LYP     Gas              CHELPG-DIPOLE    GF       1.250            0.877  Passed  Passed
B3LYP     Gas              CHELPG           GF       1.270            0.871  Passed  Passed
HF        Gas              CHELPG           GF       1.344            0.857  Passed  Passed
B3LYP     Gas              CHELPG-DIPOLE    PW       1.372            0.859  Passed  Passed
HF        Gas              CHELPG-DIPOLE    GF       1.375            0.850  Passed  Passed
B3LYP     Gas              CHELP-DIPOLE     GF       1.417            0.831  Passed  Passed
B3LYP     Gas              CHELPG           PW       1.434            0.846  Passed  Passed
AM1       Gas              BCC              PW       1.470            0.817  Passed  Passed
FIG. 2. The results for the best fitted model (HF/gas/CHELPG/PW). The
RMSD on the test set is 1.14 kcal/mol. [Plot: horizontal axis, ΔG_exp (kcal/mol); vertical
axis, ΔG_corr − ΔG_exp (kcal/mol); series, training set and test set; annotation,
RMSD = 1.138 kcal/mol.]
we obtained an RMSD on the test set of 1.14 kcal/mol, which is almost half of that reported
for the continuum models for the same molecules.
Of course, such comparisons are not completely fair because the results in Refs. 20–22 were
obtained without knowledge of the experimental hydration free energies, while the results in
the current paper were fitted to give the best performance on the SAMPL1 subset. However,
analysis of the performance of the different methods shows quite reasonable trends: the best
performing methods are those that use better levels of QM theory and better RISM hydration
free energy expressions (GF and PW). This indicates that good agreement with experiment is not
just a random result of statistical fitting but has a physical background. The authors realize
that the fitting procedure proposed in this paper needs to be improved and further validated
before it can be used for the accurate blind prediction of hydration free energies. However,
this paper illustrates a procedure by which an efficient RISM-based method for calculating
hydration free energies can be developed.
V. CONCLUSIONS
We have compared the performance of different models based on RISM theory for the calculation
of the hydration free energies of druglike molecules. The best models were identified among
128 possible combinations of four different RISM free energy expressions and 32 different sets
of molecular geometries and atomic partial charges.
TABLE V. Fitting coefficients for the HF/gas/CHELPG/PW method and their deviations during the
leave-one-out, leave-five-out, and training vs full fitting validations.
Coefficient  Value   One-left-test (δb_j) (%)  Five-left-test (δb_j) (%)  Train vs full fit δb_j* (%)
b_V          −0.233  0.904                     2.108                      1.396
b_H           0.599  3.217                     7.666                      5.233
b_C           1.383  1.544                     3.822                      2.309
b_N           2.193  1.431                     3.461                      1.531
b_O           1.629  1.470                     3.606                      7.738
b_Cl          2.687  1.621                     3.869                      1.040
b_P           4.867  4.150                     9.686                      11.564
b_S           4.460  2.061                     5.157                      11.030
TABLE VI. Comparison of hydration free energies for the 19-molecule test set calculated by different methods (kcal/mol).
Mol. ID     Expt.^a   RISM^b    SM6^c    SM8^c    SMD^c Klamt1^d Klamt2^e Sulea1^f Sulea2^g Sulea3^h
Cup08004      −1.82   −3.492    −0.40    −0.30     0.70     0.43     0.02     0.13     0.24    −1.46
Cup08005      −1.88   −3.391    −0.30    −0.20     0.60     0.13     0.13     0.17     0.26    −1.52
Cup08007      −8.21   −9.878    −6.10    −6.30    −8.40    −7.66    −8.02    −9.23    −9.01    −7.54
Cup08016      −9.61   −9.859   −12.20   −12.30   −10.90   −10.97   −11.15    −8.64    −8.35    −9.23
Cup08018      −3.44   −3.142    −3.00    −2.70    −4.40      ...      ...    −2.33    −2.88    −2.06
Cup08020     −14.01  −14.178   −27.00   −26.30   −23.10   −17.61   −17.59   −21.53   −20.67   −21.59
Cup08025      −9.86   −9.965    −8.00    −7.90    −6.80    −9.46    −9.46    −7.63    −7.73    −8.08
Cup08028      −6.23   −7.212    −9.70    −9.60    −8.30    −4.54    −4.54    −4.12    −4.26    −6.60
Cup08030      −4.82   −3.924    −6.30    −5.60    −4.70    −7.34    −7.34    −4.47    −5.11    −5.01
Cup08033      −2.55   −2.667    −2.10    −1.80    −2.30    −5.91    −5.91    −0.62    −1.08    −0.77
Cup08041      −5.99   −6.086    −5.40    −5.10    −3.50    −3.81    −3.97    −7.04    −7.01    −6.94
Cup08043      −6.74   −6.181    −6.50    −7.90    −6.30    −7.65    −7.65    −5.84    −5.86    −7.51
Cup08045      −4.37   −2.514    −4.10    −6.80    −7.20    −4.71    −4.71    −3.19    −3.53    −4.44
Cup08047      −8.43   −7.292    −7.10    −8.30    −7.90    −8.15    −8.17    −9.36    −8.71    −8.44
Cup08048      −7.78   −7.193    −8.50    −8.60    −7.60    −8.94    −8.94    −8.40    −8.20    −7.95
Cup08050     −10.22   −9.074   −10.00   −11.10   −11.20    −9.74    −9.74    −9.91    −9.14    −8.68
Cup08052     −11.14   −9.266    −8.90    −9.60    −9.20   −11.27   −11.27   −15.67   −15.35   −14.47
Cup08053      −6.68   −7.337    −8.40    −9.40    −8.10    −7.38    −7.63   −10.00    −9.34    −9.26
Cup08058     −11.24  −12.923   −13.80   −13.10   −11.40      ...      ...   −13.36   −14.12   −16.27
RMSD                   1.108     3.40     3.30     2.65     1.76     1.73     2.53     2.33     2.45
^a Experimental data (Ref. 19).
^b HF/gas/CHELPG/PW RISM method (with correction).
^c SM6, SM8, and SMD models (Ref. 20).
^d Original prediction (Ref. 21).
^e Prediction after cross merging (Ref. 21).
^f Model {(1,0.9),16} (Ref. 22) (supporting information).
^g Model {(1,0.9),25} (Ref. 22) (supporting information).
^h Model {(2,1.0),25} (Ref. 22) (supporting information).
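The RMSD row of Table VI can be recomputed directly from the tabulated columns. The short sketch below does this for the RISM column using the Expt. and RISM values as printed in the table; the function name `rmsd` is an illustrative choice.

```python
import math

# Expt. and RISM (with correction) columns of Table VI, in kcal/mol.
expt = [-1.82, -1.88, -8.21, -9.61, -3.44, -14.01, -9.86, -6.23, -4.82, -2.55,
        -5.99, -6.74, -4.37, -8.43, -7.78, -10.22, -11.14, -6.68, -11.24]
rism = [-3.492, -3.391, -9.878, -9.859, -3.142, -14.178, -9.965, -7.212,
        -3.924, -2.667, -6.086, -6.181, -2.514, -7.292, -7.193, -9.074,
        -9.266, -7.337, -12.923]


def rmsd(calc, ref):
    """Root mean squared deviation between two equally long sequences."""
    if len(calc) != len(ref):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((c - r) ** 2 for c, r in zip(calc, ref)) / len(ref))


print(f"RMSD = {rmsd(rism, expt):.3f} kcal/mol")  # prints "RMSD = 1.108 kcal/mol"
```

The result agrees with the 1.108 kcal/mol entry in the RMSD row. The same function applied to any other column (skipping the molecules marked "..." for the Klamt models) gives the corresponding entry for that method.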
The RISM calculations were validated against experimental data taken from the SAMPL1 dataset.
Since these data were originally published as part of a blind challenge to calculate hydration
free energies, this has permitted us to compare our results with those of the best implicit
and explicit solvent approaches.
Although we observe that hydration free energies calculated with RISM theory contain
significant absolute errors, for the best methods tested here, these are found to be dominated
by large systematic errors, while the random errors are considerably smaller. Using the best
free energy expression (PW) combined with the best structure determination methods (HF or
B3LYP with CHELPG/CHELPG-DIPOLE charges and AM1 with BCC charges), the random errors in the
calculated hydration free energies were approximately 2.6 kcal/mol, which is comparable to
results obtained by the best implicit and explicit solvent methods. After parametrization
using an excluded volume term and simple atom counts, the RMSD calculated by the best model
(HF/gas/CHELPG/PW) was less than 1.2 kcal/mol, which is about half the error reported by
continuum models for the same molecules.
Hydration free energies calculated by RISM theory have traditionally been considered to be too
inaccurate to be useful in practical applications such as pharmaceutical drug design. However,
these assumptions have been based on publications that have tested the HNC, HNCB, or related
free energy expressions. The results presented here show that the PW or GF expressions allow
relatively accurate calculations of hydration free energies, which may be systematically
improved by the addition of a small number of simple empirical parameters.
The RISM calculations based on the HNC expression give inaccurate estimates of hydration free
energies because they overestimate the energy required to form a cavity in the solvent and
underestimate the electrostatic contribution to the hydration free energy of hydrogen bonding
sites (Refs. 38, 65, and 66). In principle, it might be possible to eliminate some of these
errors through the design of an appropriate bridge function, but this is presently an open
problem in the integral equation theory of molecular liquids.
The results presented here indicate that the qualitatively correct results obtained by the
best RISM expressions can be improved by an empirical fitting procedure to yield very accurate
quantitative predictions of hydration free energies. The optimum model (HF/gas/CHELPG/PW) is
considerably less computationally expensive than explicit solvent approaches for estimating
hydration free energy. The results suggest that, after further development, RISM theory has
the potential to be widely beneficial in practical applications such as pharmaceutical drug
discovery and drug development.
ACKNOWLEDGMENTS
This work was supported by the Villum Kahn Rasmussen foundation through a postdoctoral grant
to D.S.P. Computations were made possible through grants from the Lundbeck Foundation, the
Novo Nordisk Foundation, the Carlsberg Foundation, and the Danish Center for Scientific
Computing. We thank Gennady N. Chuev and Andrey I. Frolov for useful discussions and critical
reading of the manuscript. We would also like to acknowledge the support staff of the
Max-Planck-Institute for Mathematics in the Sciences, and particularly Ms. Valeria Huenniger,
Ms. Heike Rackwitz, and Ms. Theresa Petsch, for the technical and administrative support of
the collaboration with Aarhus University.
APPENDIX: THE RISM-MOL SOLVER
In the current work, the calculations of the RISM solute-solvent correlation functions were
performed with the RISM-MOL program, which was developed for fast solution of the RISM
integral equations by Fedorov and Sergiievskyi in the Computational Physical Chemistry and
Biophysics group of the Max-Planck-Institute for Mathematics in the Sciences.
To solve the RISM equations, the RISM-MOL program uses the Fourier iterative method (Ref. 23)
accelerated by the multigrid technique (Ref. 63). It was shown recently that the multigrid
method (Ref. 63) is able to speed up the Fourier iterations for the atomic Ornstein–Zernike
equation by up to several dozen times (Ref. 62). The same multigrid method has been
implemented in the RISM-MOL program for 1D RISM calculations. Using this algorithm, the
hydration free energy calculations for the largest molecule in the set (42 atoms) took about
30 s on a single processor core. The average time required for the hydration free energy
calculations was 17 s/molecule (Ref. 94).
As the input data, the RISM-MOL solver takes the Cartesian coordinates, parameters of the
Lennard-Jones potential, and partial charges $q_s$ of the atoms of the solute molecule. The
parameters of the solvent molecules, as well as precalculated bulk-solvent correlation
functions $h^{\mathrm{bulk}}_{s\alpha}(r)$, are embedded in the program. Using the atomic
parameters, the site-site interaction potentials between the solute sites $s$ and the solvent
sites $\alpha$ are calculated,
$$u_{s\alpha}(r) = u^{\mathrm{LJ}}_{s\alpha}(r) + u^{\mathrm{C}}_{s\alpha}(r), \tag{A1}$$
where $u^{\mathrm{C}}_{s\alpha}(r)$ is the Coulomb potential
$$u^{\mathrm{C}}_{s\alpha}(r) = \frac{q_s q_\alpha}{r} \tag{A2}$$
and $u^{\mathrm{LJ}}_{s\alpha}(r)$ is a Lennard-Jones potential
$$u^{\mathrm{LJ}}_{s\alpha}(r) = 4\epsilon_{s\alpha}\left[\left(\frac{\sigma_{s\alpha}}{r}\right)^{12} - \left(\frac{\sigma_{s\alpha}}{r}\right)^{6}\right]. \tag{A3}$$
The pair Lennard-Jones parameters $\sigma_{s\alpha}$ and $\epsilon_{s\alpha}$ are calculated
via the combining rules. By default, the Lorentz–Berthelot rules are used:
$$\sigma_{s\alpha} = \frac{\sigma_s + \sigma_\alpha}{2}, \qquad \epsilon_{s\alpha} = \sqrt{\epsilon_s \epsilon_\alpha}. \tag{A4}$$
Other combining rules can be defined by the user.
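Equations (A1)–(A4) translate directly into a few lines of code. The sketch below is not taken from the RISM-MOL program: the function names, the optional Coulomb prefactor `k` (equal to 1 in the unit convention of Eq. (A2)), and the numerical site parameters are all illustrative assumptions.

```python
import math


def lorentz_berthelot(sigma_s, eps_s, sigma_a, eps_a):
    """Pair parameters from Eq. (A4): arithmetic mean of sigma, geometric mean of epsilon."""
    return 0.5 * (sigma_s + sigma_a), math.sqrt(eps_s * eps_a)


def u_lj(r, sigma, eps):
    """Lennard-Jones potential, Eq. (A3)."""
    x6 = (sigma / r) ** 6
    return 4.0 * eps * (x6 * x6 - x6)


def u_coulomb(r, q_s, q_a, k=1.0):
    """Coulomb potential, Eq. (A2).

    k = 1 matches the form printed in Eq. (A2); pass a conversion factor
    here if charges, distances, and energies use other units (assumption).
    """
    return k * q_s * q_a / r


def u_site_site(r, q_s, sigma_s, eps_s, q_a, sigma_a, eps_a, k=1.0):
    """Total solute-site/solvent-site potential, Eq. (A1)."""
    sigma, eps = lorentz_berthelot(sigma_s, eps_s, sigma_a, eps_a)
    return u_lj(r, sigma, eps) + u_coulomb(r, q_s, q_a, k)


# Hypothetical site parameters in arbitrary but consistent units.
sigma_pair, eps_pair = lorentz_berthelot(3.0, 0.1, 3.4, 0.4)
print(sigma_pair, eps_pair)
print(u_site_site(3.5, 0.2, 3.0, 0.1, -0.8, 3.4, 0.4))
```

A user-defined combining rule, as mentioned in the text, would correspond to replacing `lorentz_berthelot` with another function that returns the pair `(sigma, eps)`.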
In the RISM-MOL program, it is possible to vary the
number of grids, the number of grid points, the number of
iterations, and, hence, the accuracy of the calculation. In the
current study, six-grid iterations were used. The final solution was obtained on a grid with
4096 grid points and a 0.05 bohr step size, with an $L_2$-norm accuracy of $10^{-4}$.
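For orientation, the finest grid quoted above spans 4096 × 0.05 = 204.8 bohr. The sketch below builds such a grid and shows one plausible form of an $L_2$-norm stopping test between successive iterates. The grid offset, the step-weighted form of the norm, and the function names are assumptions; the text does not specify how RISM-MOL defines them.

```python
import math

N_POINTS = 4096  # grid points on the finest grid (from the text)
STEP = 0.05      # grid spacing in bohr (from the text)
TOL = 1e-4       # L2-norm accuracy (from the text)

# Assumption: points at r_i = (i + 1) * STEP; the offset is not stated in the text.
r = [(i + 1) * STEP for i in range(N_POINTS)]


def l2_change(new, old, step=STEP):
    """Discrete, step-weighted L2 norm of the change between two iterates."""
    return math.sqrt(step * sum((a - b) ** 2 for a, b in zip(new, old)))


def converged(new, old, step=STEP, tol=TOL):
    """True when the change between successive iterates falls below tol."""
    return l2_change(new, old, step) < tol


print(f"grid extends to {r[-1]:.1f} bohr")
```

In an iterative solver, `converged` would be called after each update of the correlation functions on the finest grid, and the iteration would stop once it returns True.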
The fast implementation of the algorithm for the numerical solution of the RISM equations,
together with the presented possibilities for accurate hydration free energy calculations,
makes the RISM-MOL solver a robust tool for investigating the thermodynamics of solution. The
program is available to academic users free of charge from Fedorov on request.
1 C. A. Reynolds, P. M. King, and W. G. Richards, Mol. Phys. 76, 251 (1992).
2 P. Kollman, Chem. Rev. (Washington, D.C.) 93, 2395 (1993).
3 G. Perlovich and A. Bauer-Brandl, Curr. Drug Deliv. 1, 213 (2004).
4 G. L. Perlovich, T. V. Volkova, and A. Bauer-Brandl, J. Pharm. Sci. 95, 2158 (2006).
5 G. L. Perlovich, L. K. Hansen, T. V. Volkova, S. Mirza, A. N. Manin, and A. Bauer-Brandl, Cryst. Growth Des. 7, 2643 (2007).
6 L. D. Hughes, D. S. Palmer, F. Nigsch, and J. B. O. Mitchell, J. Chem. Inf. Model. 48, 220 (2008).
7 D. S. Palmer, A. Llinas, I. Morao, G. M. Day, J. M. Goodman, R. C. Glen, and J. B. O. Mitchell, Mol. Pharmacol. 5, 266 (2008).
8 W. L. Jorgensen and J. Tirado-Rives, Perspect. Drug Discovery Des. 3, 123 (1995).
9 N. Matubayasi and M. Nakahara, J. Chem. Phys. 113, 6070 (2000).
10 N. Matubayasi and M. Nakahara, J. Mol. Liq. 119, 23 (2005).
11 M. R. Shirts and V. S. Pande, J. Chem. Phys. 122, 134508 (2005).
12 N. Matubayasi, Front. Biosci. 14, 3536 (2009).
13 J. L. Knight and C. L. Brooks, J. Comput. Chem. 30, 1692 (2009).
14 J. Tomasi and M. Persico, Chem. Rev. (Washington, D.C.) 94, 2027 (1994).
15 B. Roux and T. Simonson, Biophys. Chem. 78, 1 (1999).
16 D. Bashford and D. A. Case, Annu. Rev. Phys. Chem. 51, 129 (2000).
17 J. Tomasi, B. Mennucci, and R. Cammi, Chem. Rev. (Washington, D.C.) 105, 2999 (2005).
18 M. B. Ulmschneider, J. P. Ulmschneider, M. S. P. Sansom, and A. Di Nola, Biophys. J. 92, 2338 (2007).
19 J. P. Guthrie, J. Phys. Chem. B 113, 4501 (2009).
20 A. V. Marenich, C. J. Cramer, and D. G. Truhlar, J. Phys. Chem. B 113, 4538 (2009).
21 A. Klamt, F. Eckert, and M. Diedenhofen, J. Phys. Chem. B 113, 4508 (2009).
22 T. Sulea, D. Wanapun, S. Dennis, and E. O. Purisima, J. Phys. Chem. B 113, 4511 (2009).
23 P. A. Monson and G. P. Morriss, Adv. Chem. Phys. 77, 451 (1990).
24 J.-P. Hansen and I. R. McDonald, Theory of Simple Liquids, 3rd ed. (Academic, London, 1991), http://www.sciencedirect.com/science/book/9780123705358.
25 Molecular Theory of Solvation, edited by F. Hirata (Kluwer Academic, Dordrecht, 2003).
26 L. Blum and A. J. Torruella, J. Chem. Phys. 56, 303 (1972).
27 K. Amano and M. Kinoshita, Chem. Phys. Lett. 488, 1 (2010).
28 D. Chandler and H. C. Andersen, J. Chem. Phys. 57, 1930 (1972).
29 F. Hirata, B. M. Pettitt, and P. J. Rossky, J. Chem. Phys. 77, 509 (1982).
30 B. M. Pettitt and P. J. Rossky, J. Chem. Phys. 77, 1451 (1982).
31 M. Kinoshita, Y. Okamoto, and F. Hirata, J. Comput. Chem. 19, 1724 (1998).
32 M. Kinoshita, Y. Okamoto, and F. Hirata, J. Am. Chem. Soc. 120, 1855 (1998).
33 M. Kinoshita, Y. Okamoto, and F. Hirata, J. Chem. Phys. 110, 4090 (1999).
34 T. Imai, M. Kinoshita, and F. Hirata, Bull. Chem. Soc. Jpn. 73, 1113 (2000).
35 T. Imai, R. Hiraoka, A. Kovalenko, and F. Hirata, J. Am. Chem. Soc. 127, 15334 (2005).
36 N. Yoshida, S. Phongphanphanee, Y. Maruyama, T. Imai, and F. Hirata, J. Am. Chem. Soc. 128, 12042 (2006).
37 N. Yoshida, S. Phongphanphanee, and F. Hirata, J. Phys. Chem. B 111, 4588 (2007).
38 G. Chuev, M. Fedorov, and J. Crain, Chem. Phys. Lett. 448, 198 (2007).
39 M. V. Fedorov and A. A. Kornyshev, Mol. Phys. 105, 1 (2007).
40 G. N. Chuev and M. V. Fedorov, J. Chem. Phys. 131, 074503 (2009).
41 T. Imai, Y. Harano, M. Kinoshita, A. Kovalenko, and F. Hirata, J. Chem. Phys. 126, 225102 (2007).
42 T. Imai, S. Ohyama, A. Kovalenko, and F. Hirata, Protein Sci. 16, 1927 (2007).
43 D. Yokogawa, H. Sato, T. Imai, and S. Sakaki, J. Chem. Phys. 130, 064111 (2009).
44 T. Imai, K. Oda, A. Kovalenko, F. Hirata, and A. Kidera, J. Am. Chem. Soc. 131, 12430 (2009).
45 Y. Kiyota, R. Hiraoka, N. Yoshida, Y. Maruyama, I. Imai, and F. Hirata, J. Am. Chem. Soc. 131, 3852 (2009).
46 K. Nishiyama, T. Yamaguchi, and F. Hirata, J. Phys. Chem. B 113, 2800 (2009).
47 J.-P. Hansen and I. R. McDonald, Theory of Simple Liquids, 4th ed. (Elsevier Academic Press, Amsterdam, The Netherlands, 2000).
48 M. Kinoshita, Y. Okamoto, and F. Hirata, J. Comput. Chem. 18, 1320 (1997).
49 A. Kovalenko and F. Hirata, J. Phys. Chem. B 103, 7942 (1999).
50 A. Kovalenko and F. Hirata, J. Chem. Phys. 110, 10095 (1999).
51 L. Lue and D. Blankschtein, J. Phys. Chem. 96, 8582 (1992).
52 H. J. C. Berendsen, J. R. Grigera, and T. P. Straatsma, J. Phys. Chem. 91, 6269 (1987).
53 F. Hirata and P. J. Rossky, Chem. Phys. Lett. 83, 329 (1981).
54 P. H. Lee and G. M. Maggiora, J. Phys. Chem. 97, 10175 (1993).
55 A. Kovalenko and F. Hirata, J. Chem. Phys. 113, 2793 (2000).
56 G. N. Chuev and M. V. Fedorov, J. Comput. Chem. 25, 1369 (2004).
57 G. N. Chuev and M. V. Fedorov, J. Chem. Phys. 120, 1191 (2004).
58 M. V. Fedorov and G. N. Chuev, J. Mol. Liq. 120, 159 (2005).
59 M. V. Fedorov, H. J. Flad, G. N. Chuev, L. Grasedyck, and B. N. Khoromskij, Computing 80, 47 (2007).
60 W. L. Jorgensen, D. S. Maxwell, and J. Tirado-Rives, J. Am. Chem. Soc. 118, 11225 (1996).
61 G. A. Kaminski, R. A. Friesner, J. Tirado-Rives, and W. L. Jorgensen, J. Phys. Chem. B 105, 6474 (2001).
62 M. V. Fedorov and W. Hackbusch, “A multigrid solver for the integral equations of the theory of liquids,” Preprint No. 88 (Max-Planck-Institut fuer Mathematik in den Naturwissenschaften, 2008).
63 W. Hackbusch, Multi-Grid Methods and Applications (Springer-Verlag, Berlin, 1985).
64 S. J. Singer and D. Chandler, Mol. Phys. 55, 621 (1985).
65 S. Ten-no, J. Chem. Phys. 115, 3724 (2001).
66 K. Sato, H. Chuman, and S. Ten-no, J. Phys. Chem. B 109, 17290 (2005).
67 D. Chandler, Y. Singh, and D. M. Richardson, J. Chem. Phys. 81, 1975 (1984).
68 S. Ten-no and S. Iwata, J. Chem. Phys. 111, 4865 (1999).
69 C. M. Cortis, P. J. Rossky, and R. A. Friesner, J. Chem. Phys. 107, 6400 (1997).
70 Q. H. Du, D. Beglov, and B. Roux, J. Phys. Chem. B 104, 796 (2000).
71 A. Kovalenko, F. Hirata, and M. Kinoshita, J. Chem. Phys. 113, 9830 (2000).
72 T. Luchko, S. Gusarov, D. R. Roe, C. Simmerling, D. A. Case, J. Tuszynski, and A. Kovalenko, J. Chem. Theory Comput. 6, 607 (2010).
73 S. Genheden, T. Luchko, S. Gusarov, A. Kovalenko, and U. Ryde, J. Phys. Chem. B 114, 8505 (2010).
74 Schrödinger LLC (2008), SCHRODINGER SUITE 2008, MAESTRO Version 8.5, MACROMODEL Version 9.6.
75 W. C. Still, A. Tempczyk, R. C. Hawley, and T. Hendrickson, J. Am. Chem. Soc. 112, 6127 (1990).
76 R. Krishnan, J. S. Binkley, R. Seeger, and J. A. Pople, J. Chem. Phys. 72, 650 (1980).
77 E. Cancès, B. Mennucci, and J. Tomasi, J. Chem. Phys. 107, 3032 (1997).
78 B. Mennucci and J. Tomasi, J. Chem. Phys. 106, 5151 (1997).
79 M. Cossi, N. Rega, G. Scalmani, and V. Barone, J. Comput. Chem. 24, 669 (2003).
80 V. Barone and M. Cossi, J. Phys. Chem. A 102, 1995 (1998).
81 M. J. Frisch, G. W. Trucks, H. B. Schlegel et al., GAUSSIAN 03, Gaussian, Inc., Wallingford, CT, 2004.
82 L. E. Chirlian and M. M. Francl, J. Comput. Chem. 8, 894 (1987).
83 C. M. Breneman and K. B. Wiberg, J. Comput. Chem. 11, 361 (1990).
84 B. H. Besler, K. M. Merz, and P. A. Kollman, J. Comput. Chem. 11, 431 (1990).
85 U. C. Singh and P. A. Kollman, J. Comput. Chem. 5, 129 (1984).
86 A. E. Reed, R. B. Weinstock, and F. Weinhold, J. Chem. Phys. 83, 735 (1985).
87 J. J. P. Stewart, MOPAC 6.00, Fujitsu Limited, Tokyo, Japan.
88 J. J. P. Stewart, J. Comput.-Aided Mol. Des. 4, 1 (1990).
89 J. M. Wang, W. Wang, P. A. Kollman, and D. A. Case, J. Mol. Graphics Modell. 25, 247 (2006).
90 J. M. Wang, R. M. Wolf, J. W. Caldwell, P. A. Kollman, and D. A. Case, J. Comput. Chem. 25, 1157 (2004).
91 A. Jakalian, D. B. Jack, and C. I. Bayly, J. Comput. Chem. 23, 1623 (2002).
92 J. G. Kirkwood and F. P. Buff, J. Chem. Phys. 19, 774 (1951).
93 K. Knight, Mathematical Statistics (CRC, Boca Raton, FL, 2000), p. 502.
94 See supplementary material at http://dx.doi.org/10.1063/1.3458798 for the results of the RISM calculations, results of the statistical analysis of the calculations, results of the fitting, and a brief analysis of the computational performance of the algorithm.
95 For B3LYP calculations in vacuum with CHELP-DIPOLE charges and the Gaussian fluctuation (GF) RISM free energy formula.