Accurate calculations of the hydration free energies of druglike molecules using the reference interaction site model

July 2010
The Journal of Chemical Physics 133(4):044104

DOI:10.1063/1.3458798

Source
PubMed

Authors:

David Simon Palmer

University of Strathclyde

Volodymyr Sergiievskyi

University of Duisburg-Essen

Frank Jensen

Aarhus University

Maxim V Fedorov

Kharkevich Institute for Information Transmission Problems of RAS

We report on the results of testing the reference interaction site model (RISM) for the estimation of the hydration free energy of druglike molecules. The optimum model was selected after testing of different RISM free energy expressions combined with different quantum mechanics and empirical force-field methods of structure optimization and atomic partial charge calculation. The final model gave a systematic error with a standard deviation of 2.6 kcal/mol for a test set of 31 molecules selected from the SAMPL1 blind challenge set [J. P. Guthrie, J. Phys. Chem. B 113, 4501 (2009)]. After parametrization of this model to include terms for the excluded volume and the number of atoms of different types in the molecule, the root mean squared error for a test set of 19 molecules was less than 1.2 kcal/mol.

The results for the best fitted model (HF/gas/CHELPG/PW). The RMSD on the test set is 1.14 kcal/mol.

Systematic and random errors between the hydration free energies calculated by the B3LYP/gas/CHELPG-DIPOLE/PW method and the experimental results.


David S. Palmer, Volodymyr P. Sergiievskyi, Frank Jensen, and Maxim V. Fedorov a)

Max Planck Institute for the Mathematics in the Sciences, Inselstrasse 22, DE-04103 Leipzig, Germany and Department of Chemistry, Aarhus University, Langelandsgade 140, 8000 Aarhus C, Denmark

(Received 30 April 2010; accepted 8 June 2010; published online 22 July 2010)

I. INTRODUCTION

Accurate calculation of the hydration free energies of organic molecules is a long-standing challenge in computational chemistry and is important in many aspects of research in the pharmaceutical and agrochemical industries. For example, many of the pharmacokinetic properties of potential drug molecules are defined by their in vivo solvation and acid-base behavior, which can be estimated from their hydration free energies [1–7].

Commonly used methods to calculate hydration free energy may be categorized as either explicit or implicit solvent models. In the first approach, each solvent molecule is included explicitly and molecular simulation methods are used to sample their conformational freedom [1,2,8–13]. In the second approach, the implicit effect of solvent on solute is included by solving either the Poisson–Boltzmann (PB) or the generalized-Born (GB) equation [14–18]. While explicit solvent methods are more scientifically rigorous, implicit models are often preferred because they are less computationally expensive. Both methods have recently been subjected to a blind test for the calculation of hydration free energies of druglike molecules (SAMPL1 test). The best predictions were in the range $\mathrm{RMSE}_{\mathrm{pred}} = 2.5$–$3.5$ kcal mol$^{-1}$, which equates to an error of ~2 log units in the related pharmacokinetic property (estimated from $\Delta G_{\mathrm{solv}} = -RT \ln K$) [20–22]. Clearly, the current methods to calculate hydration free energy are not accurate enough for modern pharmaceutical research. Furthermore, because these methods were available for some time before the first benchmark on druglike molecules was published, they have often been used blindly beyond their domain of applicability.
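The conversion from a free energy error to "log units" follows from $\Delta G_{\mathrm{solv}} = -RT \ln K$: an error $\delta$ in $\Delta G$ corresponds to an error of $\delta / (RT \ln 10)$ in $\log_{10} K$. The short sketch below illustrates this arithmetic. The function name and the choice of T = 300 K are assumptions made for illustration only; the text does not state the temperature used for this estimate.

```python
import math

# Gas constant in kcal/(mol K).
R_KCAL = 1.987204e-3


def free_energy_error_to_log_units(delta_g_kcal, temperature=300.0):
    """Convert a free energy error (kcal/mol) to an error in log10(K).

    Uses Delta G = -RT ln K, so d(log10 K) = d(Delta G) / (RT ln 10).
    The default temperature of 300 K is an assumption for illustration.
    """
    return abs(delta_g_kcal) / (R_KCAL * temperature * math.log(10.0))


if __name__ == "__main__":
    for err in (2.5, 3.5):
        log_units = free_energy_error_to_log_units(err)
        print(f"{err} kcal/mol corresponds to {log_units:.2f} log units")
```

Under these assumptions the quoted RMSE range maps to roughly two log units, in line with the estimate given in the text.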

Integral equation theory is an alternative framework for the calculation of hydration free energies [23–25]. Unlike PB or GB methods, it retains information about the solvent structure (in terms of density correlation functions), but estimates the solute chemical potential without long molecular dynamics (MD) or Monte Carlo (MC) simulations. At present, there are several approaches based on integral equations. The molecular Ornstein–Zernike theory is used to calculate the three-dimensional (3D) hydration structure in molecular liquids [26,27]. The site-site Ornstein–Zernike (SSOZ) integral equation is used to calculate the properties of complex solute-solvent systems in the reference interaction site model (RISM) formalism developed by Chandler and others [28–30]. The theory has been applied successfully to calculate the structural and thermodynamic properties of various chemical and biological systems [31–40].

Despite a recent resurgence in interest in biomedical applications of the integral equation theory [41–46], the methods were not represented in the recent SAMPL1 blind test and have not been adequately tested for druglike molecules. In this work, we test several previously used RISM methods combined with molecular geometries and partial charges calculated using different quantum mechanics (QM) and empirical force-field (FF) methods for a subset of the SAMPL1 dataset of druglike molecules.

II. THEORY

A. RISM

The RISM, initially introduced by Chandler and Andersen, permits the description of the thermodynamics of infinitely dilute solutions by a set of integral equations. A complete description of the RISM may be found in Ref. 25. Here we give only the basic definitions that are needed to calculate hydration free energies. In the RISM approach, both the solute and the solvent molecules are treated as sets of sites with spherically symmetric properties. In the simplest case, the sites are just the atoms of the molecules. In this paper, we use the so-called one-dimensional (1D) RISM approach, where solute and solvent interactions are described by spherically symmetric site-to-site functions. Using this approach, one operates only with the radial parts of these functions and, therefore, all the numerical tasks are one dimensional (which leads to a significant reduction in computational cost).

a) Author to whom correspondence should be addressed. Tel.: +49 (0) 341 9959 756. Electronic mail: fedorov@mis.mpg.de.

Three types of site-site correlation functions are considered in the RISM: intramolecular correlation functions, total correlation functions, and direct correlation functions. Intramolecular correlation functions describe the structure of the molecule. For two sites, $s$ and $s'$, of one molecule, the intramolecular correlation function is

$$\omega_{ss'}(r) = \frac{\delta(r - r_{ss'})}{4\pi r_{ss'}^{2}}, \tag{1}$$

where $r_{ss'}$ is the distance between the sites and $\delta(r - r_{ss'})$ is the Dirac delta function. Total correlation functions $h_{s\alpha}(r)$ and direct correlation functions $c_{s\alpha}(r)$ are defined for each pair of solute and solvent sites ($s$ and $\alpha$, respectively). The total correlation functions can be expressed as $h_{s\alpha}(r) = g_{s\alpha}(r) - 1$, where $g_{s\alpha}(r)$ is the radial distribution function of solvent sites around the solute sites. Bulk solvent total correlation functions $h^{\mathrm{bulk}}_{\alpha\xi}(r)$ are also considered and represent the distribution of sites $\xi$ of solvent molecules around the site $\alpha$ of the selected solvent molecule. Direct correlation functions $c_{s\alpha}(r)$ are calculated using the set of RISM equations for the case of an infinitely dilute solution,

$$h_{s\alpha}(r) = \sum_{s'\xi} \left\langle \omega_{ss'} * c_{s'\xi} * \left[ \omega^{\mathrm{bulk}}_{\alpha\xi} + \rho\, h^{\mathrm{bulk}}_{\alpha\xi} \right] \right\rangle (r), \tag{2}$$

$$s = 1, \ldots, N_{\mathrm{solute}}, \quad \alpha = 1, \ldots, N_{\mathrm{solvent}}, \quad r \in [0; \infty).$$

Here $\langle x * y \rangle(r)$ is the radial part of the spherically symmetric three-dimensional convolution $\langle x * y \rangle(\mathbf{r}) = \int_{\mathbb{R}^{3}} x(\mathbf{r} - \mathbf{r}')\, y(\mathbf{r}')\, d\mathbf{r}'$, and $\rho$ is the number density of the bulk solvent. To complete the set of RISM equations, one needs to use a closure relationship, which has the general form

$$c_{s\alpha}(r) = e^{\Xi_{s\alpha}(r) - B_{s\alpha}(r)} - h_{s\alpha}(r) + c_{s\alpha}(r) - 1, \tag{3}$$

where $\Xi_{s\alpha}(r) = -\beta u_{s\alpha}(r) + h_{s\alpha}(r) - c_{s\alpha}(r)$, $u_{s\alpha}(r)$ is the atom-atom potential, $B_{s\alpha}(r)$ is a so-called bridge function [25,47], $\beta = 1/k_B T$, $k_B$ is the Boltzmann constant, and $T$ is the temperature (in our calculations we used $T = 300$ K). The case $B(r) \equiv 0$ corresponds to the frequently used hypernetted chain closure [25,47]. However, the RISM equations with hypernetted chain closure did not converge for many molecules in the investigated set (the poor convergence of RISM with $B(r) \equiv 0$ has been reported previously [25,31,48]). Therefore, to improve convergence of our algorithm, we used the partially linearized hypernetted chain closure (PLHNC) [49,50],

$$c_{s\alpha}(r) = \begin{cases} e^{\Xi_{s\alpha}(r)} - h_{s\alpha}(r) + c_{s\alpha}(r) - 1, & \Xi_{s\alpha}(r) < 0, \\ -\beta u_{s\alpha}(r), & \Xi_{s\alpha}(r) > 0. \end{cases} \tag{4}$$
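The PLHNC closure of Eq. (4) can be illustrated with a small pointwise update on a radial grid. The following is a minimal sketch, not the RISM-MOL implementation: the function name, the use of plain Python lists, and the treatment of the boundary case $\Xi = 0$ (where both branches give the same value) are assumptions made for illustration.

```python
import math


def plhnc_direct_correlation(u, h, c, beta):
    """Apply the PLHNC closure pointwise on a radial grid.

    u, h, c are equal-length sequences holding the site-site potential,
    total correlation function and direct correlation function at each
    grid point; beta is 1/(k_B T) in units consistent with u.
    Returns the updated direct correlation function as a list.
    """
    c_new = []
    for u_i, h_i, c_i in zip(u, h, c):
        xi = -beta * u_i + h_i - c_i
        if xi <= 0.0:
            # Exponential (HNC-like) branch, used where Xi is not positive.
            c_new.append(math.exp(xi) - h_i + c_i - 1.0)
        else:
            # Linearized branch: h = Xi, which is equivalent to c = -beta * u.
            c_new.append(-beta * u_i)
    return c_new


if __name__ == "__main__":
    # Toy input: one repulsive and one attractive grid point.
    print(plhnc_direct_correlation([2.0, -1.0], [0.0, 0.0], [0.0, 0.0], beta=1.0))
```

The linearized branch limits the exponential growth of the closure in regions of strong attraction, which is the property the text credits for the improved convergence.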

In our calculations, the intramolecular correlation functions $\omega_{ss'}(r)$ were found from Eq. (1). We used the modified SPC/E water model (MSPC/E) proposed by Lue and Blankshtein in Ref. 51. The MSPC/E model differs from the original SPC/E water model by the additional Lennard-Jones (LJ) potential for the water hydrogen, which was modified to prevent possible divergence of the algorithm [38,53–55]. The total correlation functions of the bulk solvent $h^{\mathrm{bulk}}_{\alpha\xi}(r)$ were obtained from the previous work of some of us, where these functions were calculated by the RISM equations for solvent-solvent correlations and the wavelet-based algorithm for integral equations [56–59].

For all solutes, we used the LJ parameters from the OPLS-2005 force-field [60,61]. We obtained the solute partial charges by different QM methods (see below).

The set of RISM equations (2), together with the closure relation (4), allows us to find the functions $h_{s\alpha}(r)$ and $c_{s\alpha}(r)$, which are used to calculate hydration free energies. There are no known methods to solve the set of RISM equations analytically in the general case. Thus, in most cases the RISM equations are solved numerically. In Ref. 62, it was shown that RISM-like equations for monatomic particles can be effectively solved using a multigrid technique. In the current work, we use the RISM-MOL solver, which is a MATLAB implementation of the multigrid algorithm for solving RISM equations. The RISM-MOL solver is one of the recent developments of the Computational Physical Chemistry and Biophysics Group of the Max-Planck-Institute for Mathematics in the Sciences. Since this program has not previously been reported in the literature, we give a short description of it in the Appendix of this article.

Within the RISM theory, there are several expressions which allow one to obtain values of the hydration free energy from the total and direct correlation functions $h_{s\alpha}(r)$ and $c_{s\alpha}(r)$. In our work, we compare the accuracy of four of the most popular free energy expressions [55,64–66]. The first expression is the hypernetted-chain (HNC) approximation, for which the formula for the hydration free energy is

$$\Delta G_{\mathrm{HNC}} = 2\pi\rho k_B T \sum_{s\alpha} \int_0^{\infty} \left[ -2c_{s\alpha}(r) - h_{s\alpha}(r) \left( c_{s\alpha}(r) - h_{s\alpha}(r) \right) \right] r^{2}\, dr. \tag{5}$$

The hypernetted-chain method with repulsive bridge correction (HNCB) proposed by Kovalenko and Hirata in Ref. 55 has a modified form of this hydration free energy expression. Here we adopted the HNCB expression from the previous work for the case of PLHNC closure. The hydration free energy expression for the HNCB is

$$\Delta G_{\mathrm{HNCB}} = \Delta G_{\mathrm{HNC}} + 4\pi\rho k_B T \sum_{s\alpha} \int_0^{\infty} \left( h_{s\alpha}(r) + 1 \right) \left( e^{-B_{s\alpha}(r)} - 1 \right) r^{2}\, dr. \tag{6}$$

Here $\{B_{s\alpha}(r)\}$ are repulsive bridge correction functions, defined for each pair of solute $s$ and solvent $\alpha$ atoms by the expression

$$\exp\left( -B_{s\alpha}(r) \right) = \prod_{\nu \neq \alpha} \left\langle \omega^{\mathrm{bulk}}_{\alpha\nu} * \exp\left( -\beta \varepsilon_{s\nu} \left( \frac{\sigma_{s\nu}}{r} \right)^{12} \right) \right\rangle, \tag{7}$$

where $\omega^{\mathrm{bulk}}_{\alpha\nu}(r)$ are the solvent intramolecular correlation functions, and $\sigma_{s\nu}$ and $\varepsilon_{s\nu}$ are the site-site parameters of the pairwise Lennard-Jones potential.

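The third expression is the Gaussian fluctuation (GF) approximation [65–67], in which the free energy is given as

$$\Delta G_{\mathrm{GF}} = 2\pi\rho k_B T \sum_{s\alpha} \int_0^{\infty} \left( -2c_{s\alpha}(r) - c_{s\alpha}(r) h_{s\alpha}(r) \right) r^{2}\, dr. \tag{8}$$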
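Once $h_{s\alpha}(r)$ and $c_{s\alpha}(r)$ are available on a radial grid, the HNC and GF expressions reduce to one-dimensional quadratures. The sketch below evaluates both with the trapezoid rule on a uniform grid. It is an illustration under stated assumptions rather than the solver used in the paper: the function names, the dictionary layout keyed by (solute site, solvent site), the uniform grid, and the unit convention (kcal/mol, with $\rho$ in inverse cubic length units matching `r`) are all hypothetical choices.

```python
import math

# Boltzmann constant per mole (gas constant) in kcal/(mol K).
K_B = 1.987204e-3


def trapezoid(values, step):
    """Composite trapezoid rule on a uniform grid."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


def hydration_free_energy(r, h, c, rho, temperature=300.0, kind="GF"):
    """Evaluate the HNC or GF free energy functional.

    r    : uniform radial grid (list of floats)
    h, c : dicts mapping (solute_site, solvent_site) -> list of values on r
    rho  : bulk solvent number density
    kind : "HNC" or "GF"
    """
    step = r[1] - r[0]
    total = 0.0
    for pair, h_vals in h.items():
        c_vals = c[pair]
        if kind == "HNC":
            integrand = [(-2.0 * ci - hi * (ci - hi)) * ri ** 2
                         for ri, hi, ci in zip(r, h_vals, c_vals)]
        elif kind == "GF":
            integrand = [(-2.0 * ci - ci * hi) * ri ** 2
                         for ri, hi, ci in zip(r, h_vals, c_vals)]
        else:
            raise ValueError("kind must be 'HNC' or 'GF'")
        total += trapezoid(integrand, step)
    return 2.0 * math.pi * rho * K_B * temperature * total


if __name__ == "__main__":
    # Toy correlation functions (not physical data), one site pair only.
    grid = [0.1 * i for i in range(101)]
    h_toy = {("s", "O"): [-math.exp(-x) for x in grid]}
    c_toy = {("s", "O"): [-0.5 * math.exp(-x) for x in grid]}
    for kind in ("HNC", "GF"):
        print(kind, hydration_free_energy(grid, h_toy, c_toy, rho=0.0333, kind=kind))
```

The two integrands differ only by the term $h_{s\alpha}^{2}(r)\, r^{2}$, so for the same correlation functions the HNC value is never below the GF value.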

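The final hydration free energy equation we consider is the partial wave (PW) expression [65,68]. It has previously been demonstrated to be one of the most accurate methods for calculating hydration free energies of simple organic molecules within the RISM framework [38,65,66]. The PW expression for hydration free energy is

$$\Delta G_{\mathrm{PW}} = \Delta G_{\mathrm{GF}} + 2\pi\rho k_B T \sum_{s\alpha} \int_0^{\infty} \tilde{h}_{s\alpha}(r)\, h_{s\alpha}(r)\, r^{2}\, dr, \tag{9}$$

where $\tilde{h}_{s\alpha}(r) = \sum_{s'\nu} \langle \tilde{\omega}_{ss'} * h_{s'\nu} * \tilde{\omega}^{\mathrm{bulk}}_{\nu\alpha} \rangle$, and $\tilde{\omega}_{ss'}$ and $\tilde{\omega}^{\mathrm{bulk}}_{\nu\alpha}$ are the elements of matrices that are inverse to the matrices $W = (\omega_{ss'})$ and $W^{\mathrm{bulk}} = (\omega^{\mathrm{bulk}}_{\alpha\nu})$, which are built from the solute and solvent intramolecular correlation functions $\omega_{ss'}$ and $\omega^{\mathrm{bulk}}_{\alpha\nu}$, respectively.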

We note that there exists a more sophisticated version of the RISM, 3D RISM, which operates with three-dimensional solute-solvent correlation functions [69–72]. This theory is based on three-dimensional analogs of Eqs. (2) and (4) (see, e.g., Refs. 25 and 72 for details). However, we do not use this approach in this work, for two reasons. First, the use of 3D convolution in the 3D RISM calculations makes them very computationally expensive even when advanced numerical approaches and parallel programming are used to speed up the calculations. The computational cost makes 3D RISM difficult to use for screening of large series of compounds and testing different theoretical models (e.g., in this work we performed a large number of different RISM calculations to choose the best parameters for hydration free energy prediction). Second, the large computational cost associated with the 3D convolution operations in the 3D RISM approach limits the grid spacing to ~0.5 Å and the potential cutoff to 8–10 Å [72,73], which complicates taking integrals in the 3D analogs of the hydration free energy expressions (5) and (8) (Ref. 73) and affects the numerical accuracy of these calculations. Nevertheless, we consider the 3D RISM to be a promising approach and we plan to extend our model to 3D RISM in the near future. To make the 3D RISM more feasible for practical applications, significant developments of the numerical part of the model are required. A detailed investigation of the performance of hydration free energy calculations using 1D RISM and 3D RISM approaches will be the subject of future work in our group.

III. MATERIALS AND METHODS

A. Dataset

The methods discussed above were tested on hydration free energy data for 31 organic molecules taken from the SAMPL1 dataset (Table I). The molecules in this dataset present a stringent test of methods to calculate hydration free energy because they are significantly larger and more complex (i.e., more functional groups) than those previously considered as benchmarks for hydration free energy calculations. In Table I, the experimental hydration free energies (in kcal/mol) are given for all 31 molecules as reported in the original SAMPL1 publication. The data are tabulated as $\Delta G_{\mathrm{hydr}} = -RT \ln(c_{\mathrm{aq}} / c_{\mathrm{gas}})$, with concentrations in mol/l, which corresponds to the choice of standard states proposed by Ben-Naim. The experimental data are given as grand mean averages taken from multiple sources from the published literature. Details of the methods used to compile the dataset are given in Ref. 19 and will not be recapitulated here. For 21 molecules, the experimental hydration data were reported with estimated experimental uncertainties, which range from 0.1 to 0.44 kcal/mol with a median value of 0.1 kcal/mol. The hydration free energies of the remaining 10 molecules were calculated from solubility and vapor pressures; if these data were reported without experimental uncertainties, they were arbitrarily assigned values of 1 log unit. As such, the reported experimental uncertainties for these molecules are pessimistic estimates, as has previously been noted by other authors.

In the present study, we work with only 31 of the 63 molecules in the full SAMPL1 dataset in order to make it computationally feasible to test a large number of different free energy expressions and molecular geometry and partial charge methods. The 31 molecules were selected at random from the full dataset. The selected molecules contain between 8 and 27 heavy atoms each and have molecular weights ranging from 119 to 426 atomic mass units.

TABLE I. Hydration free energy data and SAMPL1 identification code for the 31 molecules used in the current work.

ID         Molecule                      ΔG_hydr (kcal/mol)   Error (kcal/mol)
Cup08002   1,2-dinitroxypropane          −4.95                0.1
Cup08004   2-butyl nitrate               −1.82                0.1
Cup08005   Isobutyl nitrate              −1.88                0.1
Cup08007   Alachlor                      −8.21                0.29
Cup08009   Ametryn                       −7.65                0.45
Cup08016   Carbofuran                    −9.61                0.3
Cup08020   Chlorimuronethyl              −14.01               1.93
Cup08021   Chloropicrin                  −1.45                0.1
Cup08024   Diazinon                      −6.48                0.13
Cup08025   Dicamba                       −9.86                1.93
Cup08026   Dichlobenil                   −4.71                1.93
Cup08028   Dinoseb                       −6.23                1.93
Cup08029   Endosulfan alpha              −4.23                0.26
Cup08030   Endrin                        −4.82                0.1
Cup08033   Heptachlor                    −2.55                0.1
Cup08034   Isophorone                    −5.18                1.37
Cup08035   Lindane                       −5.44                0.1
Cup08038   Methyl parathion              −7.19                0.1
Cup08041   Nitroxyacetone                −5.99                0.1
Cup08043   Parathion                     −6.74                0.1
Cup08044   Pebulate                      −3.64                1.93
Cup08045   Phorate                       −4.37                0.1
Cup08048   Propanil                      −7.78                1.93
Cup08050   Simazine                      −10.22               0.1
Cup08052   Terbacil                      −11.14               1.93
Cup08053   Terbutryn                     −6.68                0.42
Cup08057   Vernolate                     −4.13                1.36
Cup08058   4-amino-4'-nitroazobenzene    −11.24               0.44
Cup08018   Chlordane                     −3.44                0.1
Cup08047   Prometryn                     −8.43                0.1
Cup08032   Fenuron                       −9.13                1.93

B. Geometry optimization and atomic partial charge calculation

Estimates of hydration free energies obtained by the RISM are sensitive to the input molecular geometry and to the calculated atomic partial charges. In this work, we tested a variety of different classical (force-field) and quantum mechanical methods for their calculation.

Molecular structures were obtained for each molecule in the SAMPL1 test set from the supporting information of Ref. 19. As a preliminary preparation of these structures, a low-mode conformational search was carried out for each molecule in both gas and aqueous phases using the OPLS-2005 force-field [60,61] in MACROMODEL v.9.1, where aqueous solvent was simulated using the generalized-Born surface area approximation. The global minimum energy conformers were used as input to each of the following geometry optimization and partial charge calculations.

First, molecular geometries were optimized using the B3LYP hybrid density functional and 6-31G** basis set with diffuse orbitals for heavy atoms and hydrogen in vacuum and in aqueous solvent simulated separately by two different hydration models: the polarizable continuum model (PCM) [77–79] and the conductorlike continuum model (CPCM) [79,80], implemented in GAUSSIAN 03. All electronic structure calculations were carried out in GAUSSIAN 03 RevE.01, unless otherwise stated. For each of the optimized molecular geometries, atomic partial charges were estimated by seven different methods: CHELP (Charges from Electrostatic Potentials), CHELPG (grid-based CHELP), ESP (Merz–Kollman Electrostatic Potential charges) [84,85], CHELP-DIPOLE, CHELPG-DIPOLE, ESP-DIPOLE, and natural population analysis (NPA). The CHELP, CHELPG, and ESP methods calculate atomic partial charges that reproduce the electrostatic potential on grid points outside the van der Waals surface of the molecule. The suffix "-DIPOLE" indicates that the atomic partial charges are also constrained to reproduce the molecular dipole. In NPA, atomic partial charges are obtained by decomposing the molecular wave function into atomic contributions. Since each atomic partial charge calculation was repeated for three different solvation models (vacuum, PCM, CPCM), we have 3 × 7 = 21 geometry and atomic partial charge sets calculated using density functional theory (DFT).

Second, the molecular geometries were optimized using Hartree–Fock (HF) theory and the 6-31G** basis set in vacuum. As for the DFT calculations, seven different partial charge estimations were carried out.

Third, AM1-BCC and AM1-Mulliken atomic partial charges were calculated using MOPAC [87,88] and ANTECHAMBER [89,90]. AM1-BCC charges are evaluated by applying an empirical bond charge correction (BCC) scheme to AM1-Mulliken charges. Here we use the BCC parameters derived by Jakalian et al., which were fitted by these authors to make the AM1-BCC charges match the electrostatic potential at the HF/6-31G* level.

Finally, the geometries and partial charges calculated by the OPLS2005 force-field in both vacuum and aqueous solvent during the low-mode conformational search were used as an additional set of parameters. In total, we have 21 + 7 + 2 + 2 = 32 different pairs of molecular geometries and atomic partial charges for each molecule. For each of these sets, the hydration free energy was calculated using four different RISM free energy expressions (HNC, HNCB, GF, and PW). In total, this gives 32 × 4 = 128 different combinations of free energy calculation methods. To identify the selected methods, we list the slash-separated names of the QM method, hydration model, partial charge method, and RISM expression, for example, B3LYP/PCM/CHELPG-DIPOLE/PW.
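The counting of method combinations and the slash-separated naming scheme can be reproduced with a short enumeration. This is an illustrative sketch only: the labels used for the AM1 and force-field sets (for example "AM1", "FF", and "GBSA" for the aqueous force-field run) are hypothetical placeholders, since the text does not spell out how those entries are named.

```python
from itertools import product

CHARGE_METHODS = [
    "CHELP", "CHELPG", "ESP",
    "CHELP-DIPOLE", "CHELPG-DIPOLE", "ESP-DIPOLE", "NPA",
]
RISM_EXPRESSIONS = ["HNC", "HNCB", "GF", "PW"]


def structure_charge_sets():
    """Return (QM level, solvation model, charge method) tuples."""
    # B3LYP geometries in three environments, seven charge schemes each: 21 sets.
    sets = [("B3LYP", solv, q)
            for solv, q in product(["gas", "PCM", "CPCM"], CHARGE_METHODS)]
    # HF geometries in vacuum, seven charge schemes: 7 sets.
    sets += [("HF", "gas", q) for q in CHARGE_METHODS]
    # Semi-empirical charges: 2 sets (labels are hypothetical).
    sets += [("AM1", "gas", q) for q in ["AM1-BCC", "AM1-Mulliken"]]
    # Force-field charges in vacuum and aqueous solvent: 2 sets
    # (the "GBSA" label is a hypothetical placeholder).
    sets += [("FF", solv, "OPLS2005") for solv in ["gas", "GBSA"]]
    return sets


def method_labels():
    """Return slash-separated labels such as B3LYP/PCM/CHELPG-DIPOLE/PW."""
    return ["/".join(s + (expr,))
            for s in structure_charge_sets()
            for expr in RISM_EXPRESSIONS]


if __name__ == "__main__":
    print(len(structure_charge_sets()), "structure and charge sets")
    print(len(method_labels()), "method combinations")
```

Enumerating the sets this way makes the 21 + 7 + 2 + 2 bookkeeping explicit and gives every calculation a unique label.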

For each combination of methods, the values of the hydration free energies of the 31 molecules from Table I were calculated. The best of these models were then parametrized to improve predictions of the hydration free energy using separate training and independent test sets.

C. Statistical modeling

1. Error calculation

For all molecules of the dataset (see Table I), hydration free energy values were calculated using different structure optimization methods, partial charge models, and RISM free energy formulas (HNC, HNCB, GF, and PW). To compare calculated and experimental results, the root mean squared deviation (RMSD) was evaluated,

$$\mathrm{RMSD}(\Delta G, \Delta G_{\mathrm{expt}}) = \sqrt{\frac{1}{N} \sum_{i \in S} \left( \Delta G^{(i)} - \Delta G_{\mathrm{expt}}^{(i)} \right)^{2}}, \tag{10}$$

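where index $i$ runs through the set $S$ of $N$ selected molecules, and $\Delta G^{(i)}$ and $\Delta G_{\mathrm{expt}}^{(i)}$ are the calculated and the experimental hydration free energy values of molecule $i$, respectively. The total deviation can be split into two parts: mean displacement (M) and standard deviation (SD), which are calculated by the formulas

$$M(\Delta G - \Delta G_{\mathrm{expt}}) = \frac{1}{N} \sum_{i \in S} \left( \Delta G^{(i)} - \Delta G_{\mathrm{expt}}^{(i)} \right), \tag{11}$$

$$\mathrm{SD}(\Delta G - \Delta G_{\mathrm{expt}}) = \sqrt{\frac{1}{N} \sum_{i \in S} \left( \Delta G^{(i)} - \Delta G_{\mathrm{expt}}^{(i)} - M(\Delta G - \Delta G_{\mathrm{expt}}) \right)^{2}}. \tag{12}$$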

The mean displacement gives the systematic error, which can be corrected by a simple constant term. The standard deviation gives the random error that is not explained by the model. One can see the connection between these three formulas,

$$\mathrm{RMSD}(\Delta G, \Delta G_{\mathrm{expt}})^{2} = M(\Delta G - \Delta G_{\mathrm{expt}})^{2} + \mathrm{SD}(\Delta G - \Delta G_{\mathrm{expt}})^{2}. \tag{13}$$
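The three error measures and the identity that links them can be checked numerically. The sketch below assumes the population form of the standard deviation (division by N rather than N − 1), which is the form for which Eq. (13) holds exactly; the function name and the toy numbers are illustrative and are not data from the paper.

```python
import math


def error_statistics(calculated, experimental):
    """Return (RMSD, mean displacement M, standard deviation SD).

    SD is the population standard deviation (1/N normalisation), so that
    RMSD**2 == M**2 + SD**2 up to rounding error.
    """
    diffs = [c - e for c, e in zip(calculated, experimental)]
    n = len(diffs)
    mean = sum(diffs) / n
    rmsd = math.sqrt(sum(d * d for d in diffs) / n)
    sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / n)
    return rmsd, mean, sd


if __name__ == "__main__":
    # Toy values in kcal/mol, chosen only to exercise the formulas.
    calc = [5.0, 8.5, 2.0, 10.0]
    expt = [-4.9, -1.8, -8.2, -3.4]
    rmsd, mean, sd = error_statistics(calc, expt)
    print(f"RMSD={rmsd:.3f} M={mean:.3f} SD={sd:.3f}")
    print("identity residual:", rmsd ** 2 - (mean ** 2 + sd ** 2))
```

The same identity is consistent with the values annotated in Fig. 1: a mean of 12.2 and an SD of 2.6 combine to an RMSD of about 12.5.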

2. Fitting formula

In Ref. 38, it was shown that when excluded volume-based correction terms are included in the RISM/PW formula, the accuracy of the calculated hydration free energies for simple nonpolar organic solutes improves considerably. This result suggests that excluded volume corrections should also be useful for improving the prediction of the RISM for the SAMPL1 molecule set. In this case, we calculate the excluded volume of the solute in infinitely dilute aqueous solution as a limiting case of the partial molar volume formula when the solute density tends to zero,

$$V = \frac{1}{\rho} + \frac{4\pi}{N_{\mathrm{solute}}} \sum_{s} \int_0^{\infty} \left( h^{\mathrm{bulk}}_{oo}(r) - h_{so}(r) \right) r^{2}\, dr. \tag{14}$$

Here $h^{\mathrm{bulk}}_{oo}(r)$ is the total oxygen-to-oxygen correlation function of bulk water and $h_{so}(r)$ is the total correlation function between the solute site $s$ and the water oxygen.

In Ref. 38, it is discussed that the RISM formulas may systematically overestimate the hydration free energy of small organic compounds that contain certain types of functional groups, e.g., charged groups or hydroxyl groups. The authors introduced group contribution terms to correct for these systematic errors. In a similar manner, additional functional group corrections might be required for calculations of the larger molecules considered here. Due to the structures of the molecules from the SAMPL1 set, however, there is no single obvious way to separate them by functional groups. Therefore, to be consistent we used atom type rather than functional group corrections. The 31 molecules given in Table I contain hydrogen, carbon, nitrogen, oxygen, chlorine, phosphorus, and sulfur atoms. Thus, the fitting formula is

$$\Delta G_{\mathrm{corr}}(b) = \Delta G_{\mathrm{RISM}} + b_V V + \sum_{j} b_j n_j, \tag{15}$$

where $j$ runs over all atom types, $j \in \{\mathrm{H, C, N, O, Cl, P, S}\}$, $n_j$ is the number of atoms of type $j$ in the molecule, and $b = \{b_V, b_{\mathrm{H}}, \ldots, b_{\mathrm{S}}\}$ are the coefficients to be fitted on the training molecule set. To parametrize the empirical model, we partitioned the 31-molecule SAMPL1 subset into separate training and independent test sets. As a training set, we chose 12 molecules, which are listed in Table II. As one can see, the minimum fitting condition is satisfied: for each atom type there is at least 1 molecule in the training set that contains atoms of this type. The test set comprised the remaining 19 molecules given in Table I, which are not in the 12-molecule training set given in Table II. The coefficients $b$ in formula (15) were fitted to minimize the root mean squared deviation $\mathrm{RMSD}(\Delta G_{\mathrm{expt}}, \Delta G_{\mathrm{corr}}(b))$ on the training set molecules.
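Because the correction in formula (15) is linear in the coefficients, minimizing the RMSD on the training set is an ordinary linear least-squares problem on the residuals $\Delta G_{\mathrm{expt}} - \Delta G_{\mathrm{RISM}}$. The sketch below solves it through the normal equations with a small Gaussian elimination routine. It is a minimal illustration, not the fitting procedure used by the authors: the function names are hypothetical, each descriptor row is assumed to hold the excluded volume followed by the atom counts, and the solver assumes the normal equations are non-singular.

```python
def solve_linear(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(col + 1, n):
            factor = aug[i][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[i][j] -= factor * aug[col][j]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_corrections(descriptors, residuals):
    """Least-squares coefficients b for residual ~ sum_j b_j * x_j.

    descriptors : list of rows, e.g. [V, n_H, n_C, ...] per molecule
    residuals   : experimental minus uncorrected RISM free energies
    """
    k = len(descriptors[0])
    ata = [[sum(row[p] * row[q] for row in descriptors) for q in range(k)]
           for p in range(k)]
    atb = [sum(row[p] * res for row, res in zip(descriptors, residuals))
           for p in range(k)]
    return solve_linear(ata, atb)


def corrected_free_energy(dg_rism, descriptor, coefficients):
    """Apply the linear correction to one molecule."""
    return dg_rism + sum(b * x for b, x in zip(coefficients, descriptor))


if __name__ == "__main__":
    # Synthetic two-descriptor example (not data from the paper).
    rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
    target = [0.5, -2.0, -1.5, -1.0]
    print(fit_corrections(rows, target))
```

With only 12 training molecules and eight coefficients the fit has few degrees of freedom, which is why the validation steps described in the next subsection matter.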

3. Validation of the fitting results

Because we have relatively small test and training sets, the small error on the test set by itself was not enough to validate the formula. An additional validation procedure was needed. First, a standard analysis of the variance (t-test and F-test) was performed to make sure that both experimental and corrected calculated results have the same mean values and standard deviations. Second, the coefficients of determination ($R^{2}$) were calculated to check the strength of correlation between the corrected calculated and the experimental results. To check that the fitting coefficients are not dependent on the choice of the training set, three additional tests were performed: (i) leave-one-out cross-validation, (ii) leave-five-out cross-validation, and (iii) comparison of the coefficients obtained by fitting to the training set and to the full set. In the leave-one-out test, we perform a series of fittings using training sets that consist of the initial 31-molecule set from Table I with 1 molecule removed. For all possible choices of the removed molecule, we have 31 different sets of fitting coefficients,

$$b^{(k)} = \{ b_j^{(k)} \}, \quad k = 1, \ldots, 31. \tag{16}$$

We compute the relative standard deviation of each type of fitting coefficient,

$$\delta_j = \frac{\mathrm{SD}(\{ b_j^{(k)} \})}{\left| M(\{ b_j^{(k)} \}) \right|} \times 100\%, \tag{17}$$

where $j$ is the type of the coefficient, $j \in \{V, \mathrm{H, C, N, O, Cl, P, S}\}$. The values $\delta_j$ show the sensitivity of the coefficient $b_j$ to the choice of training set. Low $\delta_j$ values indicate that coefficient $b_j$ is not arbitrary and we can trust its value, while high $\delta_j$ values indicate physically unreliable coefficients. The leave-five-out test is similar to leave-one-out, but the training sets are constructed by excluding 5 molecules from the initial 31-molecule set given in Table I. Because the number of possible choices of 5 molecules among 31 is quite large ($C_{31}^{5} = 169\,911$), we randomly chose 1000 such extractions and calculated the values $\delta_j$ for them.

TABLE II. The number of times each atom type occurs in each molecule of the training set.

ID         n_H   n_C   n_N   n_O   n_Cl   n_P   n_S
Cup08002   6     3     2     6     0      0     0
Cup08009   17    9     5     0     0      0     1
Cup08021   0     1     1     2     3      0     0
Cup08024   21    12    2     3     0      1     1
Cup08026   3     7     1     0     2      0     0
Cup08029   6     9     0     3     6      0     1
Cup08032   12    9     2     1     0      0     0
Cup08034   14    9     0     1     0      0     0
Cup08035   6     6     0     0     6      0     0
Cup08038   10    8     1     5     0      1     1
Cup08044   21    10    1     1     0      0     1
Cup08057   21    10    1     1     0      0     1
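The leave-five-out procedure can be sketched as random subsampling of training sets followed by the relative standard deviation of Eq. (17). The code below is an illustration under stated assumptions: the function names are hypothetical, the standard deviation uses 1/N normalisation to match Eq. (12), and the random draws are not checked for repeats because the text does not say whether the 1000 extractions were distinct. `math.comb` requires Python 3.8 or later.

```python
import math
import random


def relative_sd_percent(values):
    """Relative standard deviation SD/|M| * 100%, with 1/N normalisation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sd / abs(mean) * 100.0


def leave_k_out_subsets(n_molecules, k, n_draws, seed=0):
    """Draw training index sets with k randomly chosen molecules removed."""
    rng = random.Random(seed)
    indices = list(range(n_molecules))
    subsets = []
    for _ in range(n_draws):
        left_out = set(rng.sample(indices, k))
        subsets.append([i for i in indices if i not in left_out])
    return subsets


if __name__ == "__main__":
    print("possible leave-five-out sets:", math.comb(31, 5))
    training_sets = leave_k_out_subsets(31, 5, 1000)
    print("sampled sets:", len(training_sets), "of size", len(training_sets[0]))
    # Hypothetical coefficient values from repeated fits, for illustration only.
    print("delta:", relative_sd_percent([0.95, 1.00, 1.05, 1.02]))
```

Each sampled index set would be passed to the fitting routine, and the resulting coefficients for a given type $j$ collected into the list given to `relative_sd_percent`.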

In addition, the training set fitting coefficients were compared to the full set fitting coefficients. In this case, we perform two different fittings. Using the full 31-molecule set from Table I for training, we obtain the fitting coefficients $b_j^{(\mathrm{full})}$, $j \in \{V, \mathrm{H, C, N, O, Cl, P, S}\}$. Using the training set from Table II, we obtain another set of fitting coefficients $b_j^{(\mathrm{train})}$ and calculate the $\delta_j^{*}$ values by the formula

$$\delta_j^{*} = \frac{\left| b_j^{(\mathrm{full})} - b_j^{(\mathrm{train})} \right|}{\left| b_j^{(\mathrm{full})} \right|} \times 100\%. \tag{18}$$

IV. RESULTS AND DISCUSSION

A. Analysis of calculated data

1. Models without empirical corrections

The hydration free energy values were calculated for the 31-molecule test set from Table I using 128 combinations of RISM and structure calculation methods. One can find the results of the calculations in Ref. 94. The comparison with experiment shows quite high RMSD values for all methods. The smallest error is about 5.6 kcal/mol. However, if we look at the differences between the calculated and experimental results, we can see that they are not random. For many combinations of QM/RISM methods, differences are distributed around a mean value and the standard deviation is reasonably small (see Fig. 1). The smallest standard deviation of the differences is achieved using the B3LYP/gas/CHELPG-DIPOLE/PW method and is about 2.6 kcal/mol, which is comparable to the results of the SAMPL1 hydration free energy predictions found in the literature [20–22]. We see that although the RISM predictions contain large systematic errors, the free energies calculated using the RISM are well correlated with the experimental values. To support this point, we calculated the correlation coefficients for the experimental and calculated values for each combination of methods. In Table III, correlation coefficients are listed for the methods that give the smallest standard deviation of the differences between the calculated and the experimental hydration free energy values. Results of these methods are well correlated with the experimental data (for most of them correlation coefficients are larger than 0.7). RMSD values, standard deviations, and correlation coefficients for all methods are given in Ref. 94.

FIG. 1. Systematic and random errors between the hydration free energies calculated by the B3LYP/gas/CHELPG-DIPOLE/PW method and the experimental results. [Figure: axes ΔG (kcal/mol) and ΔG_calc − ΔG_exp (kcal/mol); panel annotations: RMSD = 12.5, Mean = 12.2, SD = 2.6; panel title: B3LYP/gas/ChelpG−dipole/PW.]

TABLE III. (a) RISM results with the smallest standard deviations of differences between experimental and calculated hydration free energies. (b) RISM results with the largest correlation coefficients.

QM level   Solvation model   Partial charges   Formula   Standard deviation (kcal/mol)   Correlation coefficient

(a) Ten results with the smallest standard deviation
B3LYP      Gas               CHELPG-DIPOLE     PW        2.599                           0.749
B3LYP      Gas               CHELP-DIPOLE      PW        2.642                           0.685
B3LYP      Gas               CHELPG            PW        2.647                           0.744
B3LYP      Gas               CHELP             PW        2.672                           0.677
HF         Gas               CHELPG            PW        3.132                           0.769
HF         Gas               CHELPG-DIPOLE     PW        3.187                           0.766
FF         Gas               OPLS2005          PW        3.331                           0.706
HF         Gas               CHELP-DIPOLE      PW        3.391                           0.688
HF         Gas               CHELP             PW        3.459                           0.679
B3LYP      PCM               CHELP             PW        3.558                           0.820

(b) Ten results with the highest correlation coefficients
B3LYP      CPCM              CHELPG-DIPOLE     PW        3.595                           0.869
B3LYP      CPCM              CHELPG            PW        3.582                           0.868
B3LYP      PCM               CHELPG-DIPOLE     PW        3.581                           0.868
B3LYP      PCM               CHELPG            PW        3.568                           0.868
B3LYP      CPCM              CHELPG-DIPOLE     GF        7.370                           0.830
B3LYP      CPCM              CHELPG            GF        7.349                           0.830
B3LYP      PCM               CHELPG-DIPOLE     GF        7.347                           0.829
B3LYP      CPCM              CHELP-DIPOLE      PW        3.689                           0.824
B3LYP      PCM               CHELP-DIPOLE      PW        3.575                           0.823
B3LYP      CPCM              CHELP             PW        3.701                           0.821

Using only these preliminary results, we can already select the most and least suitable methods. We see that good correlations with experiment are observed with both HF and B3LYP methods with CHELP, CHELPG, CHELP-DIPOLE, or CHELPG-DIPOLE charges. The standard deviation for the OPLS2005 charges with the PW expression is about 3.3 kcal/mol, a promising result for this level of theory. We also see that the smallest standard deviations between the calculated and the experimental results are obtained with the PW RISM formula, the GF formula gives intermediate results (the lowest standard deviation of error is about 5.3 kcal/mol), while the HNC and HNCB free energy formulas give quite large deviations from experiment (standard deviations of errors are larger than 8.8 kcal/mol). The methods for which we have reported small standard deviations of the errors might be expected to be amenable to parametrization (using, e.g., molecular volume and atom type

variables兲.
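The split of the total error into a systematic part (the mean difference) and a random part (the standard deviation of the differences) can be illustrated with a short Python sketch. The function names and the five-molecule data set are hypothetical and serve only as an illustration; the sketch assumes the population form of the standard deviation, for which RMSD² = mean² + SD² holds exactly. With the values annotated in Fig. 1 (mean = 12.2, SD = 2.6 kcal/mol) this identity gives an RMSD of about 12.5 kcal/mol, in line with the figure.

```python
import numpy as np


def error_statistics(calc, expt):
    """Return mean error, SD of errors, RMSD and Pearson r (energies in kcal/mol)."""
    calc = np.asarray(calc, dtype=float)
    expt = np.asarray(expt, dtype=float)
    diff = calc - expt
    return {
        "mean": diff.mean(),                  # systematic error
        "sd": diff.std(ddof=0),               # random error (population SD, an assumption)
        "rmsd": np.sqrt(np.mean(diff ** 2)),  # total error
        "r": np.corrcoef(calc, expt)[0, 1],   # correlation with experiment
    }


def rmsd_from_mean_and_sd(mean, sd):
    """RMSD implied by a mean error and a population SD: sqrt(mean^2 + sd^2)."""
    return float(np.hypot(mean, sd))


# Hypothetical data: a large constant offset plus small scatter.
expt = np.array([-1.8, -8.2, -9.6, -3.4, -6.2])
calc = expt + 12.0 + np.array([0.5, -2.0, 1.5, -0.5, 0.5])

stats = error_statistics(calc, expt)
print({name: round(float(value), 3) for name, value in stats.items()})

# Values annotated in Fig. 1 for B3LYP/gas/CHELPG-DIPOLE/PW.
print(round(rmsd_from_mean_and_sd(12.2, 2.6), 2))
```

A large RMSD combined with a small SD and a high correlation coefficient is the signature of an error that a simple correction term can remove.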

2. Models with empirical corrections

For each combination of methods, the set of coefficients b in formula (15) was fitted using the training set molecules from Table II. Each fitting formula was assessed using the test set (comprising the remaining 19 molecules from the 31-molecule test set). The ten best results with the smallest RMSD on the test set are listed in Table IV. Fitting results for other methods are given in Ref. 94. Comparing Table III (smallest standard deviations) with Table IV (best fitting results), we can see the same set of structure optimization and partial charge methods. It is interesting to note that although the GF formula gives much larger standard deviations than the PW formula, after parametrization it is able to produce results that are almost as good as those for the PW formula. We also note that OPLS2005 force-field calculations combined with the PW formula give good results after fitting (RMSD of about 2 kcal/mol).

The best combination of methods is HF/gas/CHELPG/PW. The calculated values of RISM/PW hydration free energies and the calculated excluded volumes are given in Ref. 94. After fitting, the RMSD value for the 19-molecule test set is less than 1.2 kcal/mol. Differences between the calculated and the experimental hydration free energies for this method are presented in Fig. 2.

In Table V, values of the fitting coefficients b are presented for the HF/gas/CHELPG/PW method. To validate these coefficients, leave-one-out and leave-five-out cross-validations have been carried out, along with a comparison between the coefficients obtained by fitting against either the training set or the full dataset. As one can see, the deviations between the different training sets are quite small. The highest deviations are about 11% for sulfur and phosphorus (these are the rarest elements in the 31-molecule test set). The small relative deviations mean that the formula will not change much if one uses a different training set, i.e., the fitting coefficients are stable.
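The fitting and leave-one-out check described above can be sketched in Python as follows. This is only an illustration under stated assumptions: formula (15) is not reproduced in this section, so the sketch assumes a correction that is linear in a constant, the excluded volume, and atom-type counts; the descriptor values, the "true" coefficients, and the definition of the relative deviation are all hypothetical and are not taken from the paper.

```python
import numpy as np


def design_matrix(features):
    """Prepend a column of ones (constant term) to the descriptor matrix."""
    return np.column_stack([np.ones(len(features)), features])


def fit_correction(features, residual):
    """Least-squares fit of residual = dG_exp - dG_RISM against the descriptors."""
    coeffs, *_ = np.linalg.lstsq(design_matrix(features), residual, rcond=None)
    return coeffs


def leave_one_out_deviation(features, residual):
    """Mean relative change (%) of each coefficient when one molecule is left out.

    The definition of the deviation used here is an assumption.
    """
    full = fit_correction(features, residual)
    n = len(residual)
    changes = []
    for i in range(n):
        keep = np.arange(n) != i
        changes.append(np.abs(fit_correction(features[keep], residual[keep]) - full))
    return 100.0 * np.mean(changes, axis=0) / np.abs(full)


# Hypothetical descriptors for 12 molecules: excluded volume and two atom counts.
volume = np.array([120., 150., 180., 210., 240., 270., 300., 330., 160., 200., 260., 310.])
n_oxygen = np.array([0, 1, 2, 0, 1, 3, 2, 0, 1, 2, 0, 1], dtype=float)
n_nitrogen = np.array([1, 0, 0, 2, 1, 0, 1, 2, 0, 1, 1, 0], dtype=float)
features = np.column_stack([volume, n_oxygen, n_nitrogen])

# Hypothetical "true" coefficients and a small deterministic perturbation.
true_coeffs = np.array([-0.2, -0.05, 1.4, 2.2])
noise = 0.1 * np.sin(np.arange(12))
residual = design_matrix(features) @ true_coeffs + noise

coeffs = fit_correction(features, residual)
deviation = leave_one_out_deviation(features, residual)
print("fitted coefficients:", np.round(coeffs, 3))
print("leave-one-out deviation (%):", np.round(deviation, 2))
```

Small percentage deviations indicate that no single molecule dominates the fit, which is the sense in which the coefficients are called stable.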

TABLE IV. The ten fitting results with smallest RMSDs for the 19-molecule test set.

| QM level | Solvation model | Partial charges | Formula | RMSD (kcal/mol) | R | F-test | t-test |
|---|---|---|---|---|---|---|---|
| HF | Gas | CHELPG | PW | 1.138 | 0.897 | Passed | Passed |
| HF | Gas | CHELPG-DIPOLE | PW | 1.161 | 0.894 | Passed | Passed |
| B3LYP | Gas | CHELPG-DIPOLE | GF | 1.250 | 0.877 | Passed | Passed |
| B3LYP | Gas | CHELPG | GF | 1.270 | 0.871 | Passed | Passed |
| HF | Gas | CHELPG | GF | 1.344 | 0.857 | Passed | Passed |
| B3LYP | Gas | CHELPG-DIPOLE | PW | 1.372 | 0.859 | Passed | Passed |
| HF | Gas | CHELPG-DIPOLE | GF | 1.375 | 0.850 | Passed | Passed |
| B3LYP | Gas | CHELP-DIPOLE | GF | 1.417 | 0.831 | Passed | Passed |
| B3LYP | Gas | CHELPG | PW | 1.434 | 0.846 | Passed | Passed |
| AM1 | Gas | BCC | PW | 1.470 | 0.817 | Passed | Passed |

FIG. 2. The results for the best fitted model (HF/gas/CHELPG/PW). The RMSD on the test set is 1.14 kcal/mol. (Plot of ΔG_corr − ΔG_exp against ΔG, both in kcal/mol; legend: training set, test set.)

B. Comparison with other methods

Hydration free energies predicted by other methods for the 63-molecule SAMPL1 set are given in Refs. 20–22. The trend in these results is that continuum models (which include some fitted parameters) give RMS errors around 2.5 kcal/mol on the SAMPL1 set, while slightly higher RMS errors are reported for explicit solvent approaches. In order to provide a direct comparison to our results, we have used the data given in Refs. 20–22 to recalculate the RMSD obtained by these methods for the 19 molecules of our test set only (Table VI). For the HF/gas/CHELPG/PW method, we obtained an RMSD on the test set of 1.14 kcal/mol, which is almost half of that reported for the continuum models for the same molecules.
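The recalculation of the RMSD on the 19-molecule test set can be reproduced from the columns of Table VI. The Python sketch below transcribes the Expt., RISM, and SMD columns of that table; the function name `rmsd` is introduced here for illustration only. With these inputs the RISM and SMD columns give values that round to the entries in the RMSD row of Table VI (1.108 and 2.65 kcal/mol).

```python
import math

# Columns transcribed from Table VI (kcal/mol), Cup08004 ... Cup08058.
expt = [-1.82, -1.88, -8.21, -9.61, -3.44, -14.01, -9.86, -6.23, -4.82, -2.55,
        -5.99, -6.74, -4.37, -8.43, -7.78, -10.22, -11.14, -6.68, -11.24]
rism = [-3.492, -3.391, -9.878, -9.859, -3.142, -14.178, -9.965, -7.212, -3.924,
        -2.667, -6.086, -6.181, -2.514, -7.292, -7.193, -9.074, -9.266, -7.337,
        -12.923]
smd = [0.70, 0.60, -8.40, -10.90, -4.40, -23.10, -6.80, -8.30, -4.70, -2.30,
       -3.50, -6.30, -7.20, -7.90, -7.60, -11.20, -9.20, -8.10, -11.40]


def rmsd(predicted, reference):
    """Root mean squared deviation between two equally long sequences."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


print(f"RISM (corrected): {rmsd(rism, expt):.3f} kcal/mol")
print(f"SMD:              {rmsd(smd, expt):.2f} kcal/mol")
```

The same function applies to the other columns, provided molecules with missing predictions are dropped from both lists first.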

Of course, such comparisons are not completely fair because the results in Refs. 20–22 were obtained without knowledge of the experimental hydration free energies, while the results in the current paper were fitted to give the best performance on the SAMPL1 subset. However, analysis of the performance of the different methods shows quite reasonable trends: the best performing methods are those which use better levels of QM theory and better RISM hydration free energy expressions (GF and PW). This indicates that good agreement with experiment is not just a random result of statistical fitting but has a physical background. The authors realize that the fitting procedure proposed in this paper needs to be improved and further validated before it can be used for the accurate blind prediction of hydration free energies. However, this paper illustrates a procedure by which an efficient RISM-based method for calculating hydration free energies can be developed.

V. CONCLUSIONS

We have compared the performance of different models based on RISM theory for the calculation of the hydration free energies of druglike molecules. The best models were identified among 128 possible combinations of four different RISM free energy expressions and 32 different sets of molecular geometries and atomic partial charges.

TABLE V. Fitting coefficients for the HF/gas/CHELPG/PW method and their deviations during the leave-one-out, leave-five-out, and training vs full fitting validations.

| Coefficient value | One-left test δ (%) | Five-left test δ (%) | Train vs full fit δ* (%) |
|---|---|---|---|
| −0.233 | 0.904 | 2.108 | 1.396 |
| 0.599 | 3.217 | 7.666 | 5.233 |
| 1.383 | 1.544 | 3.822 | 2.309 |
| 2.193 | 1.431 | 3.461 | 1.531 |
| 1.629 | 1.470 | 3.606 | 7.738 |
| 2.687 | 1.621 | 3.869 | 1.040 |
| 4.867 | 4.150 | 9.686 | 11.564 |
| 4.460 | 2.061 | 5.157 | 11.030 |

TABLE VI. Comparison of hydration free energies for the 19-molecule test set calculated by different methods (kcal/mol).

| Mol. ID | Expt. | RISM | SM6 | SM8 | SMD | Klamt1 | Klamt2 | Sulea1 | Sulea2 | Sulea3 |
|---|---|---|---|---|---|---|---|---|---|---|
| Cup08004 | −1.82 | −3.492 | −0.40 | −0.30 | 0.70 | 0.43 | 0.02 | 0.13 | 0.24 | −1.46 |
| Cup08005 | −1.88 | −3.391 | −0.30 | −0.20 | 0.60 | 0.13 | 0.13 | 0.17 | 0.26 | −1.52 |
| Cup08007 | −8.21 | −9.878 | −6.10 | −6.30 | −8.40 | −7.66 | −8.02 | −9.23 | −9.01 | −7.54 |
| Cup08016 | −9.61 | −9.859 | −12.20 | −12.30 | −10.90 | −10.97 | −11.15 | −8.64 | −8.35 | −9.23 |
| Cup08018 | −3.44 | −3.142 | −3.00 | −2.70 | −4.40 | … | … | −2.33 | −2.88 | −2.06 |
| Cup08020 | −14.01 | −14.178 | −27.00 | −26.30 | −23.10 | −17.61 | −17.59 | −21.53 | −20.67 | −21.59 |
| Cup08025 | −9.86 | −9.965 | −8.00 | −7.90 | −6.80 | −9.46 | −9.46 | −7.63 | −7.73 | −8.08 |
| Cup08028 | −6.23 | −7.212 | −9.70 | −9.60 | −8.30 | −4.54 | −4.54 | −4.12 | −4.26 | −6.60 |
| Cup08030 | −4.82 | −3.924 | −6.30 | −5.60 | −4.70 | −7.34 | −7.34 | −4.47 | −5.11 | −5.01 |
| Cup08033 | −2.55 | −2.667 | −2.10 | −1.80 | −2.30 | −5.91 | −5.91 | −0.62 | −1.08 | −0.77 |
| Cup08041 | −5.99 | −6.086 | −5.40 | −5.10 | −3.50 | −3.81 | −3.97 | −7.04 | −7.01 | −6.94 |
| Cup08043 | −6.74 | −6.181 | −6.50 | −7.90 | −6.30 | −7.65 | −7.65 | −5.84 | −5.86 | −7.51 |
| Cup08045 | −4.37 | −2.514 | −4.10 | −6.80 | −7.20 | −4.71 | −4.71 | −3.19 | −3.53 | −4.44 |
| Cup08047 | −8.43 | −7.292 | −7.10 | −8.30 | −7.90 | −8.15 | −8.17 | −9.36 | −8.71 | −8.44 |
| Cup08048 | −7.78 | −7.193 | −8.50 | −8.60 | −7.60 | −8.94 | −8.94 | −8.40 | −8.20 | −7.95 |
| Cup08050 | −10.22 | −9.074 | −10.00 | −11.10 | −11.20 | −9.74 | −9.74 | −9.91 | −9.14 | −8.68 |
| Cup08052 | −11.14 | −9.266 | −8.90 | −9.60 | −9.20 | −11.27 | −11.27 | −15.67 | −15.35 | −14.47 |
| Cup08053 | −6.68 | −7.337 | −8.40 | −9.40 | −8.10 | −7.38 | −7.63 | −10.00 | −9.34 | −9.26 |
| Cup08058 | −11.24 | −12.923 | −13.80 | −13.10 | −11.40 | … | … | −13.36 | −14.12 | −16.27 |
| RMSD | | 1.108 | 3.40 | 3.30 | 2.65 | 1.76 | 1.73 | 2.53 | 2.33 | 2.45 |

Expt.: Experimental data (Ref. 19).
RISM: HF/gas/CHELPG/PW RISM method (with correction).
SM6, SM8, SMD: SM6, SM8, and SMD models (Ref. 20).
Klamt1: Original prediction (Ref. 21).
Klamt2: Prediction after cross merging (Ref. 21).
Sulea1: Model {(1,0.9),16} (Ref. 22) (supporting information).
Sulea2: Model {(1,0.9),25} (Ref. 22) (supporting information).
Sulea3: Model {(2,1.0),25} (Ref. 22) (supporting information).


The RISM calculations were validated against experimental data taken from the SAMPL1 dataset. Since these data were originally published as part of a blind challenge to calculate hydration free energies, this has permitted us to compare our results with those of the best implicit and explicit solvent approaches.

Although we observe that hydration free energies calculated with RISM theory contain significant absolute errors, for the best methods tested here these are found to be dominated by large systematic errors, while the random errors are considerably smaller. Using the best free energy expression (PW) combined with the best structure determination methods (HF or B3LYP with CHELPG/CHELPG-DIPOLE charges and AM1 with BCC charges), the random errors in the calculated hydration free energies were approximately 2.6 kcal/mol, which is comparable to results obtained by the best implicit and explicit solvent methods. After parametrization using an excluded volume term and simple atom counts, the RMSD calculated by the best model (HF/gas/CHELPG/PW) was less than 1.2 kcal/mol, which is about half the error reported by continuum models for the same molecules.

Hydration free energies calculated by RISM theory have traditionally been considered to be too inaccurate to be useful in practical applications such as pharmaceutical drug design. However, these assumptions have been based on publications that have tested the HNC, HNCB, or related free energy expressions. The results presented here show that the PW or GF expressions allow relatively accurate calculations of hydration free energies, which may be systematically improved by the addition of a small number of simple empirical parameters.

The RISM calculations based on the HNC expression give inaccurate estimates of hydration free energies because they overestimate the energy required to form a cavity in the solvent and underestimate the electrostatic contribution to the hydration free energy of hydrogen bonding sites (Refs. 38, 65, and 66). In principle, it might be possible to eliminate some of these errors through the design of an appropriate bridge function, but this is presently an open problem in the integral equation theory of molecular liquids.

The results presented here indicate that qualitatively correct results obtained by the best RISM expressions can be improved by an empirical fitting procedure to yield very accurate quantitative predictions of hydration free energies. The optimum model (HF/gas/CHELPG/PW) is considerably less computationally expensive than explicit solvent approaches for estimating hydration free energy. The results suggest that, after further development, RISM theory has the potential to be widely beneficial in practical applications such as pharmaceutical drug discovery and drug development.

ACKNOWLEDGMENTS

This work was supported by a grant from the Villum Kahn Rasmussen foundation through a postdoctoral grant to D.S.P. Computations were made possible through grants from the Lundbeck Foundation, the Novo Nordisk Foundation, the Carlsberg Foundation, and from the Danish Center for Scientific Computing. We thank Gennady N. Chuev and Andrey I. Frolov for useful discussions and critical reading of the manuscript. We would also like to acknowledge the support staff of the Max-Planck-Institute for Mathematics in the Sciences and particularly Ms. Valeria Huenniger, Ms. Heike Rackwitz, and Ms. Theresa Petsch for the technical and administrative support of the collaboration with Aarhus University.

APPENDIX: THE RISM-MOL SOLVER

In the current work, the calculations of the RISM solute-solvent correlation functions were performed with the RISM-MOL program, which was developed for fast solution of the RISM integral equations by Fedorov and Sergiievskyi in the Computational Physical Chemistry and Biophysics group of the Max-Planck-Institute for Mathematics in the Sciences.

To solve the RISM equations, the RISM-MOL program uses the Fourier iterative method accelerated by the multigrid technique. It was shown recently that the multigrid method is able to speed up the Fourier iterations for the atomic Ornstein–Zernike equation by up to several dozen times. The same multigrid method has been implemented in the RISM-MOL program for 1D RISM calculations. Using this algorithm, the hydration free energy calculations for the largest molecule in the set (42 atoms) took about 30 s on a single processor core. The average time required for the hydration free energy calculations was 17 s/molecule.

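As the input data, the RISM-MOL solver takes the Cartesian coordinates, the parameters of the Lennard-Jones potential, and the partial charges $q_s$ of the atoms of the solute molecule. The parameters of the solvent molecules, as well as precalculated bulk-solvent correlation functions $h^{\mathrm{bulk}}_{\alpha}(r)$, are embedded in the program. Using the atomic parameters, the site-site interaction potentials between the solute sites $s$ and the solvent sites $\alpha$ are calculated,

$$u_{s\alpha}(r) = u^{\mathrm{C}}_{s\alpha}(r) + u^{\mathrm{LJ}}_{s\alpha}(r), \tag{A1}$$

where $u^{\mathrm{C}}_{s\alpha}(r)$ is the Coulomb potential

$$u^{\mathrm{C}}_{s\alpha}(r) = \frac{q_s q_\alpha}{r} \tag{A2}$$

and $u^{\mathrm{LJ}}_{s\alpha}(r)$ is a Lennard-Jones potential

$$u^{\mathrm{LJ}}_{s\alpha}(r) = 4\epsilon_{s\alpha}\left[\left(\frac{\sigma_{s\alpha}}{r}\right)^{12} - \left(\frac{\sigma_{s\alpha}}{r}\right)^{6}\right]. \tag{A3}$$

The pair Lennard-Jones parameters $\sigma_{s\alpha}$ and $\epsilon_{s\alpha}$ are calculated via the combining rules. By default, the Lorentz–Berthelot rules are used,

$$\sigma_{s\alpha} = \frac{\sigma_s + \sigma_\alpha}{2}, \qquad \epsilon_{s\alpha} = \sqrt{\epsilon_s \epsilon_\alpha}. \tag{A4}$$

Other combining rules can be defined by the user.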
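The construction of the site-site potential from atomic parameters can be sketched in Python as follows. This is a minimal illustration and not the RISM-MOL implementation: all function names are hypothetical, the parameter values in the usage lines are illustrative only, and the sketch assumes distances in Å, energies in kcal/mol, and charges in units of the elementary charge, so that the Coulomb term carries the conversion factor 332.0637 kcal Å mol⁻¹ e⁻².

```python
import math

COULOMB_KCAL_ANGSTROM = 332.0637  # conversion factor for q in e, r in Angstrom (assumed units)


def lorentz_berthelot(sigma_s, eps_s, sigma_a, eps_a):
    """Arithmetic mean for sigma, geometric mean for epsilon."""
    return 0.5 * (sigma_s + sigma_a), math.sqrt(eps_s * eps_a)


def coulomb(r, q_s, q_a):
    """Coulomb interaction between two point charges at distance r."""
    return COULOMB_KCAL_ANGSTROM * q_s * q_a / r


def lennard_jones(r, sigma, eps):
    """12-6 Lennard-Jones potential."""
    x6 = (sigma / r) ** 6
    return 4.0 * eps * (x6 * x6 - x6)


def site_site_potential(r, solute_site, solvent_site):
    """Sum of Coulomb and Lennard-Jones terms for one solute-solvent site pair.

    Each site is a (charge, sigma, epsilon) tuple.
    """
    q_s, sigma_s, eps_s = solute_site
    q_a, sigma_a, eps_a = solvent_site
    sigma, eps = lorentz_berthelot(sigma_s, eps_s, sigma_a, eps_a)
    return coulomb(r, q_s, q_a) + lennard_jones(r, sigma, eps)


# Illustrative (hypothetical) parameters: a weakly charged solute carbon
# and a water-oxygen-like solvent site.
solute_carbon = (0.10, 3.50, 0.066)
solvent_oxygen = (-0.8476, 3.166, 0.1553)

for r in (3.0, 3.5, 4.0, 6.0):
    u = site_site_potential(r, solute_carbon, solvent_oxygen)
    print(f"r = {r:.1f} A, u = {u:.3f} kcal/mol")
```

Swapping in a different combining rule only requires replacing `lorentz_berthelot`, which mirrors the user-defined option mentioned above.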

In the RISM-MOL program, it is possible to vary the number of grids, the number of grid points, the number of iterations, and, hence, the accuracy of the calculation. In the current study, six-grid iterations were used. The final solution was obtained on a grid with 4096 grid points and a step size of 0.05 bohr, with $L$-norm accuracy $\varepsilon = 10^{-4}$.

The fast implementation of the algorithm for the numerical solution of the RISM equations, together with the presented possibilities for accurate hydration free energy calculations, makes the RISM-MOL solver a robust tool for investigating the thermodynamics of solution. Academic users can obtain the program free of charge from Fedorov on request.

1. C. A. Reynolds, P. M. King, and W. G. Richards, Mol. Phys. 76, 251 (1992).
2. P. Kollman, Chem. Rev. (Washington, D.C.) 93, 2395 (1993).
3. G. Perlovich and A. Bauer-Brandl, Curr. Drug Deliv. 1, 213 (2004).
4. G. L. Perlovich, T. V. Volkova, and A. Bauer-Brandl, J. Pharm. Sci. 95, 2158 (2006).
5. G. L. Perlovich, L. K. Hansen, T. V. Volkova, S. Mirza, A. N. Manin, and A. Bauer-Brandl, Cryst. Growth Des. 7, 2643 (2007).
6. L. D. Hughes, D. S. Palmer, F. Nigsch, and J. B. O. Mitchell, J. Chem. Inf. Model. 48, 220 (2008).
7. D. S. Palmer, A. Llinas, I. Morao, G. M. Day, J. M. Goodman, R. C. Glen, and J. B. O. Mitchell, Mol. Pharmacol. 5, 266 (2008).
8. W. L. Jorgensen and J. Tirado-Rives, Perspect. Drug Discovery Des. 3, 123 (1995).
9. N. Matubayasi and M. Nakahara, J. Chem. Phys. 113, 6070 (2000).
10. N. Matubayasi and M. Nakahara, J. Mol. Liq. 119, 23 (2005).
11. M. R. Shirts and V. S. Pande, J. Chem. Phys. 122, 134508 (2005).
12. N. Matubayasi, Front. Biosci. 14, 3536 (2009).
13. J. L. Knight and C. L. Brooks, J. Comput. Chem. 30, 1692 (2009).
14. J. Tomasi and M. Persico, Chem. Rev. (Washington, D.C.) 94, 2027 (1994).
15. B. Roux and T. Simonson, Biophys. Chem. 78, 1 (1999).
16. D. Bashford and D. A. Case, Annu. Rev. Phys. Chem. 51, 129 (2000).
17. J. Tomasi, B. Mennucci, and R. Cammi, Chem. Rev. (Washington, D.C.) 105, 2999 (2005).
18. M. B. Ulmschneider, J. P. Ulmschneider, M. S. P. Sansom, and A. Di Nola, Biophys. J. 92, 2338 (2007).
19. J. P. Guthrie, J. Phys. Chem. B 113, 4501 (2009).
20. A. V. Marenich, C. J. Cramer, and D. G. Truhlar, J. Phys. Chem. B 113, 4538 (2009).
21. A. Klamt, F. Eckert, and M. Diedenhofen, J. Phys. Chem. B 113, 4508 (2009).
22. T. Sulea, D. Wanapun, S. Dennis, and E. O. Purisima, J. Phys. Chem. B 113, 4511 (2009).
23. P. A. Monson and G. P. Morriss, Adv. Chem. Phys. 77, 451 (1990).
24. J.-P. Hansen and I. R. McDonald, Theory of Simple Liquids, 3rd ed. (Academic, London, 1991), http://www.sciencedirect.com/science/book/9780123705358.
25. Molecular Theory of Solvation, edited by F. Hirata (Kluwer Academic, Dordrecht, 2003).
26. L. Blum and A. J. Torruella, J. Chem. Phys. 56, 303 (1972).
27. K. Amano and M. Kinoshita, Chem. Phys. Lett. 488, 1 (2010).
28. D. Chandler and H. C. Andersen, J. Chem. Phys. 57, 1930 (1972).
29. F. Hirata, B. M. Pettitt, and P. J. Rossky, J. Chem. Phys. 77, 509 (1982).
30. B. M. Pettitt and P. J. Rossky, J. Chem. Phys. 77, 1451 (1982).
31. M. Kinoshita, Y. Okamoto, and F. Hirata, J. Comput. Chem. 19, 1724 (1998).
32. M. Kinoshita, Y. Okamoto, and F. Hirata, J. Am. Chem. Soc. 120, 1855 (1998).
33. M. Kinoshita, Y. Okamoto, and F. Hirata, J. Chem. Phys. 110, 4090 (1999).
34. T. Imai, M. Kinoshita, and F. Hirata, Bull. Chem. Soc. Jpn. 73, 1113 (2000).
35. T. Imai, R. Hiraoka, A. Kovalenko, and F. Hirata, J. Am. Chem. Soc. 127, 15334 (2005).
36. N. Yoshida, S. Phongphanphanee, Y. Maruyama, T. Imai, and F. Hirata, J. Am. Chem. Soc. 128, 12042 (2006).
37. N. Yoshida, S. Phongphanphanee, and F. Hirata, J. Phys. Chem. B 111, 4588 (2007).
38. G. Chuev, M. Fedorov, and J. Crain, Chem. Phys. Lett. 448, 198 (2007).
39. M. V. Fedorov and A. A. Kornyshev, Mol. Phys. 105, 1 (2007).
40. G. N. Chuev and M. V. Fedorov, J. Chem. Phys. 131, 074503 (2009).
41. T. Imai, Y. Harano, M. Kinoshita, A. Kovalenko, and F. Hirata, J. Chem. Phys. 126, 225102 (2007).
42. T. Imai, S. Ohyama, A. Kovalenko, and F. Hirata, Protein Sci. 16, 1927 (2007).
43. D. Yokogawa, H. Sato, T. Imai, and S. Sakaki, J. Chem. Phys. 130, 064111 (2009).
44. T. Imai, K. Oda, A. Kovalenko, F. Hirata, and A. Kidera, J. Am. Chem. Soc. 131, 12430 (2009).
45. Y. Kiyota, R. Hiraoka, N. Yoshida, Y. Maruyama, I. Imai, and F. Hirata, J. Am. Chem. Soc. 131, 3852 (2009).
46. K. Nishiyama, T. Yamaguchi, and F. Hirata, J. Phys. Chem. B 113, 2800 (2009).
47. J.-P. Hansen and I. R. McDonald, Theory of Simple Liquids, 4th ed. (Elsevier Academic Press, Amsterdam, The Netherlands, 2000).
48. M. Kinoshita, Y. Okamoto, and F. Hirata, J. Comput. Chem. 18, 1320 (1997).
49. A. Kovalenko and F. Hirata, J. Phys. Chem. B 103, 7942 (1999).
50. A. Kovalenko and F. Hirata, J. Chem. Phys. 110, 10095 (1999).
51. L. Lue and D. Blankschtein, J. Phys. Chem. 96, 8582 (1992).
52. H. J. C. Berendsen, J. R. Grigera, and T. P. Straatsma, J. Phys. Chem. 91, 6269 (1987).
53. F. Hirata and P. J. Rossky, Chem. Phys. Lett. 83, 329 (1981).
54. P. H. Lee and G. M. Maggiora, J. Phys. Chem. 97, 10175 (1993).
55. A. Kovalenko and F. Hirata, J. Chem. Phys. 113, 2793 (2000).
56. G. N. Chuev and M. V. Fedorov, J. Comput. Chem. 25, 1369 (2004).
57. G. N. Chuev and M. V. Fedorov, J. Chem. Phys. 120, 1191 (2004).
58. M. V. Fedorov and G. N. Chuev, J. Mol. Liq. 120, 159 (2005).
59. M. V. Fedorov, H. J. Flad, G. N. Chuev, L. Grasedyck, and B. N. Khoromskij, Computing 80, 47 (2007).
60. W. L. Jorgensen, D. S. Maxwell, and J. Tirado-Rives, J. Am. Chem. Soc. 118, 11225 (1996).
61. G. A. Kaminski, R. A. Friesner, J. Tirado-Rives, and W. L. Jorgensen, J. Phys. Chem. B 105, 6474 (2001).
62. M. V. Fedorov and W. Hackbusch, "A multigrid solver for the integral equations of the theory of liquids," Preprint No. 88 (Max-Planck-Institut fuer Mathematik in den Naturwissenschaften, 2008).
63. W. Hackbusch, Multi-Grid Methods and Applications (Springer-Verlag, Berlin, 1985).
64. S. J. Singer and D. Chandler, Mol. Phys. 55, 621 (1985).
65. S. Ten-no, J. Chem. Phys. 115, 3724 (2001).
66. K. Sato, H. Chuman, and S. Ten-no, J. Phys. Chem. B 109, 17290 (2005).
67. D. Chandler, Y. Singh, and D. M. Richardson, J. Chem. Phys. 81, 1975 (1984).
68. S. Ten-no and S. Iwata, J. Chem. Phys. 111, 4865 (1999).
69. C. M. Cortis, P. J. Rossky, and R. A. Friesner, J. Chem. Phys. 107, 6400 (1997).
70. Q. H. Du, D. Beglov, and B. Roux, J. Phys. Chem. B 104, 796 (2000).
71. A. Kovalenko, F. Hirata, and M. Kinoshita, J. Chem. Phys. 113, 9830 (2000).
72. T. Luchko, S. Gusarov, D. R. Roe, C. Simmerling, D. A. Case, J. Tuszynski, and A. Kovalenko, J. Chem. Theory Comput. 6, 607 (2010).
73. S. Genheden, T. Luchko, S. Gusarov, A. Kovalenko, and U. Ryde, J. Phys. Chem. B 114, 8505 (2010).
74. Schrödinger LLC (2008), SCHRODINGER SUITE 2008, MAESTRO Version 8.5, MACROMODEL Version 9.6.
75. W. C. Still, A. Tempczyk, R. C. Hawley, and T. Hendrickson, J. Am. Chem. Soc. 112, 6127 (1990).
76. R. Krishnan, J. S. Binkley, R. Seeger, and J. A. Pople, J. Chem. Phys. 72, 650 (1980).
77. E. Cancès, B. Mennucci, and J. Tomasi, J. Chem. Phys. 107, 3032 (1997).
78. B. Mennucci and J. Tomasi, J. Chem. Phys. 106, 5151 (1997).
79. M. Cossi, N. Rega, G. Scalmani, and V. Barone, J. Comput. Chem. 24, 669 (2003).
80. V. Barone and M. Cossi, J. Phys. Chem. A 102, 1995 (1998).
81. M. J. Frisch, G. W. Trucks, H. B. Schlegel et al., GAUSSIAN 03, Gaussian, Inc., Wallingford, CT, 2004.
82. L. E. Chirlian and M. M. Francl, J. Comput. Chem. 8, 894 (1987).
83. C. M. Breneman and K. B. Wiberg, J. Comput. Chem. 11, 361 (1990).
84. B. H. Besler, K. M. Merz, and P. A. Kollman, J. Comput. Chem. 11, 431 (1990).
85. U. C. Singh and P. A. Kollman, J. Comput. Chem. 5, 129 (1984).
86. A. E. Reed, R. B. Weinstock, and F. Weinhold, J. Chem. Phys. 83, 735 (1985).
87. J. J. P. Stewart, MOPAC 6.00, Fujitsu Limited, Tokyo, Japan.
88. J. J. P. Stewart, J. Comput.-Aided Mol. Des. 4, 1 (1990).
89. J. M. Wang, W. Wang, P. A. Kollman, and D. A. Case, J. Mol. Graphics Modell. 25, 247 (2006).
90. J. M. Wang, R. M. Wolf, J. W. Caldwell, P. A. Kollman, and D. A. Case, J. Comput. Chem. 25, 1157 (2004).
91. A. Jakalian, D. B. Jack, and C. I. Bayly, J. Comput. Chem. 23, 1623 (2002).
92. J. G. Kirkwood and F. P. Buff, J. Chem. Phys. 19, 774 (1951).
93. K. Knight, Mathematical Statistics (CRC, Boca Raton, FL, 2000), p. 502.
94. See supplementary material at http://dx.doi.org/10.1063/1.3458798 for the results of the RISM calculations, results of the statistical analysis of the calculations, results of the fitting, and brief analysis of the computational performance of the algorithm.
95. For B3LYP calculations in vacuum with CHELP-DIPOLE charges and Gaussian fluctuation (GF) RISM free energy formula.

044104-11

RISM ⌬G of druglike molecules J. Chem. Phys. 133, 044104 共2010兲

On The Effect of Mutations in Bovine or Camel Chymosin on the Thermodynamics of Binding κ -Caseins

Article

Oct 2017
PROTEINS

Bovine and camel chymosins are aspartic proteases that are used in dairy food manufacturing. Both enzymes catalyse proteolysis of a milk protein, κ-casein, which helps to initiate milk coagulation. Surprisingly, camel chymosin shows a 70% higher clotting activity than bovine chymosin for bovine milk, while exhibiting only 20% of the unspecific proteolytic activity. By contrast, bovine chymosin is a poor coagulant for camel milk. Although both enzymes are marketed commercially, the disparity in their catalytic activity is not yet well understood at a molecular level, due in part to a lack of atomistic resolution data about the chymosin - κ-casein complexes. Here, we report computational alanine scanning calculations of all four chymosin - κ-casein complexes, allowing us to elucidate the influence that individual residues have on binding thermodynamics. Of the 12 sequence differences in the binding sites of bovine and camel chymosin, eight are shown to be particularly important for understanding differences in the binding thermodynamics (Asp112Glu, Lys221Val, Gln242Arg, Gln278Lys. Glu290Asp, His292Asn, Gln294Glu, and Lys295Leu. Residue in bovine chymosin written first). The relative binding free energies of single-point mutants of chymosin are calculated using the molecular mechanics three dimensional reference interaction site model (MM-3DRISM). Visualisation of the solvent density functions calculated by 3DRISM reveals the difference in solvation of the binding sites of chymosin mutants. This article is protected by copyright. All rights reserved.

Multi-Solvent Models for Solvation Free Energy Predictions Using 3D-RISM Hydration Thermodynamic Descriptors

Preprint

Feb 2020

The potential to predict Solvation Free Energies (SFEs) in any solvent using a machine learning (ML) model based on thermodynamic output, extracted from 3D-RISM simulations in water is investigated. The models on multiple solvents take into account both the solute and solvent description and offer the possibility to predict SFEs of any solute in any solvent with root mean squared errors less than 1 kcal/mol. Validations that involve exclusion of fractions or clusters of the solutes or solvents exemplify the model’s capability to predict SFEs of novel solutes and solvents with diverse chemical profiles. In addition to being predictive, our models can identify the solute and solvent features that influence SFE predictions. Furthermore, using 3D-RISM hydration thermodynamic output to predict SFEs in any organic solvent reduces the need to run 3D-RISM simulations in all these solvents. Altogether, our multi-solvent models for SFE predictions that take advantage of the solvation effects are expected to have an impact in the property prediction space.

Analysis of molecular dynamics simulations of 10-residue peptide, chignolin, using statistical mechanics: Relaxation mode analysis and three-dimensional reference interaction site model theory

Article

Full-text available

Nov 2019

Molecular dynamics simulation is a fruitful tool for investigating the structural stability, dynamics, and functions of biopolymers at an atomic level. In recent years, simulations can be performed on time scales of the order of milliseconds using specialpurpose systems. Since the most stable structure, as well as meta-stable structures and intermediate structures, is included in trajectories in long simulations, it is necessary to develop analysis methods for extracting them from trajectories of simulations. For these structures, methods for evaluating the stabilities, including the solvent effect, are also needed. We have developed relaxation mode analysis to investigate dynamics and kinetics of simulations based on statistical mechanics. We have also applied the three-dimensional reference interaction site model theory to investigate stabilities with solvent effects. In this paper, we review the results for designing amino-acid substitution of the 10-residue peptide, chignolin, to stabilize the misfolded structure using these developed analysis methods. Fullsize Image

A molecular reconstruction approach to site-based 3D-RISM and comparison to GIST hydration thermodynamic maps in an enzyme active site

Article

Full-text available

Jul 2019
PLOS ONE

Computed, high-resolution, spatial distributions of solvation energy and entropy can provide detailed information about the role of water in molecular recognition. While grid inhomogeneous solvation theory (GIST) provides rigorous, detailed thermodynamic information from explicit solvent molecular dynamics simulations, recent developments in the 3D reference interaction site model (3D-RISM) theory allow many of the same quantities to be calculated in a fraction of the time. However, 3D-RISM produces atomic-site, rather than molecular, density distributions, which are difficult to extract physical meaning from. To overcome this difficulty, we introduce a method to reconstruct molecular density distributions from atomic-site density distributions. Furthermore, we assess the quality of the resulting solvation thermodynamics density distributions by analyzing the binding site of coagulation Factor Xa with both GIST and 3D-RISM. We find good qualitative agreement between the methods for oxygen and hydrogen densities as well as direct solute-solvent energetic interactions. However, 3D-RISM predicts lower energetic and entropic penalties for moving water from the bulk to the binding site.

Solvation in atomic liquids: Connection between Gaussian field theory and density functional theory

Article

Full-text available

Aug 2017

For the problem of molecular solvation, formulated as a liquid submitted to the external potential field created by a molecular solute of arbitrary shape dissolved in that solvent, we draw a connection between the Gaussian Field Theory derived by David Chandler [Phys. Rev. E, 48, 2898 (1993)] and classical Density Functional Theory. We show that Chandler's results concerning the solvation of a hard core of arbitrary shape can be recovered by either minimising a linearised HNC functional using an auxiliary Lagrange multiplier field to impose a vanishing density inside the core, or by minimising this functional directly outside the core --indeed a simpler procedure. Those equivalent approaches are compared to two other variants of DFT, either in the HNC, or partially linearised HNC approximation, for the solvation of a Lennard-Jones solute of increasing size in a Lennard-Jones solvent. Compared to Monte-Carlo simulations, all those theories give acceptable results for the inhomogeneous solvent structure, but are completely out-of-range for the solvation free-energies. This can be fixed in DFT by adding a hard-sphere bridge correction to the HNC functional.

Multi-Solvent Models for Solvation Free Energy Predictions using 3D-RISM Hydration Thermodynamic Descriptors

Article

Apr 2020

The potential to predict Solvation Free Energies (SFEs) in any solvent using a machine learning (ML) model based on thermodynamic output, extracted exclusively from 3D-RISM simulations in water is investigated. The models on multiple solvents take into account both the solute and solvent description and offer the possibility to predict SFEs of any solute in any solvent with root mean squared errors less than 1 kcal/mol. Validations that involve exclusion of fractions or clusters of the solutes or solvents exemplify the model’s capability to predict SFEs of novel solutes and solvents with diverse chemical profiles. In addition to being predictive, our models can identify the solute and solvent features that influence SFE predictions. Furthermore, using 3D-RISM hydration thermodynamic output to predict SFEs in any organic solvent reduces the need to run 3D-RISM simulations in all these solvents. Altogether, our multi-solvent models for SFE predictions that take advantage of the solvation effects are expected to have an impact in the property prediction space.

Recent Developments in Integral Equation Theory for Solvation to Treat Density Inhomogeneity at Solute–Solvent Interface

Article

May 2019

The integral equation theory (IET) provides highly efficient tools for the calculation of structural and thermodynamic properties of molecular liquids. In recent years, the 3D reference interaction site model (3DRISM), the most developed IET for solvation, has been widely applied to study protein solvation, aggregation, and drug‐receptor binding. However, hydrophobic solutes of sufficient size (>nm) can induce water density depletion at the solute–solvent interface. This density depletion is not considered in the original 3DRISM theory. The authors here review the recent developments of 3DRISM at hydrophobic surfaces and related theories to address this challenge. At hydrophobic surfaces, an additional hydrophobicity‐induced density inhomogeneity equation is introduced to 3DRISM theory to account for this density depletion. Accordingly, several new closure equations, including the D2 and D2MSA closures, are developed to enable stable numerical solutions of the 3DRISM equations. These newly developed theories hold great promise for an accurate and rapid calculation of the solvation effect for complex molecular systems such as proteins. At the end of the report, the authors also provide a perspective on other challenges of the IETs as an efficient solvation model.

Statistical efficiency of methods for computing free energy of hydration

Article

Oct 2018

The hydration free energy (HFE) is a critical property for predicting and understanding chemical and biological processes in aqueous solution. There are a number of computational methods to derive HFE, generally classified as equilibrium or non-equilibrium methods, based on the type of calculations used. In the present study, we compute the hydration free energies of 34 small, neutral, organic molecules with experimental HFE between +2 and -16 kcal/mol. The one-sided non-equilibrium methods Jarzynski Forward (JF) and Backward (JB), the two-sided non-equilibrium methods Jarzynski mean based on the average of JF and JB, Crooks Gaussian Intersection (CGI), and the Bennett Acceptance Ratio (BAR) are compared to the estimates from the two-sided equilibrium method Multistate Bennett Acceptance Ratio (MBAR), which is considered the reference method for HFE calculations, and to experimental data from the literature. Our results show that the estimated hydration free energies from all the methods are consistent with MBAR results, and all methods provide a mean absolute error of ∼0.8 kcal/mol and a root mean square error of ∼1 kcal/mol for the 34 organic molecules studied. In addition, the results show that the one-sided methods JF and JB result in systematic deviations that cannot be corrected entirely. The statistical efficiency ε of the different methods can be expressed as one over the product of the simulation time and the average variance in the HFE. From such an analysis, we conclude that ε(MBAR) > ε(BAR) ≈ ε(CGI) > ε(JX), where JX is any of the Jarzynski methods. In other words, the non-equilibrium methods tested here for the prediction of HFE have lower computational efficiency than the MBAR method.
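
The efficiency definition quoted in this abstract, one over the product of simulation time and average variance, can be illustrated with a minimal Python sketch. The class name `MethodStats`, the function names, the units, and all numbers below are hypothetical and chosen only for illustration; they are not results from the cited study.

```python
from dataclasses import dataclass


@dataclass
class MethodStats:
    """Hypothetical container for one free-energy method."""
    name: str
    sim_time: float   # total simulation time (assumed unit: ns)
    variance: float   # average variance of the HFE estimate (assumed unit: (kcal/mol)^2)


def statistical_efficiency(sim_time: float, variance: float) -> float:
    """Return epsilon = 1 / (simulation time * average variance)."""
    if sim_time <= 0 or variance <= 0:
        raise ValueError("simulation time and variance must be positive")
    return 1.0 / (sim_time * variance)


def rank_by_efficiency(methods: list[MethodStats]) -> list[str]:
    """Return method names sorted from most to least statistically efficient."""
    ordered = sorted(
        methods,
        key=lambda m: statistical_efficiency(m.sim_time, m.variance),
        reverse=True,
    )
    return [m.name for m in ordered]


# Made-up inputs: same cost for A and B, but B has half the variance;
# C has the same variance as A at twice the cost.
example = [
    MethodStats("A", sim_time=10.0, variance=0.5),
    MethodStats("B", sim_time=10.0, variance=0.25),
    MethodStats("C", sim_time=20.0, variance=0.5),
]
ranking = rank_by_efficiency(example)
```

With this definition, halving the variance at fixed cost doubles ε, and doubling the cost at fixed variance halves it, which is why a method can be ranked lower even when its final error is comparable.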

Comparative Molecular Field Analysis Using Molecular Integral Equation Theory

Article

Mar 2018

Recently, Güssregen et al. used solute-solvent distribution functions calculated by the three-dimensional Reference Interaction Site Model (3DRISM) in a 3D quantitative structure-activity relationship (QSAR) approach to model activity data for a set of serine protease inhibitors; this approach was referred to as Comparative Analysis of 3D RISM Maps (CARMa) [J. Chem. Inf. Model. 2017, 57, 1652-1666]. Here we extend this idea by introducing probe atoms into the 3DRISM solvent model in order to directly capture other molecular interactions in addition to those related to hydration/dehydration. Benchmark results for six different protein-ligand systems show that CARMa models trained on probe atom descriptors give consistently more accurate predictions than Comparative Molecular Field Analysis (CoMFA) and other common QSAR approaches.

Application of reference-modified density functional theory: Temperature and pressure dependences of solvation free energy

Article

Nov 2017

Recently, we proposed a reference-modified density functional theory (RMDFT) to calculate solvation free energy (SFE), in which a hard-sphere fluid was introduced as the reference system instead of an ideal molecular gas. Through the RMDFT, using an optimal diameter for the hard-sphere reference system, the values of the SFE calculated at room temperature and normal pressure were in good agreement with those for more than 500 small organic molecules in water as determined by experiments. In this study, we present an application of the RMDFT for calculating the temperature and pressure dependences of the SFE for solute molecules in water. We demonstrate that the RMDFT has high predictive ability for the temperature and pressure dependences of the SFE for small solute molecules in water when the optimal reference hard-sphere diameter determined for each thermodynamic condition is used. We also apply the RMDFT to investigate the temperature and pressure dependences of the thermodynamic stability of an artificial small protein, chignolin, and discuss the mechanism of high-temperature and high-pressure unfolding of the protein. © 2017 Wiley Periodicals, Inc.

Natural Population Analysis

Article

Jul 1985
CHEM PHYS

A method of 'natural population analysis' has been developed to calculate atomic charges and orbital populations of molecular wave functions in general atomic orbital basis sets. The natural analysis is an alternative to conventional Mulliken population analysis, and seems to exhibit improved numerical stability and to better describe the electron distribution in compounds of high ionic character, such as those containing metal atoms. An ab initio calculation is conducted of SCF-MO wave functions for compounds of type CH3X and LiX (X = F, OH, NH2, CH3, BH2, BeH, Li, H) in a variety of basis sets to illustrate the generality of the method, and to compare the natural populations with results of Mulliken analysis, density integration, and empirical measures of ionic character. Natural populations are found to give a satisfactory description of these molecules, providing a unified treatment of covalent and extreme ionic limits at modest computational cost.

A multigrid solver for the integral equations of the theory of liquids

Technical Report

Jan 2008

In this article we present a new multigrid algorithm to solve the Ornstein-Zernike type integral equations of the theory of liquids. This approach is based on ideas coming from the multigrid methods for numerical solutions of integral equations (see §16 in [13]). We describe this method in a general manner as a 'template' for construction of efficient multilevel iterations for numerical solution of the integral equations in the theory of liquids. We report on several numerical experiments to illustrate the effectiveness of the method. The algorithm is tested on a model problem: a simple monoatomic fluid with a continuous short-ranged potential. The tests have indicated that the method sufficiently accelerates the convergence of the numerical solution in all considered cases. AMS Subject Classification: 65R99, 45G15. PACS numbers: 02.60.Nm, 61.20.Ne, 61.20.Gy. Key words: Ornstein-Zernike equation, integral equations theory of liquids, multigrid methods.

Theory of Simple Liquids

Article

Jan 1986

Free Energy Calculations: Applications to Chemical and Biochemical Phenomena

Article

Nov 1993

Peter Kollman

Self-Consistent Molecular Orbital Methods. XX. A Basis Set for Correlated Wave Functions

Article

Jan 1980

A contracted Gaussian basis set (6‐311G∗∗) is developed by optimizing exponents and coefficients at the Møller–Plesset (MP) second‐order level for the ground states of first‐row atoms. This has a triple split in the valence s and p shells together with a single set of uncontracted polarization functions on each atom. The basis is tested by computing structures and energies for some simple molecules at various levels of MP theory and comparing with experiment.

Optimized Cluster Expansion for Classical Fluids. II: Theory of Molecular Liquids

Article

Sep 1972

The optimized cluster expansion methods developed in the first article of this series (I) are generalized to apply to molecular fluids. These methods make use of summations of ring and chain cluster diagrams. The summations are performed explicitly for certain classes of molecular models. The molecules in these classes contain several "interaction sites," and the total interaction between two molecules is a sum of site-site potentials that depend on the scalar distances between sites on the two molecules. The principal results of this work are computationally simple techniques for calculating the thermodynamic properties and pair correlation functions of molecular fluids in which the intermolecular interactions are highly angular dependent. The techniques should be reliable since they arise from the same approximations that have been shown to be very accurate when applied to simple fluids.

Salt Effect on Stability and Solvation Structure of Peptide: An Integral Equation Study

Article

May 2000

Salt effects on the stability and on the solvation structure of a peptide in a variety of aqueous solutions of the alkali halide ions are studied by means of the reference interaction site model (RISM) theory. The order of salt effect on the peptide stability is consistent with the experimental results; the order follows the Hofmeister series. The results are further analyzed in order to clarify the nature of the salt effect which determines the Hofmeister series and to find the reason why the Hofmeister series applies so generally to a variety of solutes in aqueous solutions. A heuristic model for explaining salt effects on the solvation structure of the peptide is proposed based on changes in the peptide-water pair correlation functions due to the ion perturbation.

Free Energy Functions in the Extended RISM Approximation

Article

Jun 1985

It is shown that the free energies associated with the solutions of extended RISM integral equations can be obtained in closed form thus avoiding the necessity of numerical coupling parameter integrations. In addition, variational principles are deduced which provide a basis for efficient algorithms to solve extended RISM integral equations.
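
The abstract above does not state the closed-form expression itself. For orientation, the form usually quoted in the RISM literature for the hypernetted-chain (HNC) closure is sketched below. The notation is an assumption and is not taken from the abstract: ρ is the solvent number density, k_B T the thermal energy, s and α index solute and solvent sites, and h and c are the site-site total and direct correlation functions.

```latex
\Delta\mu_{\mathrm{HNC}}
  = 4\pi \rho\, k_{B} T
    \sum_{s=1}^{N_{\mathrm{solute}}} \sum_{\alpha=1}^{N_{\mathrm{solvent}}}
    \int_{0}^{\infty}
    \left[
      \tfrac{1}{2}\, h_{s\alpha}^{2}(r)
      - c_{s\alpha}(r)
      - \tfrac{1}{2}\, h_{s\alpha}(r)\, c_{s\alpha}(r)
    \right] r^{2}\, \mathrm{d}r
```

Because the integrand depends only on the converged correlation functions, the free energy follows from a single solution of the integral equations, which is the sense in which numerical coupling parameter integration is avoided.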

An approach to the solvation free energy in terms of the distribution functions of the solute–solvent interaction energy

Article

May 2005
J MOL LIQ

The energy representation of the molecular configuration in a dilute solution is introduced to express the solvent distribution around the solute over a one-dimensional coordinate specifying the solute–solvent interaction energy. On the basis of the energy representation, an approximate functional for the solvation free energy of a solute in solution is constructed by adopting the Percus-Yevick-type approximation in the unfavorable region of the solute–solvent interaction and the hypernetted-chain-type approximation in the favorable region. The solvation free energy is then given exactly to second order with respect to the solvent density and to the solute–solvent interaction. It is demonstrated that the solvation free energies of nonpolar, polar, and ionic solutes in water are evaluated accurately and efficiently from the single functional over a wide range of thermodynamic conditions. The extension to a flexible solute molecule is straightforward. The applicability of the method is illustrated for solute molecules with a stretching or torsional degree of freedom.

An extended RISM equation for polar fluids

Article

Oct 1981
CHEM PHYS LETT

The RISM integral equation is extended to molecules with charged sites via a renormalization of the Coulomb potentials and the introduction of appropriate closure relations. For a fluid of diatomics with atomic charges of ±0.2 e the equation yields site-site correlation functions in qualitative agreement with those from computer simulation.
