Molecular Surface Recognition: Determination of Geometric Fit Between Proteins and Their Ligands by Correlation Techniques


A geometric recognition algorithm was developed to identify molecular surface complementarity. It is based on a purely geometric approach and takes advantage of techniques applied in the field of pattern recognition. The algorithm involves an automated procedure including (i) a digital representation of the molecules (derived from atomic coordinates) by three-dimensional discrete functions that distinguish between the surface and the interior; (ii) the calculation, using Fourier transformation, of a correlation function that assesses the degree of molecular surface overlap and penetration upon relative shifts of the molecules in three dimensions; and (iii) a scan of the relative orientations of the molecules in three dimensions. The algorithm provides a list of correlation values indicating the extent of geometric match between the surfaces of the molecules; each of these values is associated with six numbers describing the relative position (translation and rotation) of the molecules. The procedure is thus equivalent to a six-dimensional search but much faster by design, and the computation time is only moderately dependent on molecular size. The procedure was tested and validated by using five known complexes for which the correct relative position of the molecules in the respective adducts was successfully predicted. The molecular pairs were the α and β subunits of deoxyhemoglobin and of methemoglobin, tRNA synthetase-tyrosinyl adenylate, aspartic proteinase-peptide inhibitor, and trypsin-trypsin inhibitor. A more realistic test was performed with the last two pairs by using the structures of uncomplexed aspartic proteinase and trypsin inhibitor, respectively. The results are indicative of the extent of conformational changes in the molecules tolerated by the algorithm.
Proc. Natl. Acad. Sci. USA
Vol. 89, pp. 2195-2199, March 1992
Biophysics

(protein-protein interaction/surface complementarity/macromolecular complex prediction/molecular docking)

EPHRAIM KATCHALSKI-KATZIR†‡, ISAAC SHARIV§, MIRIAM EISENSTEIN¶, ASHER A. FRIESEM§, CLAUDE AFLALO‖, AND ILYA A. VAKSER†

Departments of †Membrane Research and Biophysics, §Electronics, ¶Structural Biology, and ‖Biochemistry, Weizmann Institute of Science, Rehovot 76100, Israel

Contributed by Ephraim Katchalski-Katzir, October 24, 1991
The association of proteins with their ligands involves intricate inter- and intramolecular interactions, solvation effects, and conformational changes. In view of such complexity, a comprehensive and efficient approach for predicting the formation of protein-ligand complexes from the structure of their free components is not yet available. However, with some assumptions, such predictions become feasible, and several attempts based on energy minimization have been partially successful (1-6).

Another simplifying approach that could alleviate some of these difficulties is based on geometric considerations. The three-dimensional (3D) structures of most protein complexes reveal a close geometric match between those parts of the respective surfaces of the protein and the ligand that are in contact. Indeed, the shape and other physical characteristics of the surfaces largely determine the nature of the specific molecular interactions in the complex. Furthermore, in many cases the 3D structure of the components in the complex closely resembles that of the molecules in their free, native state. Geometric matching thus seems to play an important role in determining the structure of a complex.

Several investigators have exploited a geometric approach to find shape complementarity between a given protein and its ligand (7-19). They considered geometric match between molecular surfaces as a fundamental condition for the formation of a specific complex and pointed out the advantages of the geometric approach (13). In this approach, which treats proteins as rigid bodies, the complementarity between surfaces is estimated. Furthermore, the geometric analysis could serve as the foundation for a more complete approach including energy considerations. However, the methods heretofore developed for analyzing geometric matching do not seem to simultaneously fulfill the requirements for generality, accuracy, reliability, and reasonable computation time.

In this paper, we present a geometry-based algorithm for predicting the structure of a possible complex between molecules of known structures. This relatively simple and straightforward algorithm relies on the well-established correlation and Fourier transformation techniques used in the field of pattern recognition. The algorithm requires only that the 3D structure of the molecules under consideration be known. Moreover, it provides quantitative data related to the quality of the contact between the molecules. The algorithm was tested and validated in the analysis of the following complexes, whose structures are known: the α-β hemoglobin dimer, tRNA synthetase-tyrosinyl adenylate, aspartic proteinase-peptide inhibitor, and trypsin-trypsin inhibitor. The correct relative position of the molecules within these complexes was successfully predicted.
Abbreviations: 3D, three dimensional; DFT, discrete Fourier transform; IFT, inverse Fourier transform.
‡To whom reprint requests should be addressed.
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

METHOD

Geometric Recognition Algorithm. We begin with a geometric description of the protein and the ligand molecules, derived from their known atomic coordinates. The two molecules, denoted by a and b, are projected onto a three-dimensional grid of N × N × N points, where they are represented by the discrete functions

$$a_{l,m,n} = \begin{cases} 1 & \text{inside the molecule} \\ 0 & \text{outside the molecule,} \end{cases} \tag{1a}$$

and

$$b_{l,m,n} = \begin{cases} 1 & \text{inside the molecule} \\ 0 & \text{outside the molecule,} \end{cases} \tag{1b}$$

where l, m, and n are the indices of the 3D grid (l, m, n = {1 ... N}). Any grid point is considered inside the molecule if there is at least one atom nucleus within a distance r from it, where r is of the order of van der Waals atomic radii. Examples for two-dimensional cross sections of these functions are presented in Fig. 1 a and b.
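The projection of Eq. 1 can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' FORTRAN program: the function name `voxelize`, the centering of the molecule in the grid box, and the two-atom toy input are all assumptions made here for the example.

```python
import numpy as np

def voxelize(coords, n=32, eta=1.0, r=1.8):
    """Return an n x n x n array that is 1 where a grid point lies within
    distance r of at least one atom and 0 elsewhere (Eq. 1).

    coords: (k, 3) atomic coordinates; eta: grid step (same length units).
    """
    coords = np.asarray(coords, dtype=float)
    # Assumption: place the centroid of the molecule at the center of the box.
    center = coords.mean(axis=0)
    axis = (np.arange(n) - (n - 1) / 2.0) * eta
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    grid = np.zeros((n, n, n), dtype=np.int8)
    r2 = r * r
    for x, y, z in coords - center:
        d2 = (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2
        grid[d2 <= r2] = 1  # grid point is "inside the molecule"
    return grid

# Hypothetical toy input: two atoms 1.5 length units apart.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
a = voxelize(atoms, n=16, eta=1.0, r=1.8)
```

Looping over atoms keeps memory use at one grid-sized temporary per atom, which is adequate for a sketch; a production version would restrict each update to the small sub-box around the atom.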
Next, to distinguish between the surface and the interior of each molecule, we retain the value of 1 for the grid points along a thin surface layer only and assign other values to the internal grid points. The resulting functions thus become

$$\bar{a}_{l,m,n} = \begin{cases} 1 & \text{on the surface of the molecule} \\ \rho & \text{inside the molecule} \\ 0 & \text{outside the molecule,} \end{cases} \tag{2a}$$

and

$$\bar{b}_{l,m,n} = \begin{cases} 1 & \text{on the surface of the molecule} \\ \delta & \text{inside the molecule} \\ 0 & \text{outside the molecule,} \end{cases} \tag{2b}$$

where the surface is defined here as a boundary layer of finite width between the inside and the outside of the molecule. The parameters ρ and δ describe the value of the points inside the molecules, and all points outside are set to zero. Two-dimensional cross sections of these functions are shown in Figs. 1 c and d.
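One way to obtain the functions of Eq. 2 from the 0/1 grids of Eq. 1 is to peel off a boundary layer by binary erosion and relabel what remains as interior. The text does not say how the surface layer was computed, so the 6-neighbour erosion below, the helper names `erode` and `mark_surface`, and the `layers` parameter are assumptions used only for illustration.

```python
import numpy as np

def erode(mask):
    """One step of 6-neighbour binary erosion on a boolean 3D array."""
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    return (p[1:-1, 1:-1, 1:-1]
            & p[:-2, 1:-1, 1:-1] & p[2:, 1:-1, 1:-1]
            & p[1:-1, :-2, 1:-1] & p[1:-1, 2:, 1:-1]
            & p[1:-1, 1:-1, :-2] & p[1:-1, 1:-1, 2:])

def mark_surface(grid, interior_value, layers=1):
    """Map a 0/1 grid (Eq. 1) to Eq. 2: 1 on the surface layer,
    interior_value (rho or delta) inside, 0 outside."""
    inside = grid.astype(bool)
    interior = inside.copy()
    for _ in range(layers):          # each erosion thickens the surface by one grid step
        interior = erode(interior)
    out = np.zeros(grid.shape, dtype=float)
    out[inside] = 1.0                # provisional: everything inside is surface
    out[interior] = interior_value   # overwrite the eroded core
    return out

# Toy molecule: a 5 x 5 x 5 block of grid points in a 9 x 9 x 9 grid.
cube = np.zeros((9, 9, 9), dtype=np.int8)
cube[2:7, 2:7, 2:7] = 1
a_bar = mark_surface(cube, interior_value=-15.0, layers=1)
```

With a grid step of about 1 Å, `layers=2` would give a surface roughly 2 Å thick, which is in the range the implementation section quotes for molecule a.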
In our method, matching of surfaces is accomplished by calculating correlation functions. The correlation between the discrete functions $\bar{a}$ and $\bar{b}$ is defined as

$$\bar{c}_{\alpha,\beta,\gamma} = \sum_{l=1}^{N} \sum_{m=1}^{N} \sum_{n=1}^{N} \bar{a}_{l,m,n} \cdot \bar{b}_{l+\alpha,\,m+\beta,\,n+\gamma}, \tag{3}$$

where α, β, and γ are the number of grid steps by which molecule b is shifted with respect to molecule a in each dimension. If the shift vector {α,β,γ} is such that there is no contact between the two molecules (see Fig. 2a), the correlation value is zero. If there is a contact between the surfaces (Fig. 2b), the contribution to the correlation value is positive. Nonzero correlation values could also be obtained when one molecule penetrates into the other (Fig. 2c). Since such penetration is physically forbidden, a distinction between surface contact and penetration must be clearly formulated. To do so, we assign large negative values to ρ in $\bar{a}$ and small nonnegative values to δ in $\bar{b}$. Thus, when the shift vector {α,β,γ} is such that molecule b penetrates molecule a, the multiplication of the negative numbers (ρ) in $\bar{a}$ by the positive numbers (1 or δ) in $\bar{b}$ results in a negative contribution to the overall correlation value. Consequently, the correlation value for each displacement is simply the score for overlapping surfaces corrected by the penalty for penetration. Positive correlation values are obtained when the contribution from surface contact outweighs that from penetration. Thus, a good geometric match (such as in Fig. 2d) is represented by a high positive peak, and low values reflect a poor match between the molecules.

FIG. 1. Typical cross sections through the 3D grid representations of the molecules. (a) Cross section (at l = 46) through the function $a_{l,m,n}$ derived by projecting the α subunit of hemoglobin (from 2HHB; see text) onto a 3D grid (N = 90). The values 0 and 1 are represented in white and black, respectively. (b) The cross section $b_{46,m,n}$ was similarly derived for the β subunit (from 2HHB). Other details are as in a. (c) The cross section (at l = 46) through the function $\bar{a}_{l,m,n}$, which was obtained by distinguishing the surface layer from the interior of the molecule in the function $a_{l,m,n}$. The large negative value for ρ is represented in gray. (d) Cross section $\bar{b}_{46,m,n}$, similarly derived from $b_{l,m,n}$. The small positive value for δ is represented in a different shade of gray. The values for r and η were 1.8 Å and 1.2 Å, respectively.

FIG. 2. Different relative positions of molecules a and b, illustrated by the cross sections $\bar{a}_{46,m,n}$ and $\bar{b}_{46,m,n}$ from Fig. 1. The relative orientation of the molecules is as in the known α-β dimer. (a) No contact. (b) Limited contact. (c) Penetration. The penetrated area is represented in black. (d) Good geometric match, as indicated by the extensive overlap of complementary surface layers.

A cross section of a typical correlation function for a good match is presented in Fig. 3. The coordinates of the prominent peak denote the relative shift of molecule b yielding a good match with molecule a. The location of the recognition sites on the surface of each molecule can readily be determined from these coordinates. In addition, the width of the peak provides a measure for the relative displacement allowed before matching is lost.
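The scoring behaviour of Eq. 3 (zero without contact, positive for surface contact, negative for penetration) can be checked on a toy grid with a brute-force evaluation. The sketch below is illustrative only; it assumes that shifted indices wrap around the grid (a periodic grid), which is harmless when the molecules are surrounded by empty space, and the one-voxel "ligand" is a hypothetical test object.

```python
import numpy as np

def correlate_direct(a_bar, b_bar):
    """Eq. 3 by brute force. Indices wrap around the grid (assumption)."""
    n = a_bar.shape[0]
    c = np.zeros((n, n, n))
    for alpha in range(n):
        for beta in range(n):
            for gamma in range(n):
                # Rolling by negative shifts gives
                # shifted[l, m, n] = b_bar[l + alpha, m + beta, n + gamma].
                shifted = np.roll(b_bar, (-alpha, -beta, -gamma), axis=(0, 1, 2))
                c[alpha, beta, gamma] = np.sum(a_bar * shifted)
    return c

rho = -15.0
a_bar = np.zeros((6, 6, 6))
a_bar[1:4, 1:4, 1:4] = 1.0   # surface layer of a 3 x 3 x 3 toy "molecule"
a_bar[2, 2, 2] = rho         # its single interior point
b_bar = np.zeros((6, 6, 6))
b_bar[0, 0, 0] = 1.0         # one-voxel toy "ligand"

c = correlate_direct(a_bar, b_bar)
# Shifts that put the ligand on the surface of a score +1, the shift that
# puts it on the interior point scores rho, and all other shifts score 0.
```

The triple loop over shifts, each with a full-grid sum, is exactly the N⁶ cost that motivates the Fourier route.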
FIG. 3. Cross section (at α = 0) through a 3D correlation function $\bar{c}_{\alpha,\beta,\gamma}$. The correlation function shown was calculated for the α and β subunits of hemoglobin, oriented as in the dimer (from 2HHB, see Figs. 1 c and d). The correlation value at each shift vector {0,β,γ} is represented by the height of the graph. The prominent peak at {α = 0, β = 14, γ = 17} corresponds to the correct match between the molecules (see Fig. 2d). Other intermolecular surface contacts (such as in Fig. 2b) give rise to the low positive correlation values around the center of the graph. The negative correlation values caused by penetration (see Fig. 2c) are omitted, leaving the empty area at the center.

A direct calculation of the correlation between the two functions (see Eq. 3) is rather lengthy, since it involves N³ multiplications and additions for each of the N³ possible relative shifts {α,β,γ}, resulting in an order of N⁶ computing steps. Therefore, we chose to take advantage of Fourier transformation that allowed us to calculate the correlation function much more rapidly. The discrete Fourier transform (20) (DFT) of a function $x_{l,m,n}$ is defined as

$$X_{o,p,q} = \sum_{l=1}^{N} \sum_{m=1}^{N} \sum_{n=1}^{N} \exp[-2\pi i(ol + pm + qn)/N] \cdot x_{l,m,n}, \tag{4}$$

where o, p, q = {1 ... N} and $i = \sqrt{-1}$. The application of this transformation to both sides of Eq. 3 yields (21)

$$C_{o,p,q} = A^{*}_{o,p,q} \cdot B_{o,p,q}, \tag{5}$$

where C and B are the DFT of the functions $\bar{c}$ and $\bar{b}$, respectively, and A* is the complex conjugate of the DFT of $\bar{a}$. Eq. 5 indicates that the transformed correlation function C is obtained by a simple multiplication of the two functions A* and B. The inverse Fourier transform (20) (IFT), defined as

$$\bar{c}_{\alpha,\beta,\gamma} = \frac{1}{N^{3}} \sum_{o=1}^{N} \sum_{p=1}^{N} \sum_{q=1}^{N} \exp[2\pi i(o\alpha + p\beta + q\gamma)/N] \cdot C_{o,p,q}, \tag{6}$$

is used to obtain the desired correlation between the two original functions $\bar{a}$ and $\bar{b}$. The Fourier transformations can be performed with the fast Fourier transform algorithm (20), which requires less than the order of N³ ln(N³) steps for transforming a 3D function of N × N × N values. Thus, the overall procedure leading to Eq. 6 is significantly faster than the direct calculation of $\bar{c}$ according to Eq. 3.
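Eqs. 4-6 map directly onto NumPy's FFT routines. The sketch below is a hedged illustration rather than the original Veclib-based code: it uses `numpy.fft.fftn` and `numpy.fft.ifftn` with 0-based indices, and it checks the result against a brute-force evaluation of Eq. 3 on small random test grids. Both function names are chosen here for the example, and the comparison assumes the periodic (wrap-around) indexing that the DFT implies.

```python
import itertools
import numpy as np

def correlate_fft(a_bar, b_bar):
    """Correlation of Eq. 3 computed through Eqs. 4-6."""
    A_conj = np.conj(np.fft.fftn(a_bar))   # Eq. 4, then complex conjugate
    B = np.fft.fftn(b_bar)                 # Eq. 4
    C = A_conj * B                         # Eq. 5
    return np.fft.ifftn(C).real            # Eq. 6 (imaginary part is round-off)

def correlate_direct(a_bar, b_bar):
    """Brute-force Eq. 3 with periodic indexing, for comparison only."""
    n = a_bar.shape[0]
    c = np.zeros((n, n, n))
    for shift in itertools.product(range(n), repeat=3):
        shifted = np.roll(b_bar, tuple(-s for s in shift), axis=(0, 1, 2))
        c[shift] = np.sum(a_bar * shifted)
    return c

rng = np.random.default_rng(0)
a_bar = rng.integers(-2, 3, size=(4, 4, 4)).astype(float)
b_bar = rng.integers(0, 2, size=(4, 4, 4)).astype(float)

c_fft = correlate_fft(a_bar, b_bar)
c_ref = correlate_direct(a_bar, b_bar)
agree = np.allclose(c_fft, c_ref)
# Shift vector of the highest correlation value:
best_shift = np.unravel_index(np.argmax(c_fft), c_fft.shape)
```

To put the step counts in perspective for the scan-stage grid quoted later in the paper (N = 90): N⁶ ≈ 5.3 × 10¹¹, whereas N³ ln(N³) ≈ 9.8 × 10⁶.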
Finally, to complete a general search for a match between the surfaces of molecules a and b, the correlation function $\bar{c}$ has to be calculated for all relative orientations of the molecules. In practice, molecule a is fixed, whereas the three Euler angles defining the orientation of molecule b (xyz convention in ref. 22) are varied at fixed intervals of Δ degrees. This results in a complete scan of 360 × 360 × 180/Δ³ orientations for which the correlation function $\bar{c}$ must be calculated.

The entire procedure described above can be summarized by the following steps:
(i) derive $\bar{a}$ from atomic coordinates of molecule a (Eq. 2),
(ii) A* = [DFT($\bar{a}$)]* (Eq. 4),
(iii) derive $\bar{b}$ from atomic coordinates of molecule b (Eq. 2),
(iv) B = DFT($\bar{b}$) (Eq. 4),
(v) C = A*·B (Eq. 5),
(vi) $\bar{c}$ = IFT(C) (Eq. 6),
(vii) look for a sharp positive peak of $\bar{c}$,
(viii) rotate molecule b to a new orientation,
(ix) repeat steps iii-viii and end when the orientations scan is completed, and
(x) sort all of the peaks by their height.

Each high and sharp peak found by this procedure indicates geometric match and thus represents a potential complex. The relative position and orientation of the molecules within each such complex can readily be derived from the coordinates of the correlation peak, and from the three Euler angles at which the peak was found.
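The ten steps above can be sketched as a short loop. This is an illustrative outline under stated assumptions, not the authors' program: `b_bar_for` is a hypothetical callback that returns the Eq. 2 grid of molecule b for a given triple of Euler angles (rotation and re-gridding are left to the caller), the assignment of the 180° range to the third angle in `euler_grid` is a guess, and step vii is reduced to taking the single highest value per orientation rather than testing for peak sharpness.

```python
import numpy as np

def euler_grid(delta):
    """All Euler-angle triples at steps of delta degrees
    (which angle spans 180 degrees is an assumption)."""
    return [(x, y, z)
            for x in range(0, 360, delta)
            for y in range(0, 360, delta)
            for z in range(0, 180, delta)]

def scan_orientations(a_bar, b_bar_for, orientations, top=10):
    """Return the `top` highest (score, shift, angles) triples."""
    A_conj = np.conj(np.fft.fftn(a_bar))                 # steps i-ii, done once
    peaks = []
    for angles in orientations:                          # steps viii-ix
        b_bar = b_bar_for(angles)                        # step iii
        C = A_conj * np.fft.fftn(b_bar)                  # steps iv-v
        c = np.fft.ifftn(C).real                         # step vi
        shift = np.unravel_index(np.argmax(c), c.shape)  # step vii (simplified)
        peaks.append((float(c[shift]), tuple(int(s) for s in shift), angles))
    peaks.sort(key=lambda p: p[0], reverse=True)         # step x
    return peaks[:top]

# Toy check: "molecule b" is a copy of a, displaced by one grid step, at
# orientation (0, 0, 0), and empty at every other orientation.
a_bar = np.zeros((8, 8, 8))
a_bar[2:5, 2:5, 2:5] = 1.0

def b_bar_for(angles):
    if angles == (0, 0, 0):
        return np.roll(a_bar, (1, 0, 0), axis=(0, 1, 2))
    return np.zeros_like(a_bar)

top = scan_orientations(a_bar, b_bar_for, [(0, 0, 0), (90, 0, 0)], top=2)
```

Because molecule a is fixed, its transform is computed once outside the loop; each orientation then costs one forward and one inverse transform.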
Implementation of the Algorithm. To implement our algorithm, it is necessary to assign specific values to the various parameters involved, i.e., the surface layer thickness, r, Δ, ρ, δ, N, and the grid step size denoted by η. The choice of these values is based on a number of considerations, outlined in this section.

We begin by noting that the match between the functions a and b is not perfect. One reason is that the structure of known complexes reveals small gaps between the molecules, which are also reflected in their mathematical representation. Furthermore, the functions a and b are derived from atomic coordinate sets that do not include hydrogen atoms. This, in addition to the limited accuracy of the coordinates, may affect the quality of the match. Finally, minor conformational changes may occur at the surface of molecules upon complex formation (locally induced fit). Such changes are not incorporated in the functions a and b when they represent native molecules that are assumed to be rigid. Therefore, penetration and small gaps occur along the contact area.

To ensure that the correct match between molecules is not missed, our algorithm must be able to tolerate these imperfections. This is achieved by assigning more than one layer of grid points to the surface in $\bar{a}$ so that the surface thickness for molecule a is 1.5-2.5 Å (see Fig. 1c). Consequently, penetrations and gaps that are smaller than these values are tolerated. It should be noted that an inherent drawback in the choice of a thicker surface layer is the concomitant increase in the number of faulty matches.

The thickness of the surface layer also influences the angular tolerance. This tolerance is defined as the maximal deviation from the correct match orientation that would still result in a distinct correlation peak. Typically, a surface layer thickness of 2 Å yielded an angular tolerance of about ±10°. Thus, the angular step Δ was set to 20°, resulting in 2916 different orientations of molecule b at each of which the correlation function had to be evaluated.
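The figure of 2916 orientations follows from the scan size 360 × 360 × 180/Δ³ given in the Method section, evaluated at Δ = 20°:

$$\frac{360}{20} \times \frac{360}{20} \times \frac{180}{20} = 18 \times 18 \times 9 = 2916.$$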
The parameter r, used to derive the functions $a_{l,m,n}$ and $b_{l,m,n}$ (see Eq. 1), was set to 1.8 Å, which is larger by about 0.2 Å than the average van der Waals radius for carbon, nitrogen, and oxygen. This compensated for the fact that hydrogen atoms, missing in the coordinate sets, are not projected on our grids.

The parameters ρ and δ, representing the interior of the molecules, were set to -15 and 1, respectively. This ensures that the correlation value is substantially reduced in case of penetration. Several other choices for ρ and δ, in the ranges ρ << -1 and 0 ≤ δ ≤ 1, did not significantly affect the performance of the algorithm.

Another important parameter of the algorithm is the grid step size, η. Optimal results were obtained when η was set to 0.7-0.8 Å, corresponding to half of the carbon-carbon bond length. Yet, since the product η·N should be larger than the size of any potential complex, a finer grid requires a larger number of points N. This leads in turn to excessive computation time. Therefore, we performed an initial scan of the angular orientations with larger grid steps (η = 1.0-1.2 Å); thus, computations that would take days with the finer grid were performed in hours. However, with such large grid steps, spurious correlation peaks, which may even be higher than the correct peak, appear. Hence, the scan stage was followed by a discrimination stage, in which the correlation functions were recalculated with a finer grid (η = 0.7-0.8 Å), but only for those orientations that yielded the highest peaks in the scan stage. This discrimination stage enhanced the correct correlation peak and suppressed spurious peaks.
A FORTRAN program was developed for implementing the algorithm. The parameters of the program, in accordance with the arguments given above, were assigned the following values: r = 1.8 Å, Δ = 20°, ρ = -15, δ = 1, N = 90 (η = 1.0-1.2 Å) for the scan stage, and N = 128 (η = 0.7-0.8 Å) for the discrimination stage. The program was run on a Convex C-220 computer with the Veclib fast Fourier transform subroutine. The computation time for each iteration (steps iii-viii in the summarized algorithm) in the scan stage was 9 sec. The total computation time for matching two molecules in the range of 1100 atoms each, including both the initial scan and the discrimination stage, was typically 7.5 hr.
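These figures can be cross-checked with simple arithmetic. The edge of the grid box is the product η·N, which for the quoted parameter ranges is

$$90 \times (1.0\text{-}1.2\ \text{Å}) = 90\text{-}108\ \text{Å} \quad \text{(scan stage)}, \qquad 128 \times (0.7\text{-}0.8\ \text{Å}) = 89.6\text{-}102.4\ \text{Å} \quad \text{(discrimination stage)},$$

so both stages cover a box of roughly the same physical size. For the timing, 2916 orientations at 9 sec each give

$$2916 \times 9\ \text{s} = 26{,}244\ \text{s} \approx 7.3\ \text{hr},$$

which accounts for nearly all of the reported 7.5 hr. The remaining 0.2 hr or so is presumably the discrimination stage plus overhead; that split is an inference from the reported totals, not a figure given in the text.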
RESULTS

Our algorithm was applied to several known complexes, whose coordinates are given in the Brookhaven Protein Data Bank (Brookhaven National Laboratory, Upton, NY), to test its ability to predict correct structures of protein complexes. We chose complexes that represent a wide variety of relative sizes for molecules a and b (30-2500 atoms). These are two hemoglobin variants: human deoxyhemoglobin (23) (designated 2HHB) and horse methemoglobin (24) (designated 2MHB), representing naturally occurring heterodimers; and three complexes: tRNA synthetase-tyrosinyl adenylate (25) (designated 3TS1), aspartic proteinase-peptide inhibitor (26) (designated 3APR), and trypsin-trypsin inhibitor (27) (designated 2PTC). In these tests, the component molecules were treated as separate entities by using their respective atomic coordinates within the complex. Additional tests were performed with native aspartic proteinase (28) and its peptide inhibitor (designated 2APR) and with trypsin and native trypsin inhibitor (29) (designated 4PTI). The relative position of the molecules yielding the best geometric fit in a complex, as determined by the algorithm, was finally compared with the corresponding known complex.

The results are summarized in Fig. 4. It shows histograms of 10 correlation peaks for each pair of molecules. The left side of each panel presents the highest 10 peaks obtained at the scan stage, whereas the right side shows the peaks reevaluated for the same 10 orientations in the discrimination stage. As evident from the figure, the correlation peak for the known complex (shaded) is not necessarily the highest in the scan stage. However, the highest peak that was obtained after discrimination represents the right orientation and position of molecule b with respect to a, and it is significantly higher than the other peaks.
FIG. 4. Correlation results for different pairs of molecules. The pairs are identified by their respective codes (see text). In each panel, the histogram on the left shows the 10 highest correlation peaks obtained in the scan stage (η = 1.0-1.2 Å), sorted by their score. Each of these peaks was obtained at a different relative orientation of the molecules and corresponds to a potential geometric match. The shaded peak in each histogram corresponds to the known complex between the molecules considered. The histogram on the right side of each panel shows the scores obtained at the discrimination stage (η = 0.7-0.8 Å), for the 10 orientations singled out in the scan stage. Note that in the discrimination stage the spurious peaks (plain) are suppressed, whereas the correct peak (shaded) becomes prominent.

Application of the algorithm to the α and β subunits of human hemoglobin (2HHB in Fig. 4a) revealed that the highest peak at the scan stage (score 312) corresponds to the well-known α-β dimer. In the horse methemoglobin variant, however (2MHB in Fig. 4b), the correct position for the dimer is represented by the third peak (score 290) in the sorted histogram for the scan stage. Nevertheless, both these peaks became predominant at the discrimination stage (scores 302 and 347 for 2HHB and 2MHB, respectively). The hemoglobin molecules contain two α-β dimers symmetrically arranged so that each α subunit is in contact with two β subunits. The algorithm should thus yield, in principle, two major correlation peaks for the interaction between α and β subunits. The first, mentioned above, corresponds to the tight contact between the subunits of the α-β dimer, and the other corresponds to the looser contact between the α subunit of one dimer with the β subunit of the other. This second expected peak (not shown) was rather low (scores 190 and 178 for 2HHB and 2MHB, respectively), so it was not included among the 10 peaks in the scan stage. However, it was enhanced upon recalculation with the finer grid (scores 260 and 185, respectively), in contrast with the spurious peaks, which were all reduced. The relation between the extent of geometric fit in these two associations may reflect the well-known higher stability for the interdimer association.
Next, we applied the algorithm to the tRNA synthetase-tyrosinyl adenylate pair (3TS1 in Fig. 4c), which served as an example for a complex between a high molecular weight protein and a small ligand. In this case the correlation peak, which corresponds to the correct position of the ligand in the complex, was not the highest one at the scan stage. However, discrimination yielded the expected result, i.e., the correct orientation was associated with a peak distinctly higher than the other peaks.
Further assessment of the procedure was carried out by analyzing the complex between aspartic proteinase and its peptide inhibitor (3APR in Fig. 4). This system illustrates a case in which the structure of the protein in the complex closely resembles that of the native protein (26, 28). It is thus possible to look for the best match between the structure of the complexed peptide and the protein, either in its complexed (3APR) or native (2APR) structure. With the complexed protein, the correct relative position of the ligand yielded the highest peak already at the scan stage (Fig. 4d), whereas with the native protein, the peak describing the correct position was only the fourth in the sorted list (Fig. 4e). However, the hierarchy of the peaks changed markedly in the discrimination stage, where the highest correlation peak indicated a structure closely resembling that of the known complex. When the native protein is used, the correlation peaks at both stages are somewhat lower than the corresponding ones for the protein in the complex, indicating a slightly poorer fit.

Analysis of the complex trypsin-trypsin inhibitor (2PTC in Fig. 4) was chosen because the native structure of one of the components, the inhibitor, differs from that in the complex. Specifically, conformational changes involving the side chains of three amino acids, located in the binding site of the inhibitor, occur upon complex formation (27, 29). When the structure of the inhibitor in the complex was used (Fig. 4f), the highest peak after discrimination corresponded to the correct position of the inhibitor in the complex. However, when the native structure of the inhibitor (4PTI) was used (Fig. 4g), the algorithm did not yield a distinct correlation peak either in the scan stage or in the discrimination stage. This result indicates that the extent of the conformational change occurring at the surface of the inhibitor upon binding to trypsin exceeds that tolerated by the algorithm.
CONCLUSION

Our geometry-based algorithm predicts the structure of complexes formed between the two constituent molecules by using their atomic coordinates, without any prior information as to their binding sites. The molecular surfaces need not undergo transformation except a simple 3D digitization; thus, all the surface geometric features are fully preserved within the accuracy of the grid step size. The values chosen for the parameters of the algorithm are general and do not have to be readjusted for each molecular pair. Our algorithm exploits Fourier transformation and correlation techniques, so that all possible associations between the molecules are evaluated much more rapidly than the equivalent exhaustive search in six dimensions. Another important feature of the algorithm is that the computation time is approximately proportional to k ln(k), where k is the number of atoms in the complex. Consequently, the increase in computation time with larger molecules is moderate.

We tested our algorithm on five known complexes, for which the correct structure of the complex was predicted from the atomic coordinates of the component molecules within the complex. A test carried out using the coordinates of native aspartic proteinase (see Fig. 4e) also resulted in the prediction of the correct known complex structure. However, when the algorithm was applied to trypsin and its native inhibitor, no distinct match was found (see Fig. 4g). This is most likely due to the known conformational change in the trypsin inhibitor binding site upon complex formation (27, 29) (see also refs. 4, 18, and 19). The results of our tests indicate that as long as the conformational changes are small, the algorithm may be used successfully to predict the structure of hitherto unknown complexes from the structure of two known components.

Further enhancements of the algorithm are presently being developed to introduce some physical features to the molecular interface, such as surface charges and degrees of hydrophobicity.
We thank I. Steinberg for helpful discussions and A. Heimrath and D. Revacha for technical assistance. M.E. acknowledges support from the Kimmelman Center for biomolecular structure and assembly; C.A. and I.A.V. thank the Ministry of Absorption and "Fondation RASCHI" for partial financial support; and I.S. thanks the Ministry of Science and Technology for support.
1. Wodak, S. J. & Janin, J. (1978) J. Mol. Biol. 124, 323-342.
2. Goodford, P. J. (1985) J. Med. Chem. 28, 849-857.
3. Billeter, M., Havel, T. F. & Kuntz, I. D. (1987) Biopolymers 26, 777-793.
4. Warwicker, J. (1989) J. Mol. Biol. 206, 381-395.
5. Goodsell, D. S. & Olson, A. J. (1990) Proteins 8, 195-202.
6. Yue, S.-Y. (1990) Protein Eng. 4, 177-184.
7. Greer, J. & Bush, B. L. (1978) Proc. Natl. Acad. Sci. USA 75, 303-307.
8. Kuntz, I. D., Blaney, J. M., Oatley, S. J., Langridge, R. & Ferrin, T. E. (1982) J. Mol. Biol. 161, 269-288.
9. Zielenkiewicz, P. & Rabczenko, A. (1984) J. Theor. Biol. 111, 17-30.
10. Zielenkiewicz, P. & Rabczenko, A. (1985) J. Theor. Biol. 116, 607-612.
11. Fanning, D. W., Smith, J. A. & Rose, G. D. (1986) Biopolymers 25, 863-883.
12. Novotny, J., Handschumacher, M., Haber, E., Bruccoleri, R. E., Carlson, W. B., Fanning, D. W., Smith, J. A. & Rose, G. D. (1986) Proc. Natl. Acad. Sci. USA 83, 226-230.
13. Connolly, M. L. (1986) Biopolymers 25, 1229-1247.
14. DesJarlais, R. L., Sheridan, R. P., Seibel, G. L., Dixon, J. S., Kuntz, I. D. & Venkataraghavan, R. (1988) J. Med. Chem. 31, 722-729.
15. Chirgadze, Y., Kurochkina, N. & Nikonov, S. (1989) Protein Eng. 3, 105-110.
16. Lewis, R. A. & Dean, P. M. (1989) Proc. R. Soc. London Ser. B 236, 141-162.
17. Wang, H. (1991) J. Comput. Chem. 12, 746-750.
18. Jiang, F. & Kim, S. H. (1991) J. Mol. Biol. 219, 79-102.
19. Shoichet, B. K. & Kuntz, I. D. (1991) J. Mol. Biol. 221, 327-346.
20. Elliott, D. F. & Rao, K. R. (1982) in Fast Transforms: Algorithms, Analyses, Applications (Academic, Orlando, FL), pp. 58-90.
21. Papoulis, A. (1962) in The Fourier Integral and Its Applications (McGraw-Hill, New York), pp. 244-245.
22. Goldstein, H. (1980) in Classical Mechanics (Addison-Wesley, Reading, MA), p. 608.
23. Fermi, G., Perutz, M. F., Shaanan, B. & Fourme, R. (1984) J. Mol. Biol. 175, 159-174.
24. Ladner, R. C., Heidner, E. G. & Perutz, M. F. (1977) J. Mol. Biol. 114, 385-414.
25. Brick, P., Bhat, T. N. & Blow, D. M. (1989) J. Mol. Biol. 208, 83-98.
26. Suguna, K., Padlan, E. A., Smith, C. W., Carlson, W. D. & Davies, D. R. (1987) Proc. Natl. Acad. Sci. USA 84, 7009-7013.
27. Marquart, M., Walter, J., Deisenhofer, J., Bode, W. & Huber, R. (1983) Acta Crystallogr. Sect. B 39, 480-490.
28. Suguna, K., Bott, R. R., Padlan, E. A., Subramanian, E., Sheriff, S., Cohen, G. H. & Davies, D. R. (1987) J. Mol. Biol. 196, 877-900.
29. Wlodawer, A., Deisenhofer, J. & Huber, R. (1987) J. Mol. Biol. 193, 145-156.
... To ensure the accuracy of the docking results, the protein was prepared with AutoDockTools-1.5.7 [78]: water molecules were manually removed from the protein and polar hydrogens were added. The GRAMM docking web server was used for protein-protein docking [79,80]. The resulting protein-protein complex was also manually optimized by removing water and adding polar hydrogens with AutoDockTools-1.5.7. ...
Article
In females, the pathophysiological mechanism of poor ovarian response (POR) is not fully understood. Considering the expression level of p62 was significantly reduced in the granulosa cells (GCs) of POR patients, this study focused on identifying the role of the selective autophagy receptor p62 in conducting the effect of follicle-stimulating hormone (FSH) on antral follicles (AFs) formation in female mice. The results showed that p62 in GCs was FSH responsive and that its level increased to a peak and then decreased time-dependently either in ovaries or in GCs after gonadotropin induction in vivo. GC-specific deletion of p62 resulted in subfertility, a significantly reduced number of AFs and irregular estrous cycles, which were the same as the pathophysiological symptoms of POR. By conducting mass spectrometry analysis, we found the ubiquitination of proteins was decreased, and autophagic flux was blocked in GCs. Specifically, the level of nonubiquitinated Wilms tumor 1 homolog (WT1), a transcription factor and negative controller of GC differentiation, increased steadily. Co-IP results showed that p62 deletion increased the level of ubiquitin-specific peptidase 5 (USP5), which blocked the ubiquitination of WT1. Furthermore, a joint analysis of RNA-seq and the spatial transcriptome sequencing data showed the expression of steroid metabolic genes and FSH receptors pivotal for GCs differentiation decreased unanimously. Accordingly, the accumulation of WT1 in GCs deficient of p62 decreased steroid hormone levels and reduced FSH responsiveness, while the availability of p62 in GCs simultaneously ensured the degradation of WT1 through the ubiquitin‒proteasome system and autophagolysosomal system. Therefore, p62 in GCs participates in GC differentiation and AF formation in FSH induction by dynamically controlling the degradation of WT1. The findings of the study contribute to further study of the pathology of POR.
... The default hydrogen bond distance is 2.5 Å, but when the two forces are weak, the distance is set to 3.0 Å, and when the two forces are strong, the distance is set to 2.0 Å. Then, the docking server (GRAMM) was used for protein-protein docking [66,67]. The obtained protein-protein complexes were also optimized with AutoDockTools-1.5.7 for dehydration and hydrogenation. ...
Article
PRMT5, a type II arginine methyltransferase, is involved in transcriptional regulation, RNA processing and other biological processes and signal transduction. Secondary metabolites are vital pharmacological compounds in Ganoderma lucidum, and their content is an important indicator for evaluating the quality of G. lucidum. Here, we found that GlPRMT5 negatively regulates the biosynthesis of secondary metabolites. In further in-depth research, GlPP2C1 (a type 2C protein phosphatase) was identified as an interacting protein of GlPRMT5 by immunoprecipitation-mass spectrometry (IP-MS). Further mass spectrometry detection revealed that GlPRMT5 symmetrically dimethylates the arginine 99 (R99) and arginine 493 (R493) residues of GlPP2C1 to weaken its activity. The symmetrical dimethylation modification of the R99 residue is the key to affecting GlPP2C1 activity. Symmetrical demethylation-modified GlPP2C1 does not affect the interaction with GlPRMT5. In addition, silencing GlPP2C1 clearly reduced GA content, indicating that GlPP2C1 positively regulates the biosynthesis of secondary metabolites in G. lucidum. In summary, this study reveals the molecular mechanism by which GlPRMT5 regulates secondary metabolites, and these studies provide further insights into the target proteins of GlPRMT5 and symmetric dimethylation sites. Furthermore, these studies provide a basis for the mutual regulation between different epigenetic modifications.
... To study the competition observed between R4_64 and R6_5 in Fig 5, putative tertiary structures of the variable regions of the sequences reported in this study were generated using in silico methods [49–54] (S4.1 contains detailed methodology in S4 File) and compared to the crystallographic structure of the subunit of hCG (1HRP.PDB [6]) (S4.2 Table in S4 File). The variable regions of the in-silico folded aptamers are of similar size to the β subunit of hCG: they have similar volumes (12.6×10³ Å³ for hCG's β subunit, compared to an average volume of 10.6 ± 0.3 Å³ for the predicted tertiary structures of sequences R4_1, R4_64, R5_4, and R6_5) and similar surface areas (6.5×10³ Å², compared to 5.8 ± 0.2 ×10³ Å² for the protein's β subunit and folded aptamers' variable regions, respectively). ...
Article
Human chorionic gonadotropin (hCG) is a glycoprotein hormone used as a biomarker for several medical conditions, including pregnancy, trophoblastic and nontrophoblastic cancers. Most commercial hCG tests rely on a combination of antibodies, one of which is usually specific to the C-terminal peptide of the β-subunit. However, cleavage of this region in many hCG degradation variants prevents rapid diagnostic tests from quantifying all hCG variants in serum and urine samples. An epitope contained within the core fragment, β1, represents an under-researched opportunity for developing immunoassays specific to most variants of hCG. In the study described here, we report on a SELEX procedure tailored towards the identification of two pools of aptamers, one specific to the β-subunit of hCG and another to the β1 epitope within it. The described SELEX procedure utilized antibody-blocked targets, which is an underutilized strategy to exert negative selection pressure and in turn direct aptamer enrichment to a specific epitope. We report on the first aptamers, designated as R4_64 and R6_5, each capable of recognising two distinct sites of the hCG molecule—the β-subunit and the (presumably) β1 epitope, respectively. This study therefore presents a new SELEX approach and the generation of novel aptamer sequences that display potential hCG-specific biorecognition.
... Even if the structural distinctions between the bound and free forms are just marginally different, the accuracy gap raises the possibility that the rigidity assumption may not be totally supported. Additionally, using straightforward scoring criteria like the assessment of surface complementarity (Katchalski-Katzir et al., 1992), surface area accessible to solvent (SASA) burial, solvation free energy, electrostatic interaction energy, or total molecular dynamics energy, it is impossible to distinguish between structures that are close to native and those that are far from it (Shoichet et al., 1991). ...
Article
Molecular docking is a routinely employed tool in computer-aided structure-based rational drug design. It evaluates how well the ligands, or small molecules, and the target molecule fit together. In order to predict how minute molecules will interact with a target protein whose 3D structure is known, a programme called Auto Dock Tools (ADT) was developed. In this docking study, the ligand position within the enzyme binding site and the binding energy may both be visualised. It can be utilised to create novel medications and comprehend how binding works. The heterocyclic nitrogen-containing compound quinazoline, which is a constituent of many synthetic molecules, can be produced via a variety of synthetic methods. Quinazoline and quinazolinone scaffolds have caught the interest of medicinal chemists for the development of novel medications or therapeutic prospects due to their distinct pharmacological features. In addition to its diverse applications, quinazoline has anticancer, antimicrobial, anticonvulsant, and antihyperlipidemic properties. The pharmacological activity and molecular docking studies of quinazoline scaffolds are summarised in this article. The review also helps to hasten the drug development process by identifying the potential contribution of these hybridised pharmacophoric traits to the manifestation of various pharmacological actions.
Article
Excessive neuroinflammation after spinal cord injury (SCI) is a major hurdle during nerve repair. Although proinflammatory macrophage/microglia-mediated neuroinflammation plays important roles, the underlying mechanism that triggers neuroinflammation and aggravating factors remain unclear. The present study identified a proinflammatory role of semaphorin3C (SEMA3C) in immunoregulation after SCI. SEMA3C expression level peaked 7 days post-injury (dpi) and decreased by 14 dpi. In vivo and in vitro studies revealed that macrophages/microglia expressed SEMA3C in the local microenvironment, which induced neuroinflammation and conversion of proinflammatory macrophage/microglia. Mechanistic experiments revealed that RAGE/NF-κB was a downstream target of SEMA3C. Inhibiting SEMA3C-mediated RAGE signaling considerably suppressed proinflammatory cytokine production and reversed polarization of macrophages/microglia shortly after SCI. In addition, inhibition of SEMA3C-mediated RAGE signaling suggested that the SEMA3C/RAGE axis is a feasible target to preserve axons from neuroinflammation. Taken together, our study provides the first experimental evidence of an immunoregulatory role for SEMA3C in SCI via an autocrine mechanism.
Article
Investigating protein–protein interactions is crucial for understanding cellular biological processes because proteins often function within molecular complexes rather than in isolation. While experimental and computational methods have provided valuable insights into these interactions, they often overlook a critical factor: the crowded cellular environment. This environment significantly impacts protein behavior, including structural stability, diffusion, and ultimately the nature of binding. In this review, we discuss theoretical and computational approaches that allow the modeling of biological systems to guide and complement experiments and can thus significantly advance the investigation, and possibly the predictions, of protein–protein interactions in the crowded environment of cell cytoplasm. We explore topics such as statistical mechanics for lattice simulations, hydrodynamic interactions, diffusion processes in high-viscosity environments, and several methods based on molecular dynamics simulations. By synergistically leveraging methods from biophysics and computational biology, we review the state of the art of computational methods to study the impact of molecular crowding on protein–protein interactions and discuss its potential revolutionizing effects on the characterization of the human interactome.
Chapter
Ca2+/Calmodulin-activated kinase II (CaMKII) as a switchable enzyme with autonomous inhibition is critical to learning and memory, working together with its various regulators, such as Tiam1 (T-cell lymphoma invasion and metastasis 1), a guanine nucleotide exchange factor (GEF). We here propose a model of molecular memory that both CaMKII and Tiam1 would be concurrently switched between the two basic states: autonomously-inhibited versus reciprocally-unlocked, by their multi-domain interactions. It is reported that the kinase domain (KD) of CaMKII interacts with Tiam1 mainly through the carboxyl tail (CT). Based on the documented evidence and our simulation results, we propose that CT could bind the GEF domain DH–PHC thus playing a key role in Tiam1 autoinhibition, which is relieved by CT/KD binding. This implies a duo complex of CaMKII/Tiam1 that consists of two binding pairs: DH–PHC/AID in addition to CT/KD, providing new mechanistic insights for both CaMKII and Tiam1. Taken together, cellular activities would be concurrently memorized by reciprocal interactions into both CaMKII and Tiam1, potentially more robust and reliable, awaiting future experimental explorations.
Article
Deep learning models, such as AlphaFold2 and RosettaFold, enable high-accuracy protein structure prediction. However, large protein complexes are still challenging to predict due to their size and the complexity of interactions between multiple subunits. Here we present CombFold, a combinatorial and hierarchical assembly algorithm for predicting structures of large protein complexes utilizing pairwise interactions between subunits predicted by AlphaFold2. CombFold accurately predicted (TM-score >0.7) 72% of the complexes among the top-10 predictions in two datasets of 60 large, asymmetric assemblies. Moreover, the structural coverage of predicted complexes was 20% higher compared to corresponding Protein Data Bank entries. We applied the method on complexes from Complex Portal with known stoichiometry but without known structure and obtained high-confidence predictions. CombFold supports the integration of distance restraints based on crosslinking mass spectrometry and fast enumeration of possible complex stoichiometries. CombFold’s high accuracy makes it a promising tool for expanding structural coverage beyond monomeric proteins.
Article
A computational method for attempting to predict protein complexes from the coordinates of the individual proteins has been developed. It is based on matching complementary patterns of knobs and holes. The computer algorithm correctly and uniquely predicts the association of the alpha and beta subunits to form the αβ dimer corresponding to the α1β1 interface in the hemoglobin tetramer. It fails to correctly dock trypsin inhibitor onto trypsin. Nevertheless, this lone success is still a significant advance over previous protein-docking algorithms. The method is also important because it introduces several ways to measure the shape of protein surface regions.
Article
An algorithm for solving the protein docking problem is presented. Many tentative dockings are first generated by requiring a hole on the surface of one protein to match a knob on the surface of the other. All the tentative dockings are then applied. The initial configurations thus generated are further optimized. The optimization is facilitated by giving a discrete representation to the protein interior and a double-layer discrete representation to the protein surface. The algorithm presented correctly predicts the association of trypsin with its inhibitor as well as that of the α and β subunits in hemoglobin.
Article
A quantitative function equivalent to the "molecular" surface proposed by F. M. Richards [(1977) Annu. Rev. Biophys. Bioeng. 6, 151-176] is defined by the closest approach of solvent spheres to a macromolecule. The function can be used to visualize surface topography, polarity, and charge either as a three-dimensional net or by mapping onto a plane; to calculate surface areas; and to demarcate complementary sites in contacts between subunits. Applications to shape-specific recognition in protein structure and aggregation are discussed.
Article
The structure of horse methaemoglobin has been redetermined by phase extension and refinement. This has improved our knowledge of the haem geometry and the stereochemistry of the interfaces between the subunits, and confirmed the disorder of the C-terminal residues. Using new four-circle diffractometer data between the limiting spheres of radius 10 and 2.0 Å−1, the co-ordinates determined by Perutz et al. (1968a,b) were subjected to successive cycles of real-space refinement into electron density maps calculated with observed |F| values and phases derived from the latest refined model, until the reliability index had dropped from an initial value of 0.45 to 0.23. The positions of the iron atoms relative to the planes of the porphyrin rings were refined separately, and checked by Fourier syntheses based on anomalous scattering and by difference Fourier syntheses calculated with coefficients from which the iron contributions had been removed. The general root-mean-squared error in atomic positions is 0.32 Å; the probable error in the displacement of the iron atoms from the porphyrin planes is 0.06 Å. The difference Fourier synthesis, obtained after refinement of the protein was complete, showed 41 bound water molecules per asymmetric unit and also revealed five errors in amino acid sequence, one of which was confirmed chemically.
Article
An automatic procedure which generates possible modes of protein-protein association is developed and applied to the bovine pancreatic trypsin inhibitor-trypsin complex as a test case. Using a simplified model in which each residue is replaced by one interaction center, all possible modes of interaction between the inhibitor and the active center of the enzyme are generated systematically. The non-bonded interactions between the molecules and the protein surface area buried in the generated interfaces are evaluated and used as criteria for selecting stable complexes. We show that satisfactory estimates of accessible and buried surface areas can be made using the simplified model. The procedure leads to about nine structures having non-bonded interactions and buried surface areas similar to those of the native complex. This suggests that the major contributions to the free energy of dissociation are taken into account by our selection procedure, though complementarity and specificity are not properly represented in the simplified model. However, it makes it possible to scan a much larger number of configurations than would otherwise be feasible, chiefly through elimination of side-chain detail.
Article
Predicting the structures of protein-protein complexes is a difficult problem owing to the topographical and thermodynamic complexity of these structures. Past efforts in this area have focussed on fitting the interacting proteins together using rigid body searches, usually with the conformations of the proteins as they occur in crystal structure complexes. Here we present work which uses a rigid body docking method to generate the structures of three known protein complexes, using both the bound and unbound conformations of the interacting molecules. In all cases we can regenerate the geometry of the crystal complexes to high accuracy. We also are able to find geometries that do not resemble the crystal structure but nevertheless are surprisingly reasonable both mechanistically and by some simple physical criteria. In contrast to previous work in this area, we find that simple methods for evaluating the complementarity at the protein-protein interface cannot distinguish between the configurations that resemble the crystal structure complex and those that do not. Methods that could not distinguish between such similar and dissimilar configurations include surface area burial, solvation free energy, packing and mechanism-based filtering. Evaluations of the total interaction energy and the electrostatic interaction energy of the complexes were somewhat better. Of the techniques that we tried, energy minimization distinguished most clearly between the "true" and "false" positives, though even here the energy differences were surprisingly small. We found the lowest total interaction energy from amongst all of the putative complexes generated by docking was always within 5 Å root-mean-square of the crystallographic structure. There were, however, several putative complexes that were very dissimilar to the crystallographic structure but had energies that were close to that of the low energy structure.
The magnitude of the error in energy calculations has not been established in macromolecular systems, and thus the reliability of the small differences in energy remains to be determined. The ability of this docking method to regenerate the crystallographic configurations of the interacting proteins using their unbound conformations suggests that it will be a useful tool in predicting the structures of unsolved complexes.
Article
Molecular recognition is achieved through the complementarity of molecular surface structures and energetics with, most commonly, associated minor conformational changes. This complementarity can take many forms: charge-charge interaction, hydrogen bonding, van der Waals' interaction, and the size and shape of surfaces. We describe a method that exploits these features to predict the sites of interactions between two cognate molecules given their three-dimensional structures. We have developed a “cube representation” of molecular surface and volume which enables us not only to design a simple algorithm for a six-dimensional search but also to allow implicitly the effects of the conformational changes caused by complex formation. The present molecular docking procedure may be divided into two stages. The first is the selection of a population of complexes by geometric “soft docking”, in which surface structures of two interacting molecules are matched with each other, allowing minor conformational changes implicitly, on the basis of complementarity in size and shape, close packing, and the absence of steric hindrance. The second is a screening process to identify a subpopulation with many favorable energetic interactions between the buried surface areas. Once the size of the subpopulation is small, one may further screen to find the correct complex based on other criteria or constraints obtained from biochemical, genetic, and theoretical studies, including visual inspection.
Article
An optimized method based on the principle of simulated annealing is presented for determining the relative position and orientation of interacting molecules. The spatial relationships of these molecules are described by intermolecular distance constraints between specific pairs of atoms, such as found in hydrogen bonds or from experimentally determined data. The method makes use of a random walk through six rotational and translational degrees of freedom where the constituent molecules are treated as rigid bodies. Van der Waals repulsions are used only to define a lower bound on distances between constrained atom pairs within the docking procedure. A cost function comprised of purely geometric constraints is optimized via simulated annealing, in order to search for the best orientation and position of the two molecules. Our docking procedure is applied to eight serine proteinase complexes from the Brookhaven Protein Data Bank. For each simulation 100 computations were performed. A typical docking computation requires only a few seconds of CPU time on a VAXserver 3500. The influence of the number of constraints on the final docked positions was studied. The sensitivity of the docking procedure to a ligand structure which is not well defined is also addressed. Possible applications of this method include using approximate distances incorporating complete energy functions.
Article
The Metropolis technique of conformation searching is combined with rapid energy evaluation using molecular affinity potentials to give an efficient procedure for docking substrates to macromolecules of known structure. The procedure works well on a number of crystallographic test systems, functionally reproducing the observed binding modes of several substrates.