Molecular Surface Recognition: Determination of Geometric Fit Between Proteins and Their Ligands by Correlation Techniques

April 1992
Proceedings of the National Academy of Sciences 89(6):2195-9

DOI:10.1073/pnas.89.6.2195

Source
PubMed

Authors:

Isaac Shariv

Weizmann Institute of Science

Miriam Eisenstein

Weizmann Institute of Science

A geometric recognition algorithm was developed to identify molecular surface complementarity. It is based on a purely geometric approach and takes advantage of techniques applied in the field of pattern recognition. The algorithm involves an automated procedure including (i) a digital representation of the molecules (derived from atomic coordinates) by three-dimensional discrete functions that distinguishes between the surface and the interior; (ii) the calculation, using Fourier transformation, of a correlation function that assesses the degree of molecular surface overlap and penetration upon relative shifts of the molecules in three dimensions; and (iii) a scan of the relative orientations of the molecules in three dimensions. The algorithm provides a list of correlation values indicating the extent of geometric match between the surfaces of the molecules; each of these values is associated with six numbers describing the relative position (translation and rotation) of the molecules. The procedure is thus equivalent to a six-dimensional search but much faster by design, and the computation time is only moderately dependent on molecular size. The procedure was tested and validated by using five known complexes for which the correct relative position of the molecules in the respective adducts was successfully predicted. The molecular pairs were deoxyhemoglobin and methemoglobin, tRNA synthetase-tyrosinyl adenylate, aspartic proteinase-peptide inhibitor, and trypsin-trypsin inhibitor. A more realistic test was performed with the last two pairs by using the structures of uncomplexed aspartic proteinase and trypsin inhibitor, respectively. The results are indicative of the extent of conformational changes in the molecules tolerated by the algorithm.

FIG. 3. Cross section (at $\alpha = 0$) through a 3D correlation function $c_{\alpha,\beta,\gamma}$. The correlation function shown was calculated for the α and β subunits of hemoglobin, oriented as in the dimer (from 2HHB, see Figs. 1 c and d). The correlation value at each shift vector $\{0,\beta,\gamma\}$ is represented by the height of the graph. The prominent peak at $\{\alpha = 0, \beta = 14, \gamma = 17\}$ corresponds to the correct match between the molecules (see Fig. 2d). Other intermolecular surface contacts (such as in Fig. 2b) give rise to the low positive correlation values around the center of the graph. The negative correlation values caused by penetration (see Fig. 2c) are omitted, leaving the empty area at the center.

…

FIG. 4. Correlation results for different pairs of molecules. The pairs are identified by their respective codes (see text). In each panel, the histogram on the left shows the 10 highest correlation peaks obtained in the scan stage ($\eta$ = 1.0-1.2 Å), sorted by their score. Each of these peaks was obtained at a different relative orientation of the molecules and corresponds to a potential geometric match. The shaded peak in each histogram corresponds to the known complex between the molecules considered. The histogram on the right side of each panel shows the scores obtained at the discrimination stage ($\eta$ = 0.7-0.8 Å), for the 10 orientations singled out in the scan stage. Note that in the discrimination stage the spurious peaks (plain) are suppressed, whereas the correct peak (shaded) becomes prominent.

…

FIG. 2. Different relative positions of molecules a and b, illustrated by the cross sections $\bar{a}_{l=46,m,n}$ and $\bar{b}_{l=46,m,n}$ from Fig. 1. The relative orientation of the molecules is as in the known α-β dimer. (a) No contact. (b) Limited contact. (c) Penetration. The penetrated area is represented in black. (d) Good geometric match, as indicated by the extensive overlap of complementary surface layers.

…


Proc. Natl. Acad. Sci. USA
Vol. 89, pp. 2195-2199, March 1992
Biophysics

(protein-protein interaction/surface complementarity/macromolecular complex prediction/molecular docking)

EPHRAIM KATCHALSKI-KATZIR†‡, ISAAC SHARIV§, MIRIAM EISENSTEIN¶, ASHER FRIESEM§, CLAUDE AFLALO‖, AND ILYA VAKSER†

Departments of †Membrane Research and Biophysics, §Electronics, ¶Structural Biology, and ‖Biochemistry, Weizmann Institute of Science, Rehovot 76100, Israel

Contributed by Ephraim Katchalski-Katzir, October 24, 1991


The association of proteins with their ligands involves intricate inter- and intramolecular interactions, solvation effects, and conformational changes. In view of such complexity, a comprehensive and efficient approach for predicting the formation of protein-ligand complexes from the structure of their free components is not yet available. However, with some assumptions, such predictions become feasible, and several attempts based on energy minimization have been partially successful (1-6). Another simplifying approach that could alleviate some of these difficulties is based on geometric considerations.

The three-dimensional (3D) structures of most protein complexes reveal a geometric match between those parts of the respective surfaces of the protein and the ligand that are in contact. Indeed, the shape and other physical characteristics of the surfaces largely determine the nature of the specific molecular interactions in the complex. Furthermore, in many cases the structure of the components in the complex closely resembles that of the molecules in their free, native state. Geometric matching thus seems to play an important role in determining the structure of a complex.

Several investigators have exploited a geometric approach to find shape complementarity between a given protein and its ligand (7-19). They considered the geometric match between molecular surfaces to be a fundamental condition for the formation of a specific complex and pointed out the advantages of the geometric approach (13). In this approach, which treats proteins as rigid bodies, the complementarity between surfaces is estimated. Furthermore, the geometric analysis could serve as the foundation for a more complete approach including energy considerations. However, the methods heretofore developed for analyzing geometric matching do not seem to simultaneously fulfill the requirements for generality, accuracy, reliability, and reasonable computation time.

In this paper, we present a geometry-based algorithm for predicting the structure of a possible complex between molecules of known structures. This relatively simple and straightforward algorithm relies on the well-established correlation and Fourier transformation techniques used in the field of pattern recognition. The algorithm requires only that the structure of the molecules under consideration be known. Moreover, it provides quantitative data on the quality of the contact between the molecules. The algorithm was tested and validated by the analysis of the following complexes, whose structures are known: the α-β hemoglobin dimer, tRNA synthetase-tyrosinyl adenylate, aspartic proteinase-peptide inhibitor, and trypsin-trypsin inhibitor. The correct relative positions of the molecules within these complexes were successfully predicted.

Abbreviations: 3D, three dimensional; DFT, discrete Fourier transform; IFT, inverse Fourier transform.
‡To whom reprint requests should be addressed.

METHOD

Geometric Recognition Algorithm. We begin with a geometric description of the protein and the ligand molecules, derived from their known atomic coordinates. The two molecules, denoted a and b, are projected onto a three-dimensional grid of $N \times N \times N$ points, where they are represented by the discrete functions

$$a_{l,m,n} = \begin{cases} 1 & \text{inside the molecule} \\ 0 & \text{outside the molecule,} \end{cases} \qquad [1a]$$

and

$$b_{l,m,n} = \begin{cases} 1 & \text{inside the molecule} \\ 0 & \text{outside the molecule,} \end{cases} \qquad [1b]$$

where $l$, $m$, and $n$ are the indices of the 3D grid ($l, m, n = \{1, \ldots, N\}$). Any grid point is considered inside the molecule if there is at least one atom nucleus within a distance $r$ from it, where $r$ is of the order of van der Waals atomic radii. Examples of two-dimensional cross sections of these functions are presented in Fig. 1 a and b.
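The projection of Eq. 1 can be illustrated with a minimal Python sketch. It is not the authors' FORTRAN code: the function name `project_to_grid`, the 0-based indexing, and the convention that grid index `l` sits at coordinate `l * eta` are assumptions made here for illustration.

```python
import math

def project_to_grid(atoms, n, eta, r=1.8):
    """Return an n x n x n nested list holding 1 where at least one atom
    centre lies within distance r of the grid point, and 0 elsewhere (Eq. 1).

    atoms : iterable of (x, y, z) coordinates in angstroms
    eta   : grid step size in angstroms (hypothetical layout: index * eta)
    """
    grid = [[[0] * n for _ in range(n)] for _ in range(n)]
    reach = math.ceil(r / eta)  # how many grid steps an atom can influence
    for x, y, z in atoms:
        # grid index nearest to the atom centre
        cl, cm, cn = round(x / eta), round(y / eta), round(z / eta)
        for l in range(max(0, cl - reach), min(n, cl + reach + 1)):
            for m in range(max(0, cm - reach), min(n, cm + reach + 1)):
                for k in range(max(0, cn - reach), min(n, cn + reach + 1)):
                    d2 = (l * eta - x) ** 2 + (m * eta - y) ** 2 + (k * eta - z) ** 2
                    if d2 <= r * r:
                        grid[l][m][k] = 1
    return grid
```

With a single atom on a 1 Å grid and r = 1.8 Å, the occupied region is the 3 × 3 × 3 block of grid points around the atom.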

Next, to distinguish between the surface and the interior of each molecule, we retain the value of 1 for the grid points along a thin surface layer only and assign other values to the internal grid points. The resulting functions thus become

$$\bar{a}_{l,m,n} = \begin{cases} 1 & \text{on the surface of the molecule} \\ \rho & \text{inside the molecule} \\ 0 & \text{outside the molecule,} \end{cases} \qquad [2a]$$

and

$$\bar{b}_{l,m,n} = \begin{cases} 1 & \text{on the surface of the molecule} \\ \delta & \text{inside the molecule} \\ 0 & \text{outside the molecule,} \end{cases} \qquad [2b]$$

where the surface is defined here as a boundary layer of finite width between the inside and the outside of the molecule. The parameters $\rho$ and $\delta$ describe the value of the points inside the molecules, and all points outside are set to zero. Two-dimensional cross sections of these functions are shown in Figs. 1 c and d.
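One way to obtain the surface and interior labels of Eq. 2 from a 0/1 grid is to peel boundary layers off the occupied region. The sketch below is an assumption about how this could be done, not the procedure used in the paper: the name `mark_surface`, the six-neighbour boundary test, and the `layers` parameter are all hypothetical.

```python
# Six axis-aligned neighbour offsets (an assumed definition of adjacency).
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def mark_surface(grid, interior_value, layers=1):
    """Relabel a 0/1 occupancy grid as in Eq. 2: 1.0 on the surface layer,
    interior_value (rho or delta) inside, 0.0 outside.

    layers is the surface thickness in grid steps; each pass peels off the
    occupied points that still touch an unoccupied or already peeled point.
    """
    n = len(grid)
    inside = {(l, m, k) for l in range(n) for m in range(n) for k in range(n)
              if grid[l][m][k]}
    core = set(inside)
    for _ in range(layers):
        boundary = {p for p in core
                    if any((p[0] + d[0], p[1] + d[1], p[2] + d[2]) not in core
                           for d in NEIGHBOURS)}
        core -= boundary
    out = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for l, m, k in inside:
        out[l][m][k] = interior_value if (l, m, k) in core else 1.0
    return out
```

For a solid 5 × 5 × 5 block, one layer leaves a 3 × 3 × 3 interior; two layers leave only the central point, which shows how a thicker surface layer shrinks the penalised interior.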

FIG. 1. Typical cross sections through the grid representations of the molecules. (a) Cross section (at $l = 46$) through the function $a_{l,m,n}$, derived by projecting the α subunit of hemoglobin (from 2HHB; see text) onto a grid ($N = 90$). The values 0 and 1 are represented by white and black, respectively. (b) The cross section $b_{l=46,m,n}$ was similarly derived for the β subunit (from 2HHB). Other details are as in a. (c) The cross section (at $l = 46$) through the function $\bar{a}_{l,m,n}$, which was obtained by distinguishing the surface layer from the interior of the molecule in the function $a_{l,m,n}$. The large negative value for $\rho$ is represented by gray. (d) Cross section $\bar{b}_{l=46,m,n}$, similarly derived from $b_{l,m,n}$. The small positive value for $\delta$ is represented by a different shade of gray. The values for $r$ and $\eta$ were 1.8 and 1.2 Å, respectively.

In our method, matching of surfaces is accomplished by calculating correlation functions. The correlation between the discrete functions $\bar{a}$ and $\bar{b}$ is defined as

$$c_{\alpha,\beta,\gamma} = \sum_{l=1}^{N} \sum_{m=1}^{N} \sum_{n=1}^{N} \bar{a}_{l,m,n} \cdot \bar{b}_{l+\alpha,m+\beta,n+\gamma}, \qquad [3]$$

where $\alpha$, $\beta$, and $\gamma$ are the number of grid steps by which molecule b is shifted with respect to molecule a in each dimension. If the shift vector $\{\alpha,\beta,\gamma\}$ is such that there is no contact between the two molecules (see Fig. 2a), the correlation value is zero. If there is a contact between the surfaces (Fig. 2b), the contribution to the correlation value is positive.

Nonzero correlation values could also be obtained when one molecule penetrates into the other (Fig. 2c). Since such penetration is physically forbidden, a distinction between surface contact and penetration must be clearly formulated. To do so, we assign large negative values to $\rho$ and small nonnegative values to $\delta$. Thus, when the shift vector $\{\alpha,\beta,\gamma\}$ is such that molecule b penetrates molecule a, the multiplication of the negative numbers ($\rho$) in $\bar{a}$ by the positive numbers in $\bar{b}$ results in a negative contribution to the overall correlation value. Consequently, the correlation value for each displacement is simply the score for overlapping surfaces corrected by the penalty for penetration. Positive correlation values are obtained when the contribution from surface contact outweighs that from penetration. Thus, a good geometric match (such as in Fig. 2d) is represented by a high positive peak, and low values reflect a poor match between the molecules.
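The sign behaviour of Eq. 3 can be checked on a toy case. The sketch below evaluates the sum directly over nonzero grid points only, with cyclic (wrap-around) shifts; the cyclic convention, 0-based indices, and the name `correlate_direct` are assumptions made for illustration.

```python
def correlate_direct(a, b):
    """Direct evaluation of Eq. 3 with cyclic shifts.

    Each pair of nonzero points (one in a, one in b) contributes its product
    to the shift that brings the point of b onto the point of a.
    """
    n = len(a)
    idx = [(l, m, k) for l in range(n) for m in range(n) for k in range(n)]
    pts_a = [(p, a[p[0]][p[1]][p[2]]) for p in idx if a[p[0]][p[1]][p[2]] != 0]
    pts_b = [(p, b[p[0]][p[1]][p[2]]) for p in idx if b[p[0]][p[1]][p[2]] != 0]
    c = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for (la, ma, ka), va in pts_a:
        for (lb, mb, kb), vb in pts_b:
            c[(lb - la) % n][(mb - ma) % n][(kb - ka) % n] += va * vb
    return c

def line_molecule(values, n):
    """Hypothetical helper: place values along the first axis of an empty grid."""
    g = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for l, v in enumerate(values):
        g[l][0][0] = v
    return g

n = 8
a_bar = line_molecule([1.0, -15.0, 1.0], n)  # surface, interior (rho), surface
b_bar = line_molecule([1.0, 1.0, 1.0], n)    # surface, interior (delta = 1), surface
c = correlate_direct(a_bar, b_bar)
```

In this toy case the shift of two grid steps overlaps one surface point of each molecule and scores +1, a shift of four steps gives no contact and scores 0, and zero shift overlaps the interior of a and scores 1 - 15 + 1 = -13.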

A cross section of a typical correlation function for a good match is presented in Fig. 3. The coordinates of the prominent peak denote the relative shift of molecule b yielding a good match with molecule a. The location of the recognition sites on the surface of each molecule can be readily determined from these coordinates. In addition, the width of the peak provides a measure for the relative displacement allowed before matching is lost.

A direct calculation of the correlation between the two functions (see Eq. 3) is rather lengthy, since it involves $N^3$ multiplications and additions for each of the $N^3$ possible relative shifts $\{\alpha,\beta,\gamma\}$, resulting in an order of $N^6$ computing steps. Therefore, we chose to take advantage of Fourier transformation that allowed us to calculate the correlation function much more rapidly. The discrete Fourier transform (20) (DFT) of a function $x_{l,m,n}$ is defined as

$$X_{o,p,q} = \sum_{l=1}^{N} \sum_{m=1}^{N} \sum_{n=1}^{N} \exp[-2\pi i(ol + pm + qn)/N] \cdot x_{l,m,n}, \qquad [4]$$

where $o, p, q = \{1, \ldots, N\}$ and $i = \sqrt{-1}$. The application of this transformation to both sides of Eq. 3 yields (21)

$$C_{o,p,q} = A^{*}_{o,p,q} \cdot B_{o,p,q}, \qquad [5]$$

where $C$ and $B$ are the DFT of the functions $c$ and $\bar{b}$, respectively, and $A^{*}$ is the complex conjugate of the DFT of $\bar{a}$. Eq. 5 indicates that the transformed correlation function $C$ is obtained by a simple multiplication of the two functions $A^{*}$ and $B$. The inverse Fourier transform (20) (IFT), defined as

$$c_{\alpha,\beta,\gamma} = \frac{1}{N^3} \sum_{o=1}^{N} \sum_{p=1}^{N} \sum_{q=1}^{N} \exp[2\pi i(o\alpha + p\beta + q\gamma)/N] \cdot C_{o,p,q}, \qquad [6]$$

is used to obtain the desired correlation between the two original functions $\bar{a}$ and $\bar{b}$. The Fourier transformations can be performed with the fast Fourier transform algorithm (20), which requires less than the order of $N^3 \ln(N^3)$ steps for transforming a 3D function of $N \times N \times N$ values. Thus, the overall procedure leading to Eq. 6 is significantly faster than the direct calculation according to Eq. 3.
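The equivalence between Eq. 3 and the route through Eqs. 4-6 can be checked numerically on a small grid. The sketch below uses a plain (slow) DFT from the Python standard library rather than a fast Fourier transform, 0-based indices, and cyclic shifts; all function names are illustrative assumptions.

```python
import cmath

def dft1(seq, sign):
    """Unnormalised 1D DFT; sign=-1 is the forward transform, +1 the inverse."""
    n = len(seq)
    return [sum(seq[t] * cmath.exp(sign * 2j * cmath.pi * o * t / n) for t in range(n))
            for o in range(n)]

def dft3(x, sign=-1):
    """3D DFT (Eq. 4) applied one axis at a time."""
    n = len(x)
    y = [[dft1(row, sign) for row in plane] for plane in x]        # last axis
    for l in range(n):                                             # middle axis
        cols = [dft1([y[l][m][k] for m in range(n)], sign) for k in range(n)]
        for m in range(n):
            for k in range(n):
                y[l][m][k] = cols[k][m]
    for m in range(n):                                             # first axis
        for k in range(n):
            col = dft1([y[l][m][k] for l in range(n)], sign)
            for l in range(n):
                y[l][m][k] = col[l]
    return y

def correlate_fourier(a, b):
    """c = IFT(A* . B), following Eqs. 5 and 6."""
    n = len(a)
    A, B = dft3(a), dft3(b)
    C = [[[A[o][p][q].conjugate() * B[o][p][q] for q in range(n)]
          for p in range(n)] for o in range(n)]
    c = dft3(C, sign=+1)
    return [[[c[i][j][k].real / n ** 3 for k in range(n)]
             for j in range(n)] for i in range(n)]

def correlate_direct(a, b):
    """Eq. 3 evaluated term by term with cyclic shifts."""
    n = len(a)
    r = range(n)
    return [[[sum(a[l][m][k] * b[(l + al) % n][(m + be) % n][(k + ga) % n]
                  for l in r for m in r for k in r)
              for ga in r] for be in r] for al in r]
```

For the scan-stage grid of N = 90, the direct sum needs on the order of N^6 ≈ 5.3 × 10^11 steps, whereas N^3 ln(N^3) ≈ 9.8 × 10^6, which is the saving the text refers to.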

Finally, to complete a general search for a match between the surfaces of molecules a and b, the correlation function $c$ has to be calculated for all relative orientations of the molecules. In practice, molecule a is fixed, whereas the three Euler angles defining the orientation of molecule b (xyz convention in ref. 22) are varied at fixed intervals of $\Delta$ degrees. This results in a complete scan of $360 \times 360 \times 180/\Delta^3$ orientations for which the correlation function $c$ must be calculated.
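The size of the orientation scan follows directly from the angular step. The short sketch below enumerates a regular grid of Euler angles; which of the three angles spans 180° rather than 360° is an assumption here, as are the function names.

```python
def euler_grid(delta):
    """Yield (phi, theta, psi) in degrees on a regular grid with step delta.

    Assumed ranges: phi and psi cover 360 degrees, theta covers 180 degrees.
    """
    for phi in range(0, 360, delta):
        for theta in range(0, 180, delta):
            for psi in range(0, 360, delta):
                yield (phi, theta, psi)

def count_orientations(delta):
    """Number of orientations, 360 x 360 x 180 / delta**3, for a step that divides 180."""
    if 180 % delta != 0:
        raise ValueError("delta must divide 180")
    return (360 // delta) * (360 // delta) * (180 // delta)
```

For a 20° step this gives 18 × 18 × 9 = 2916 orientations.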

The entire procedure described above can be summarized by the following steps:

(i) derive $\bar{a}$ from the atomic coordinates of molecule a (Eq. 2),
(ii) $A^{*} = [\mathrm{DFT}(\bar{a})]^{*}$ (Eq. 4),
(iii) derive $\bar{b}$ from the atomic coordinates of molecule b (Eq. 2),
(iv) $B = \mathrm{DFT}(\bar{b})$ (Eq. 4),
(v) $C = A^{*} \cdot B$ (Eq. 5),
(vi) $c = \mathrm{IFT}(C)$ (Eq. 6),
(vii) look for a sharp positive peak of $c$,
(viii) rotate molecule b to a new orientation,
(ix) repeat steps iii-viii and end when the orientations scan is completed, and
(x) sort all of the peaks by their height.

Each high and sharp peak found by this procedure indicates a geometric match and thus represents a potential complex. The relative position and orientation of the molecules within each such complex can readily be derived from the coordinates of the correlation peak, and from the three Euler angles at which the peak was found.
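The steps above can be sketched end to end in a few dozen lines. This is an illustrative toy, not the published program: it replaces the Fourier route of steps ii and iv-vi by a direct sum over nonzero grid points, uses a one-step surface layer, wraps coordinates cyclically, and assumes a z-y-x rotation order for the Euler angles. All names are hypothetical.

```python
import math
from itertools import product

STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def rotate(p, phi, theta, psi):
    """Rotate point p about z, then y, then x (assumed convention; radians)."""
    x, y, z = p
    c, s = math.cos(phi), math.sin(phi)
    x, y = c * x - s * y, s * x + c * y
    c, s = math.cos(theta), math.sin(theta)
    x, z = c * x + s * z, -s * x + c * z
    c, s = math.cos(psi), math.sin(psi)
    y, z = c * y - s * z, s * y + c * z
    return (x, y, z)

def digitize(atoms, n, eta, r, interior):
    """Sparse form of Eq. 2: {grid index: 1.0 on the surface, interior inside}."""
    reach = math.ceil(r / eta)
    inside = set()
    for atom in atoms:
        base = tuple(round(v / eta) for v in atom)
        for d in product(range(-reach, reach + 1), repeat=3):
            g = tuple(b + o for b, o in zip(base, d))
            if sum((gi * eta - ai) ** 2 for gi, ai in zip(g, atom)) <= r * r:
                inside.add(tuple(gi % n for gi in g))
    def on_surface(g):
        return any(tuple((gi + o) % n for gi, o in zip(g, d)) not in inside
                   for d in STEPS)
    return {g: (1.0 if on_surface(g) else interior) for g in inside}

def best_peak(a, b, n):
    """Highest value of Eq. 3 over all shifts that produce any overlap."""
    c = {}
    for (ga, va), (gb, vb) in product(a.items(), b.items()):
        shift = tuple((j - i) % n for i, j in zip(ga, gb))
        c[shift] = c.get(shift, 0.0) + va * vb
    shift, score = max(c.items(), key=lambda kv: kv[1])
    return score, shift

def scan(atoms_a, atoms_b, n, eta, delta, r=1.8, rho=-15.0, dlt=1.0):
    a = digitize(atoms_a, n, eta, r, rho)                      # step i
    peaks = []
    for angles in product(range(0, 360, delta), range(0, 180, delta),
                          range(0, 360, delta)):
        rad = [math.radians(v) for v in angles]
        rotated = [rotate(p, *rad) for p in atoms_b]           # step viii
        b = digitize(rotated, n, eta, r, dlt)                  # step iii
        score, shift = best_peak(a, b, n)                      # steps iv-vii, direct sum
        peaks.append((score, shift, angles))
    peaks.sort(key=lambda t: t[0], reverse=True)               # step x
    return peaks
```

A call such as `scan([(0, 0, 0), (1.5, 0, 0)], [(0, 0, 0)], n=16, eta=1.0, delta=90)` returns one `(score, shift, angles)` entry per orientation, highest score first; realistic molecules would need the FFT route and a much finer angular step.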

Implementation of the Algorithm. To implement our algorithm, it is necessary to assign specific values to the various parameters involved, including the surface layer thickness and the grid step size, denoted by $\eta$. The choice of these values is based on a number of considerations, outlined in this section.

We begin by noting that the match between the functions $\bar{a}$ and $\bar{b}$ is not perfect. One reason is that the structure of known complexes reveals small gaps between the molecules, which are also reflected in their mathematical representation. Furthermore, the functions $\bar{a}$ and $\bar{b}$ are derived from atomic coordinate sets that do not include hydrogen atoms. This, in addition to the limited accuracy of the coordinates, may affect the quality of the match. Finally, minor conformational changes may occur at the surface of molecules upon complex formation (locally induced fit). Such changes are not incorporated in the functions $\bar{a}$ and $\bar{b}$ when they represent native molecules that are assumed to be rigid. Therefore, penetration and small gaps occur along the contact area. To ensure that the correct match between molecules is not missed, our algorithm must be able to tolerate these imperfections. This is achieved by assigning more than one layer of grid points to the surface so that the surface thickness for molecule a is 1.5-2.5 Å (see Fig. 1c). Consequently, penetrations and gaps that are smaller than these values are tolerated. It should be noted that an inherent drawback in the choice of a thicker surface layer is the concomitant increase in the number of faulty matches.

The thickness of the surface layer also influences the angular tolerance. This tolerance is defined as the maximal deviation from the correct match orientation that would still result in a distinct correlation peak. Typically, this surface layer thickness yielded an angular tolerance of about 10°. Thus, the angular step $\Delta$ was set to 20°, resulting in 2916 different orientations of molecule b, for each of which the correlation function had to be evaluated.

The parameter $r$, used to derive the functions $a_{l,m,n}$ and $b_{l,m,n}$ (see Eq. 1), was set to 1.8 Å, which is larger by about 0.2 Å than the average van der Waals radius for carbon, nitrogen, and oxygen. This compensated for the fact that hydrogen atoms, missing in the coordinate sets, are not projected on our grids.

The parameters $\rho$ and $\delta$, representing the interior of the molecules, were set to -15 and 1, respectively. This ensures that the correlation value is substantially reduced in case of penetration. Several other choices for $\rho$ and $\delta$ (in the ranges $\rho$ … $-1$ and $\delta$ …) did not significantly affect the performance of the algorithm.

Another important parameter in the algorithm is the grid step size, $\eta$. Optimal results were obtained when $\eta$ was set to 0.7-0.8 Å, corresponding to half the carbon-carbon bond length. Yet, since the product $\eta \cdot N$ should be larger than the size of any potential complex, a finer grid requires a larger number of points $N$. This leads in turn to excessive computation time. Therefore, we performed an initial scan of the angular orientations with larger grid steps ($\eta$ = 1.0-1.2 Å); thus, computations that would take days with the finer grid were performed in hours. However, with such large grid steps, spurious correlation peaks, which may be even higher than the correct peak, appear. Hence, the scan stage was followed by a discrimination stage, in which the correlation functions were recalculated with a finer grid ($\eta$ = 0.7-0.8 Å), but only for those orientations that yielded the highest peaks in the scan stage. This discrimination stage enhanced the correct correlation peak and suppressed spurious peaks.
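The trade-off between grid step and cost, and the two-stage strategy it motivates, can be sketched as follows. The helper names, the `keep=10` default (taken from the 10 peaks shown in Fig. 4), and the use of $N^3 \ln(N^3)$ as a cost proxy are assumptions for illustration only.

```python
import math

def grid_points_needed(extent, eta):
    """Smallest N for which eta * N exceeds the extent of the complex (angstroms)."""
    return math.floor(extent / eta) + 1

def fft_steps(n):
    """Rough cost proxy for one 3D transform: N^3 * ln(N^3)."""
    return n ** 3 * math.log(n ** 3)

def two_stage(orientations, coarse_score, fine_score, keep=10):
    """Rank all orientations with the cheap coarse-grid score, then re-rank
    only the best `keep` of them with the expensive fine-grid score."""
    shortlisted = sorted(orientations, key=coarse_score, reverse=True)[:keep]
    return sorted(shortlisted, key=fine_score, reverse=True)
```

Under this proxy, one transform at N = 128 costs about 3.1 times one at N = 90, so restricting the fine grid to a short list of orientations keeps the added cost small compared with the 2916-orientation scan.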

A FORTRAN program was developed for implementing the algorithm. The parameters of the program, in accordance with the arguments given above, were assigned the following values: $r$ = 1.8 Å, $\Delta$ = 20°, $\rho$ = -15, $\delta$ = 1, $N$ = 90 ($\eta$ = 1.0-1.2 Å) for the scan stage, and $N$ = 128 ($\eta$ = 0.7-0.8 Å) for the discrimination stage. The program was run on a Convex C-220 computer with the Veclib fast Fourier transform subroutine. The computation time for each iteration (steps iii-viii in the summarized algorithm) in the scan stage was … sec. The total computation time for matching two molecules in the range of 1100 atoms each, including both the initial scan and the discrimination stage, was typically 7.5 hr.

RESULTS

Our algorithm was applied to several known complexes, whose coordinates are given in the Brookhaven Protein Data Bank (Brookhaven National Laboratory, Upton, NY), to test its ability to predict correct structures of protein complexes. We chose complexes that represent a wide variety of relative sizes for molecules a and b (30-2500 atoms). These are two hemoglobin variants: human deoxyhemoglobin (23) (designated 2HHB) and horse methemoglobin (24) (designated 2MHB), representing naturally occurring heterodimers; and three complexes: tRNA synthetase-tyrosinyl adenylate (25) (designated 3TS1), aspartic proteinase-peptide inhibitor (26) (designated 3APR), and trypsin-trypsin inhibitor (27) (designated 2PTC). In these tests, the component molecules were treated as separate entities by using their respective atomic coordinates within the complex. Additional tests were performed with native aspartic proteinase (28) and its peptide inhibitor (designated 2APR) and with trypsin and native trypsin inhibitor (29) (designated 4PTI). The relative position of the molecules yielding the best geometric fit in a complex, as determined by the algorithm, was finally compared with the corresponding known complex.

The results are summarized in Fig. 4, which shows histograms of correlation peaks for each pair of molecules. The left side of each panel presents the 10 highest peaks obtained in the scan stage, whereas the right side shows the peaks reevaluated for the same orientations in the discrimination stage. As is evident from the figure, the correlation peak for the known complex (shaded) is not necessarily the highest in the scan stage. However, the highest peak that was obtained after discrimination represents the right orientation and position of molecule b with respect to a and is significantly higher than the other peaks.

Application of the algorithm to the α and β subunits of human hemoglobin (2HHB in Fig. 4a) revealed that the highest peak in the scan stage (score of 312) corresponds to the well-known α-β dimer. In the horse methemoglobin variant, however (2MHB in Fig. 4b), the correct position for the dimer is represented by the third peak (score of 290) in the sorted histogram for the scan stage. Nevertheless, both of these peaks became predominant in the discrimination stage (scores of 302 and 347 for 2HHB and 2MHB, respectively).

The hemoglobin molecules contain two α-β dimers symmetrically arranged so that each α subunit is in contact with two β subunits. The algorithm should thus yield, in principle, two major correlation peaks for the interaction between α and β subunits. The first, mentioned above, corresponds to the tight contact between the subunits in the α-β dimer, and the other corresponds to the looser contact between the α subunit of one dimer and the β subunit of the other. This second expected peak (not shown) was rather low (scores of 190 and 178 for 2HHB and 2MHB, respectively) and was not included among the 10 highest peaks in the scan stage. However, it was enhanced upon recalculation with the finer grid (scores of 260 and 185, respectively), in contrast with the spurious peaks, which were all reduced. The relation between the extent of geometric fit in these two associations may reflect the well-known higher stability for the interdimer association.

Next, we applied the algorithm to the tRNA synthetase-tyrosinyl adenylate pair (3TS1 in Fig. 4c), which served as an example for a complex between a high molecular weight protein and a small ligand. In this case the correlation peak that corresponds to the correct position of the ligand in the complex was not the highest one in the scan stage. However, discrimination yielded the expected result, i.e., the correct orientation was associated with a peak distinctly higher than the other peaks.

Further assessment of the procedure was carried out by analyzing the complex between aspartic proteinase and its peptide inhibitor (3APR in Fig. 4). This system illustrates a case in which the structure of the protein in the complex closely resembles that of the native protein (26, 28). It is thus possible to look for the best match between the structure of the complexed peptide and the protein, either in its complexed (3APR) or native (2APR) structure. With the complexed protein, the correct relative position of the ligand yielded the highest peak already in the scan stage (Fig. 4d), whereas with the native protein, the peak describing the correct position was only the fourth in the sorted list (Fig. 4e). However, the hierarchy of the peaks changed markedly in the discrimination stage, where the highest correlation peak indicated a structure closely resembling that of the known complex. When the native protein is used, the correlation peaks in both stages are somewhat lower than the corresponding ones for the protein in the complex, indicating a slightly poorer fit.

Analysis of the complex trypsin-trypsin inhibitor (2PTC in Fig. 4) was chosen because the native structure of one of the components, the inhibitor, differs from that in the complex. Specifically, conformational changes involving the side chains of three amino acids, located in the binding site of the inhibitor, occur upon complex formation (27, 29). When the structure of the inhibitor in the complex was used (Fig. 4f), the highest peak after discrimination corresponded to the correct position of the inhibitor in the complex. However, when the native structure of the inhibitor (4PTI) was used (Fig. 4g), the algorithm did not yield a distinct correlation peak in either the scan stage or the discrimination stage. This result indicates that the extent of the conformational change occurring at the surface of the inhibitor upon binding to trypsin exceeds that tolerated by the algorithm.

CONCLUSION

Our geometry-based algorithm predicts the structure of complexes formed between the two constituent molecules by using their atomic coordinates, without any prior information as to their binding sites. The molecular surfaces need not undergo any transformation except a simple digitization; thus, all of the surface geometric features are fully preserved within the accuracy of the grid step size. The values chosen for the parameters of the algorithm are general and do not have to be readjusted for each molecular pair.

Our algorithm exploits Fourier transformation and correlation techniques, so that all possible associations between the molecules are evaluated much more rapidly than in the equivalent exhaustive search in six dimensions. Another important feature of the algorithm is that the computation time is approximately proportional to $k \ln(k)$, where $k$ is the number of atoms in the complex. Consequently, the increase in computation time with larger molecules is moderate.

We tested our algorithm on five known complexes, for which the correct structure of the complex was predicted from the atomic coordinates of the component molecules within the complex. A test carried out using the coordinates of native aspartic proteinase (see Fig. 4e) also resulted in the prediction of the correct known complex structure. However, when the algorithm was applied to trypsin and its native inhibitor, no distinct match was found (see Fig. 4g). This is most likely due to the known conformational change in the trypsin inhibitor binding site upon complex formation (27, 29) (see also refs. 18, and 19).

The results of our tests indicate that, as long as the conformational changes are small, the algorithm may be used successfully to predict the structure of hitherto unknown complexes from the structures of the two known components. Further enhancements of the algorithm are presently being developed to introduce some physical features of the molecular interface, such as surface charges and degrees of hydrophobicity.

We thank Steinberg for helpful discussions and Heimrath and Revacha for technical assistance. M.E. acknowledges support from the Kimmelman Center for biomolecular structure and assembly; C.A. and I.A.V. thank the Ministry of Absorption and "Fondation RASCHI" for partial financial support; and I.S. thanks the Ministry of Science and Technology for support.

1. Wodak, Janin, (1978) Mol. Biol. 124, 323-342.
2. Goodford, (1985) Med. Chem. 28, 849-857.
3. Billeter, M., Havel, Kuntz, (1987) Biopolymers 26, 777-793.
4. Warwicker, (1989) Mol. Biol. 206, 381-395.
5. Goodsell, Olson, (1990) Proteins 195-202.
6. Yue, S.-Y. (1990) Protein Eng. 177-184.
7. Greer, Bush, (1978) Proc. Natl. Acad. Sci. USA 75, 303-307.
8. Kuntz, D., Blaney, M., Oatley, J., Langridge, Ferrin, (1982) Mol. Biol. 161, 269-288.
9. Zielenkiewicz, Rabczenko, (1984) Theor. Biol. 111, 17-30.
10. Zielenkiewicz, Rabczenko, (1985) Theor. Biol. 116, 607-612.
11. Fanning, W., Smith, Rose, (1986) Biopolymers 25, 863-883.
12. Novotny, J., Handschumacher, M., Haber, E., Bruccoleri, E., Carlson, B., Fanning, W., Smith, Rose, (1986) Proc. Natl. Acad. Sci. USA 83, 226-230.
13. Connolly, (1986) Biopolymers 25, 1229-1247.
14. DesJarlais, L., Sheridan, P., Seibel, L., Dixon, S., Kuntz, Venkataraghavan, (1988) Med. Chem. 31, 722-729.
15. Chirgadze, Y., Kurochkina, Nikonov, (1989) Protein Eng. 105-110.
16. Lewis, Dean, (1989) Proc. Soc. London Ser. 236, 141-162.
17. Wang, (1991) Comput. Chem. 12, 746-750.
18. Jiang, Kim, (1991) Mol. Biol. 219, 79-102.
19. Schoichet, Kuntz, (1991) Mol. Biol. 221, 327-346.
20. Elliott, Rao, (1982) Fast Transforms: Algorithms, Analyses, Applications (Academic, Orlando, FL), pp. 58-90.
21. Papoulis, (1962) The Fourier Integral and Its Applications (McGraw-Hill, New York), pp. 244-245.
22. Goldstein, (1980) Classical Mechanics (Addison-Wesley, Reading, MA), 608.
23. Fermi, G., Perutz, F., Shaanan, Fourme, (1984) Mol. Biol. 175, 159-174.
24. Ladner, C., Heidner, Perutz, (1977) Mol. Biol. 114, 385-414.
25. Brick, P., Bhat, Blow, (1989) Mol. Biol. 208, 83-98.
26. Suguna, K., Padlan, A., Smith, W., Carlson, Davies, (1987) Proc. Natl. Acad. Sci. USA 84, 7009-7013.
27. Marquart, M., Walter, J., Deisenhofer, J., Bode, Huber, (1983) Acta Crystallogr. Sect. 39, 480-490.
28. Suguna, K., Bott, R., Padlan, A., Subramanian, E., Sheriff, S., Cohen, Davies, (1987) Mol. Biol. 196, 877-900.
29. Wlodawer, A., Deisenhofer, Huber, (1987) Mol. Biol. 193, 145-156.

