The distribution of positively charged residues in bacterial inner membrane proteins correlates with the trans-membrane topology

Authors: Gunnar von Heijne

Abstract

The amino acid distribution in membrane spanning segments and connecting loops in bacterial inner membrane proteins was analysed. The basic residues Arg and Lys are four times less prevalent in periplasmic as compared to cytosolic connecting loops, whereas no comparable effect is observed for the acidic residues Asp and Glu. Also, Pro is shown to be tolerated to a much larger extent in membrane spanning segments with their N-terminus pointing towards the cytosol than in those with the opposite orientation. The significance of these findings with regard to the mechanism of biogenesis of bacterial inner membrane proteins is discussed.
The EMBO Journal vol.5 no.11 pp.3021-3027, 1986

Gunnar von Heijne

Research Group for Theoretical Biophysics, Department of Theoretical Physics, Royal Institute of Technology, S-100 44 Stockholm, Sweden

Communicated by K.Simons

Key words: membrane protein topology/hydropathy plot/structure prediction/inner membrane protein

Introduction

Many membrane proteins such as receptors, pore-forming proteins, ion pumps, nutrient and metabolite transporters, and photosynthetic proteins are absolutely essential for the cell's communication with the outside world. Over the past few years the primary sequences of a large number of membrane proteins have been determined, and the importance of long hydrophobic segments that presumably span the membrane as helices has been recognized (reviewed in von Heijne, 1985). However, the sequence characteristics that determine the trans-membrane topology of the protein still remain largely unknown.

The ability to predict (i) the topology, and (ii) the fully folded structure from the primary sequence is currently limited by our ignorance of the way membrane proteins are inserted into their target membrane. 'Simple' membrane proteins with only one hydrophobic membrane spanning segment generally are made with an N-terminal signal sequence that somehow initiates translocation of the first part of the chain through the membrane. The hydrophobic trans-membrane segment presumably halts this translocation process and anchors the protein to the membrane in its final topology.

The biogenesis of membrane proteins of the 'complex' variety (i.e. proteins with multiple hydrophobic spanning segments) on the other hand can be envisaged as proceeding via two mechanistically very different routes: either by a sequential 'threading' back and forth across the membrane starting from the most N-terminal spanning segment - the topology thus being determined by a succession of 'start' and 'stop' signals (Blobel, 1980) - or by an insertion mechanism where neighbouring hydrophobic segments pair up and penetrate the membrane as 'helical hairpins' (Engelman and Steitz, 1981). In the first model, the topology is essentially determined by the hydrophobic spanning segments alone, whereas the topology in the helical hairpin model is an outcome of kinetic competition between the formation and insertion of all the possible membrane spanning nearest-neighbor pairs in a process where the characteristics of the connecting segments between the hydrophobic stretches should be decisive.

Interestingly, in the recently determined X-ray structure of the photosynthetic reaction centre from Rhodopseudomonas viridis (Deisenhofer et al., 1985) a striking charge asymmetry across the membrane has been observed (Michel et al., 1986) with the periplasmic connecting segments or loops generally being more negative than the cytosolic loops. It will now be shown that such a charge asymmetry seems to be common to all bacterial inner membrane proteins, that it results from a bias in the distribution of positively (Arg and Lys) but not negatively (Asp and Glu) charged residues, and that the distribution of positively charged residues in connecting loops may be used to aid in the prediction of the trans-membrane topology of 'complex' bacterial inner membrane proteins.

Results and Discussion

Arg and Lys (but not Asp and Glu) are four times less prevalent in periplasmic relative to cytosolic loops

As described under Materials and methods, a number of bacterial inner membrane proteins in addition to the Rps. viridis reaction centre complex have been sufficiently well characterized in terms of their trans-membrane topology to serve as a reasonably reliable database. These proteins are listed in Table I, together with their assumed topologies and a specification of the membrane spanning and connecting segments included in the statistical analysis. Amino acid counts were collected for four samples, namely periplasmic and cytosolic loops of length 65 residues or less, and membrane spanning segments with their N-terminus facing the cytosolic and periplasmic side of the membrane (Table II).

A highly significant difference (P < 0.001) in the incidence of positively charged residues was found between the periplasmic and cytosolic loops: fArg+Lys = 4.2% in the periplasmic loops versus 15.8% in the cytosolic connecting segments, almost a 4-fold difference. In a control sample of 72 soluble cytosolic bacterial proteins, fArg+Lys = 12.3%, and in a sample of 45 soluble periplasmic or extracellular bacterial proteins fArg+Lys = 10.0% (see Methods). Pro seems to be enriched in the periplasmic loops (P < 0.001), with fPro = 8.5% versus 3.8%, 4.1% and 4.2% for the cytosolic loops, soluble cytosolic and soluble periplasmic proteins respectively. There is also a marginally significant 1.6-fold reduction (P < 0.025) in negatively charged residues in the periplasmic loops, where fAsp+Glu = 7.2% versus 11.2%, 12.8% and 11.8% for the cytosolic loops, soluble cytosolic, and soluble periplasmic proteins.

Not surprisingly, both spanning segment samples are enriched about 2-fold for hydrophobic residues (fPhe+Ile+Leu+Met+Val = 55%) compared with the loops and soluble proteins; more interestingly, Pro is significantly reduced (P < 0.01) only in those spanning segments that have their N-terminus towards the periplasmic side of the membrane (fPro = 0.9% versus ~4% for the other samples).

© IRL Press Limited, Oxford, England

In an attempt to extend these calculations to a larger set of proteins, a total of 66 bacterial inner membrane proteins were collected from the National Biomedical Research Foundation (NBRF) Protein Sequence Database (Release 7.0) and from the literature, see Materials and methods. In a preliminary step, hydrophobicity analysis was carried out as described under Materials and methods (essentially, each sequence was partitioned into non-overlapping 23-residue segments starting from the most hydrophobic 19-residue segment and working downwards), and the distribution of peak hydrophobicity values was determined for this sample as well as for the sample of soluble cytosolic proteins. The results are presented in Figure 1, where the bimodal

Table I. Bacterial inner membrane proteins. The topology is indicated by showing the number of positive and negative residues in each connecting loop in its proper cytosolic or periplasmic location, starting from the N-terminus. Loops in square brackets are not included in the amino acid statistics since they are longer than 65 residues

Protein    Spanning segments    Topology (cyt/per)    Reference

A. Proteins with well-characterized topology
Reaction
centre
H-subunit
L-subunit
M-subunit
12
-30
30-48
84-
102
113
-
131
177-
195
232
-250
53
-71
111
-
129
148-166
206-224
266-284
Bacteriorhodopsin
Light-harvesting
complex
LH1
LH2
maIF
M13
coat
protein
precursor
lacY
lep
24-42
57
-75
96-114
121
-
139
151
-
169
190-212
219
-237
23
-41
27
-45
17
-35
40-58
73
-91
277
-295
319-337
371
-389
418
-436
486-504
5
-23
46-64
9-27
47
-65
77
-95
103-
121
144-162
168-
186
351
-369
383
-401
4-22
58
-76
[
+30/-36]
-C
N-(+3/-2)
(+3/-2)
(+5/-4)
N-(+2/-6)
(+3/-1)
(+6/-6)
(+3/-2)
(+0/-4)
(+4/-2)
(+3/-6)-C
N-(+4/-
1)
N-(+
11-5)
N-(+4/-2)
(+3/-0)
(+3/-I)
(+3/-4)
(+3/-2)-C
N-(+3/-0)
(+4/-1)-C
N-(+
1/-0)
(+3/-1)
(+5/-3)
N-(+0/-2)
(+1/-2)
Michel
et
al.
(1985)
Michel
et
al.
(1986)
(+1/-1)
(+0/-4)-C
(+2/-4)
Michel
et
al.
(1986)
(+
1/-2)
(+0/-2)-C
N-(+
1/-3)
Dunn
et
al.
(1981)
(+
1/-2)
(+2/-0)
(+0/-3)
(+
1/-1)-C
(+1/-1)-C
(+0/-1)
Drews
(1985)
Drews
(1985)
Froshauer
and
Beckwith
(1984)
[+
18/-21]
(+3/-2)
(+2/-4)
(+1/-4)
(+1/-2)
van
Wezenbeek
et
al.
(1980)
Buchel
et
al.
(1980)
(+0/-0)
(+0/-0)
(+0/-i)
(+2/-2)-C
(+10/-5)
N-(
+0/-0)
Wolfe
et
al.
(1983)
[
+25/-301
-C

Table I cont.

B. Proteins with predicted topology
hisQ
eds
u)iw-C
secY
inaIG
livH
uneB
u11c(
C130
frd
13
kd
protein
pstA
19-37
62
-80
94-
112
190
-208
20-38
51
-69
85-103
117-
135
155-
173
179-
197
228
-246
13
-31
53
-71
23
-41
77-95
122-
140
154-
172
186-204
217
-235
277
-295
319-337
373
-
391
399-417
19-37
92-110
126-144
159-
177
206
-224
263
-281
21
-39
48
-66
71
-89
105-
123
156-
174
205
-223
246-
264
282
-300
46-64
100-118
148-166
220-238
246-264
18
-36
45
-63
79
-97
103-
121
27
-45
62
-80
99-117
35
-
53
88-
106
128-
146
152-
170
201
-219
268-286
(+3l-2)
[+8/-5]
N-(+2/-1)
(+3/-0)
(+5/-1)
(+4l-6)
(+3l-3)
N-(+5/-2)
(+6l-3)
(+1/-2)
(+8/-
1)
(+7l-4)
(+4/-3)-C
N-(+3/-0)
(+4/-0)
(+2/-5)
(+2/-1)-C
(+0/-1)
(+5/-1)
(+61-3)
(+
1/-3)
(+7l-2)
(+5/-3)
(+0/-3)-C
N-(+4/-0)
(+4/-i)
(+1/-1)-C
N-(+2/-2)
(+3/-i)
N-(+8/-2)
(+3l-3)
(+6l-2)
(+3/-1)-C
N-(+0/-
1)
(+0/-4)
(+4/-3)-C
(+
1/-2)
(+1/-2)
(+0/-1)
(+
1/-1)-C
N-(+0/-2)
(+0/-
1)-C
(+4l-3)
(+0/+0)
(+
1/-2)
(+0/-O)
(2+/-2)
(+4l-4)
(+
1/-2)
(+
1/-3)
N-(+0/-1)
(+0/-1)
(+1/-3)
(+
1/-0)
(2+/-3)-C
N-(+
1/-5)
(+1/-4)
(+0/-0)
(+1/-1)
(+
1/-0)
(+2/-2)
(+0/-
1)-C
(+
1/-3)
(+0/-
1)
(+2/-3)
Higgins
et
al.
(1982)
Icho
et
al.
(1985)
Kanazawa
et
al.
(1982)
Cerretti
et
al.
(1983)
Dassa
and
Hofnung
(1985)
Nazos
et
al.
(1986)
Gay
and
Walker
(1981)
Gay
and
Walker
(1981)
Grundstrom
and
Jaurin
(1982)
Surin
et
al.
(1985)

Table II. Amino acid frequencies (percent) in inner membrane proteins and in control samples of soluble cytosolic (s-cyt) and periplasmic (s-per) proteins

           'Known' topology            'Predicted' topology
Residue    cyt    per    Nin    Nout   cyt    per    Nin    Nout   s-cyt   s-per
Ala        11.3   8.7    11.6   11.7   10.3   8.5    11.4   12.4   9.6     9.0
Cys        0.2    0.4    2.4    1.5    0.2    0.2    1.4    1.2    1.0     0.5
Asp        6.3    4.0    0.3    0.9    4.0    4.5    0.3    0.6    5.9     6.8
Glu        4.9    3.2    0.3    0.0    5.0    3.9    0.3    0.1    7.0     5.0
Phe        5.3    6.6    13.9   9.9    4.0    5.5    10.9   9.4    3.6     3.3
Gly        8.5    11.5   8.2    9.6    9.4    9.8    7.8    9.5    7.4     9.5
His        1.4    3.0    1.1    0.6    2.1    2.6    0.4    0.5    2.2     1.7
Ile        4.3    4.9    10.3   9.6    4.5    5.3    12.4   11.8   5.7     4.8
Lys        7.9    1.9    1.1    0.6    8.1    1.7    0.8    0.2    5.8     6.4
Leu        6.9    8.7    16.8   21.9   8.0    10.6   19.0   19.4   9.4     7.0
Met        2.4    2.1    5.0    3.5    3.4    3.2    4.9    3.9    2.5     1.6
Asn        2.8    5.3    0.3    0.9    3.3    5.2    0.5    0.7    3.8     6.0
Pro        3.8    8.5    3.7    0.9    4.1    7.3    3.9    1.4    4.1     4.2
Gln        3.2    4.0    0.3    1.2    3.4    4.6    0.9    1.2    4.2     4.2
Arg        7.9    2.3    0.3    0.0    8.0    2.9    0.3    0.2    6.5     3.6
Ser        5.9    4.9    5.5    4.4    6.0    5.4    5.2    4.3    5.3     6.7
Thr        4.0    5.3    4.2    5.6    4.7    5.3    4.1    5.1    5.0     6.9
Val        6.5    3.6    10.5   9.1    6.1    5.1    11.2   11.2   7.4     6.6
Trp        2.2    5.3    2.6    4.7    2.0    3.9    2.3    3.5    1.0     1.6
Tyr        4.3    5.7    1.8    3.5    3.5    4.4    2.0    3.4    2.6     4.5
Total      494    471    380    342    1206   1056   912    855    45 699  15 258

[Figure 1: percent of peaks plotted against peak hydrophobicity]

Fig. 1. Distribution of hydrophobicity peak heights calculated with a window of 19 residues, see Materials and methods. Open squares: 66 bacterial inner membrane proteins (522 peaks in total); solid squares: 72 bacterial soluble cytosolic proteins (1495 peaks in total). Positive values are more hydrophobic.

peak hydrophobicity distribution for the inner membrane proteins (corresponding to connecting loops and spanning segments) stands out clearly. Due to the poor separation between the two peaks, however, segments with a peak hydrophobicity value in the range 0.8-1.4 cannot be unambiguously assigned to either of the two groups. Thus, from the initial 66 proteins, 10 which had no segment with a peak hydrophobicity in this critical range, together with the 10 well-characterized proteins discussed above, were selected as being reasonably likely to have correctly predicted spanning segments; these are also listed in Table I. For these proteins it was assumed that their trans-membrane topologies are such that a minimum number of positively charged residues are placed in the periplasmic loops, and the amino acid counts were again collected as above (Table II).

Aside from the same 4-fold difference in the frequency of Arg + Lys as observed in the smaller sample, the incidence of Pro is still significantly higher (P < 0.001) in the periplasmic than in the cytosolic loops (fPro = 7.3% versus 4.1%), whereas there is no longer any difference in the frequency of Asp + Glu (8.4% versus 9.0%). Pro is also still significantly reduced in frequency (P < 0.001) only in the spanning segments with the N-terminus facing the periplasm and not in those with the opposite orientation (fPro = 1.4% versus 3.9%). No strong preferences for specific positions within the spanning segments were found for any of the residues. All these observations hold true even when the 10 'predicted' proteins are considered alone.

So far, only mean frequencies for the whole sample have been discussed; however, as shown in Figure 2, all periplasmic loops seem to be similarly reduced in their amount of positively charged residues. The distributions of the number of negatively charged residues, on the other hand, do not differ significantly between periplasmic and cytosolic loops (data not shown).

Long periplasmic loops have normal Arg and Lys frequencies

In the analysis above, only relatively short connecting loops (less than 65 residues long and with a mean length of ~20 residues) have been considered. Some inner membrane proteins have much longer periplasmic loops; thus malF has a periplasmic domain some 185 residues long, and the chemotaxis proteins tar, tap, tsr (Krikos et al., 1983) and trg (Bollinger et al., 1984) have periplasmic domains counting about 165 residues. The overall amino acid compositions of these domains, however, are not significantly different from the samples of soluble cytosolic and periplasmic proteins, with fArg+Lys = 10.4% and fAsp+Glu = 10.8% (data not shown).

The number of positively (but not negatively) charged residues tends to alternate between successive loops

If a local alternation in the number of positively charged residues between successive connecting loops is an important characteristic of bacterial inner membrane proteins, then this should show up in an analysis of the pattern of alternation in subsequences including three, four, five, etc. connecting segments. Thus I have looked for the number of strictly alternating n-tuples (triplets, quadruplets, etc.; see Methods) in the series of numbers that one gets by noting the number of positively charged residues (or negatively charged residues, or the net charge, or the total charge) in successive connecting loops, both for the well-characterized proteins and for proteins with spanning segments predicted from hydrophobicity analysis. An occasional error in the prediction will have no great effect on the results in this case, and I have thus used the whole 66-protein sample and required a peak hydrophobicity > 1.2 for predicting a spanning segment. As a further restriction, I have required that no connecting loop be longer than 65 residues (when a longer loop was predicted, the protein was divided into two independent sequences). The results of this analysis are shown in Figure 3, where the quotient between the number of observed strictly alternating n-tuples and the mean number of such n-tuples in a sample consisting of 10 randomly scrambled copies of each of the original series is plotted against n; again the positively charged residues (followed by the total number of charged residues) stand out as the characteristic giving the largest number of strictly alternating n-tuples for all values of n.

[Figure 2: percent of loops plotted against number of positive residues]

Fig. 2. Distribution of the number of positively charged residues in periplasmic connecting loops (open squares, 54 loops in total) and cytosolic connecting loops (solid squares, 56 loops in total) in the 20 inner membrane proteins listed in Table I.

A 'grammar' for membrane protein topology

Although the number of proteins with reasonably well-defined trans-membrane topology is limited, one can still find examples ranging from the simplest possible case - a single membrane spanning segment bounded by two short exposed regions - to rather complex cases such as malF, which most likely has eight spanning segments and a large periplasmic domain between segments 3 and 4. A close inspection of Table I shows the Rps. capsulata LH1 and LH2 polypeptides behaving as expected if the number of positively charged residues determines the topology: the region with the higher number of Arg + Lys and the higher total charge faces the cytosol. The phage M13 major coat protein precursor and E. coli leader peptidase (lep) are on the next level of complexity with two likely spanning segments (see Materials and methods). Again, the orientation is as expected from the distribution of the positively charged residues; note in particular that the first hydrophobic region in lep probably has its N-terminus facing the periplasm, which correlates with a lack of basic residues in the extra-membraneous N-terminal region. The same situation is observed for the single spanning segment in the Rps. viridis reaction centre H-subunit.

Among the proteins with multiple spanning segments, the Rps. viridis reaction centre L subunit shows a perfect series of alternating numbers of Arg + Lys; the M-subunit, malF, and lacY also conform to the rule with the qualification that a couple of neighbouring connecting loops have the same positive charge; only bacteriorhodopsin shows one 'inversion' breaking the strict alternation. It thus appears that the topology of the bacterial inner membrane proteins of the 'complex' kind can be generated by applying the rules found to hold for the simple one- and two-spanning segment proteins.

[Figure 3: quotient (observed/scrambled) plotted against n-tuple length n]

Fig. 3. Quotient between the number of strictly alternating n-tuples found in a sample of 66 inner membrane proteins and the number obtained for a sample consisting of 10 randomly scrambled copies of each of the original entries, see Materials and methods (open squares: positively charged residues; solid squares: negatively charged residues; open diamonds: total number of charged residues; solid diamonds: net charge). For the positively charged residues, the absolute numbers of strictly alternating n-tuples observed for the inner membrane protein sample are: 94 (63.9 expected) out of 121 3-tuples, 62 (27.4 expected) out of 94 4-tuples, 40 (12.6 expected) out of 71 5-tuples, 22 (5.4 expected) out of 53 6-tuples, 13 (2.1 expected) out of 38 7-tuples, and 8 (0.8 expected) out of 27 8-tuples.

Implications for membrane protein biogenesis and protein secretion in bacteria

The surprisingly good correlation between the distribution of positively charged residues and the trans-membrane topology of a relatively large number of functionally diverse proteins discussed above may lead one to think of the biogenesis of bacterial inner membrane proteins primarily in terms of the 'helical hairpin' hypothesis (see Introduction). Within the framework of this model, it is easy to see how membrane integration of the protein could be determined by two factors: the strengths of the hydrophobic interaction between the putative spanning segments and the membrane, and the activation energy barriers associated with the translocation of the periplasmic connecting loops through the membrane. Thus, in a post-translational 'helical hairpin' mechanism, the most hydrophobic spanning segment may insert first together with the one of its nearest neighbours that allows formation of the helical hairpin with the most easily translocatable connecting loop, followed by insertion of less hydrophobic 'hairpins'. Single N- and C-terminal spanning segments such as the N-terminal segment in the Rps. viridis H-subunit or the C-terminal spanning segments in the L and M subunits may insert as unpaired segments, thus bringing the polar (but not highly positively charged) terminal residues through the membrane. Kinetically determined 'locally optimal' insertions of this kind may in principle lead to structures with unpaired internal hydrophobic segments left on the cytosolic side that it should be possible to create by gene fusion, thus providing a critical test of the model. At this point, the most encouraging experimental finding is perhaps the observation that phage M13 coat protein precursor inserts spontaneously in the correct orientation even into protein-free liposomes (Geller and Wickner, 1985).

Presumably a mechanism such as this can only work for relatively short connecting loops, no more than perhaps 60-70 residues long (i.e. the spanning segments must be able to 'drag' the whole connecting loop into the membrane). Unfortunately, there are not yet any good examples of periplasmic loops with a length in the critical range between 60 and 100 residues; shorter loops have very few positively charged residues as demonstrated above, whereas longer loops such as found in malF and the chemotaxis proteins (165-185 residues long) show no such deficiency. The exact point of transition between loops with low and normal Arg + Lys counts thus remains to be determined. It is tempting to speculate that longer periplasmic loops are translocated across the membrane in the same way as soluble periplasmic proteins, i.e. in some sort of energy-driven (Chen and Tai, 1985), possibly post-translational (Randall, 1983) process initiated either by a cleavable N-terminal signal peptide or by an unpaired spanning segment on the N-terminal side of the periplasmic loop.

As for the amino acid composition of the spanning segments, the observation that Pro seems to be much more easily accommodated in those segments that have their N-terminus pointing towards the cytosol (Nin) than in those in the opposite orientation (Nout) is hard to explain. It does, however, cast some doubt on the hypothesis that the relatively high proline content in spanning segments from transport proteins as opposed to non-transport proteins has something to do with proline cis-trans isomerization being important for their transport function (Brandl and Deber, 1986), since most of the non-transport proteins analysed in that study are 'simple' membrane proteins with only one spanning segment in the Nout-orientation. Thus the difference in proline content observed by these authors may be a result of constraints imposed on the biogenesis of Nin versus Nout spanning segments, rather than transport protein function per se.

Why, finally, are positively charged residues apparently more critical in the connecting loops than negatively charged ones? If an interaction with the membrane potential is the decisive factor in the insertion process one would expect the total net charge rather than the positive charge or the total charge to be most strongly correlated with the trans-membrane topology; this does not seem to be the case. An alternative, though not mutually exclusive explanation more in keeping with the amino acid composition data is that the dipolar nature of the membrane, which is independent of any imposed potential, is at the root of the matter: the dipoles associated with the lipid headgroups make the membrane more easily penetrated by anions than by cations, with an estimated difference in activation free energy of up to 10 kcal/mol (Flewelling and Hubbell, 1986). Thus kinetics, rather than equilibrium thermodynamics, may ultimately be determining the folding of bacterial inner membrane proteins.

Materials and methods

Hydrophobicity analysis

In the Rps. viridis reaction centre complex, all membrane spanning helices have at least 19 contiguous uncharged residues, and are at least 24 residues long (Michel et al., 1986). Thus, hydrophobicity analysis was performed using a 19-residue moving window and the Engelman-Steitz hydrophobicity scale (Engelman and Steitz, 1981). To partition the sequence into non-overlapping membrane spanning segments and connecting loops, the highest peak in the hydrophobicity profile was located, and a segment of 23 residues (19 residues in the spanning segment proper and two additional residues added at both ends) was removed from further consideration. This procedure was repeated until no segment of length 23 residues or more remained. From the list of putative spanning segments thus obtained, those with a mean 19-residue hydrophobicity greater than a pre-set cutoff value (1.2 or 1.4 kcal/mol, see text) were predicted as true trans-membrane helices. When amino acid counts were taken, the two added amino acids at the ends of each 19-residue spanning segment were counted as belonging to the connecting loops.
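
The partitioning procedure described above can be sketched as follows. This is only an illustrative reading of the text, not the original program: the function name, the handling of ties, and the clipping of the two flanking residues at the sequence ends are assumptions, and the hydrophobicity scale must be supplied by the caller (no scale values are given in the text). A window is considered only while its whole 23-residue block is still unassigned, which is one way of realizing the stopping rule.

```python
from typing import Dict, List, Tuple


def predict_spanning_segments(
    seq: str,
    scale: Dict[str, float],
    window: int = 19,
    flank: int = 2,
    cutoff: float = 1.4,
) -> List[Tuple[int, int, float]]:
    """Return (start, end, mean) for predicted spanning segments.

    start/end are 0-based, end exclusive, and delimit the 19-residue core.
    `scale` maps one-letter residue codes to hydrophobicity values, with
    positive values being more hydrophobic (caller-supplied assumption).
    """
    n = len(seq)
    values = [scale[res] for res in seq]
    free = [True] * n          # residues not yet assigned to a segment
    peaks = []

    while True:
        best = None
        for start in range(n - window + 1):
            lo = max(0, start - flank)           # flanks clipped at the ends
            hi = min(n, start + window + flank)
            if not all(free[lo:hi]):
                continue                         # block overlaps an earlier pick
            mean = sum(values[start:start + window]) / window
            if best is None or mean > best[1]:
                best = (start, mean)
        if best is None:
            break                                # no room for another block
        start, mean = best
        for i in range(max(0, start - flank), min(n, start + window + flank)):
            free[i] = False                      # remove core plus flanks
        peaks.append((start, start + window, mean))

    # Keep only peaks above the pre-set cutoff, in sequence order.
    return sorted(p for p in peaks if p[2] > cutoff)
```

Everything between two consecutive predicted cores, including the flanking residues, would then be treated as a connecting loop when amino acid counts are taken.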

Charge calculation

In all charge calculations, Arg and Lys were counted as +1, Asp and Glu as -1. C-terminal carboxyl groups were also counted as -1. N-terminal amino groups were not counted, since (i) these may be formylated and hence uncharged during biogenesis and membrane integration of the protein, and (ii) this group has a significantly lower pKa than the basic moieties on the Arg and Lys side chains (around 9.5 versus 12.5 and 10.5; Bohinski, 1973).
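
As a minimal sketch of these counting rules (the function name and the one-letter residue encoding are assumptions made for illustration), the positive and negative counts of a loop could be obtained like this:

```python
from typing import Tuple

POSITIVE = frozenset("RK")   # Arg, Lys counted as +1
NEGATIVE = frozenset("DE")   # Asp, Glu counted as -1


def loop_charges(loop: str, is_c_terminal: bool = False) -> Tuple[int, int]:
    """Return (number of positive, number of negative) charges in a loop.

    The free C-terminal carboxyl group adds one negative charge when the
    loop is the C-terminal tail; the N-terminal amino group is never counted.
    """
    positive = sum(1 for res in loop if res in POSITIVE)
    negative = sum(1 for res in loop if res in NEGATIVE)
    if is_c_terminal:
        negative += 1
    return positive, negative
```

The net charge of a loop is then `positive - negative` and the total charge `positive + negative`, the two derived quantities also examined in the n-tuple analysis.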

Sequence samples

Well-characterized inner membrane proteins. This group contained the whole or parts of 10 proteins:

Rps. viridis reaction centre L, M, and H subunits: the best characterized of all membrane proteins to date, with the full three-dimensional X-ray structure having been determined (Deisenhofer et al., 1985).

H. halobium bacteriorhodopsin: this protein is known from electron microscopy to have seven membrane spanning segments (Henderson and Unwin, 1975). The membrane topology of the chain has also been well mapped by protease cleavage experiments (Engelman and Steitz, 1984).

Rps. capsulata light-harvesting complex LH1 and LH2 polypeptides: these short polypeptides (58 and 49 residues long) have only one membrane spanning segment. Their N- and C-termini have been mapped to the cytosolic and periplasmic side of the membrane, respectively (Tadros et al., 1986).

E. coli malF: the topology of the first three membrane spanning segments and the following long periplasmic domain has been mapped by lacZ and TnphoA transposon fusions (Manoil,C., Boyd,D. and Froshauer,S., personal communication; see also Froshauer and Beckwith, 1984; Manoil and Beckwith, 1985). The C-terminal membrane domain has been less well mapped, but the existence of five unambiguous (mean 19-residue hydrophobicity > 1.4) putative spanning segments in this domain makes the topology shown in Table I highly probable.

Phage M13 major coat protein: this is a short protein with an N-terminal cleavable signal peptide, a periplasmic loop, and a C-terminal spanning segment. It integrates into membranes as a 'helical hairpin' in the absence of leader peptidase (Geller and Wickner, 1985).

E. coli lactose permease (lacY): the N-terminus, the C-terminus, and an exposed segment around residue 135 have been mapped to the cytosolic side of the membrane (Bieseler et al., 1985; Seckler et al., 1983, 1986). Six unambiguous putative spanning segments in the N-terminal half and two close to the C-terminus make it possible to derive the partial topology shown in Table I (see also Vogel et al., 1985).

E. coli leader peptidase (lep): this protein has a large C-terminal periplasmic domain, can be cleaved by trypsin attacking from the cytosolic side of the membrane around residue 50, and does not have a cleavable signal peptide (Wolfe et al., 1983). The existence of two unambiguous putative spanning segments (residues 2-24 and 56-78) strongly suggests the topology given in Table I.

Inner membrane proteins with predicted topology

Fifty-six additional sequences of bacterial inner membrane proteins were extracted from the NBRF Protein Sequence Database (Release 7.0) or collected from the literature. Ten of these were found to have only unambiguous (mean 19-residue hydrophobicity > 1.4 kcal/mol) and no ambiguous (mean 19-residue hydrophobicity between 0.8 and 1.4 kcal/mol, see text) putative spanning segments with connecting loops shorter than 65 residues. Their topologies were predicted assuming that a minimum number of positively charged residues are placed in periplasmic loops, see Table I.
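
The 'minimum number of positive charges in the periplasm' rule can be illustrated with a short sketch. It assumes that successive loops strictly alternate between the two sides of the membrane and that only the Arg + Lys count of each loop is known; the function name and return labels are hypothetical.

```python
from typing import List, Optional


def predict_first_loop_side(positives: List[int]) -> Optional[str]:
    """Predict the location of the first (N-terminal) loop.

    `positives` holds the number of Arg + Lys residues in successive loops,
    starting from the N-terminus. Loops at even positions share the side of
    the first loop; loops at odd positions lie on the opposite side.
    Returns "cyt", "per", or None when both orientations tie.
    """
    same_side_as_first = sum(positives[0::2])
    opposite_side = sum(positives[1::2])
    if opposite_side < same_side_as_first:
        return "cyt"   # fewer positives end up periplasmic if loop 1 is cytosolic
    if same_side_as_first < opposite_side:
        return "per"   # fewer positives end up periplasmic if loop 1 is periplasmic
    return None        # the rule does not discriminate
```

For the illustrative series 3,1,4,3,3,1 the loops at even positions carry 10 positive residues against 5 for the others, so the sketch places the first loop in the cytosol.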

Cytosolic and periplasmic reference sets

72 sequences (45 699 residues) of soluble cytosolic and 45 sequences (15 258 residues) of soluble periplasmic or extracellular bacterial proteins were extracted from the NBRF Protein Sequence Database or collected from the literature. Signal peptides were removed from the latter group prior to analysis.

n-Tuple analysis

For each protein analysed, the number of positively charged residues in successive connecting loops was recorded as a series of numbers, e.g. 3,1,4,3,3,1. Then, the number of strictly alternating n-tuples in this series was noted, e.g. two 3-tuples (3,1,4; 1,4,3) and one 4-tuple (3,1,4,3) in this example. The total numbers of such n-tuples in the whole sample were compared with the numbers obtained when the original series were randomly scrambled 10 times.
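
A compact sketch of this counting scheme follows. It interprets 'strictly alternating' as meaning that successive differences are all non-zero and change sign at every step, which reproduces the worked example above; that interpretation, the function names, and the seeded random generator are assumptions rather than details given in the text.

```python
import random
from typing import List, Sequence


def is_strictly_alternating(values: Sequence[int]) -> bool:
    """True if consecutive differences are non-zero and alternate in sign."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    if any(d == 0 for d in diffs):
        return False
    return all(d1 * d2 < 0 for d1, d2 in zip(diffs, diffs[1:]))


def count_alternating(series: Sequence[int], n: int) -> int:
    """Count the strictly alternating n-tuples among all windows of length n."""
    return sum(
        is_strictly_alternating(series[i:i + n])
        for i in range(len(series) - n + 1)
    )


def scrambled_mean(all_series: List[Sequence[int]], n: int,
                   copies: int = 10, seed: int = 0) -> float:
    """Mean total count over `copies` random scramblings of every series."""
    rng = random.Random(seed)
    total = 0
    for _ in range(copies):
        for series in all_series:
            shuffled = list(series)
            rng.shuffle(shuffled)
            total += count_alternating(shuffled, n)
    return total / copies
```

The quotient plotted in Figure 3 would then correspond to the summed `count_alternating` values for the real series divided by `scrambled_mean` for the same series and the same n.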

Statistical analysis

The statistical significance of observed differences in amino acid composition was assessed using χ²-analysis.
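
The text does not specify which form of the χ² test was used. As a hedged illustration only, the sketch below applies a plain 2 x 2 Pearson χ² (no continuity correction) to the Arg + Lys difference between cytosolic and periplasmic loops. The residue counts are reconstructed by rounding from the percentages and totals in Table II (15.8% of 494 and 4.2% of 471) and are therefore approximate.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square for the 2 x 2 table [[a, b], [c, d]] (1 d.f.)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Totals from Table II, 'known' topology sample.
CYT_TOTAL, PER_TOTAL = 494, 471

# Approximate Arg + Lys counts, reconstructed from the reported percentages.
cyt_basic = round(0.158 * CYT_TOTAL)
per_basic = round(0.042 * PER_TOTAL)

chi2 = chi_square_2x2(
    cyt_basic, CYT_TOTAL - cyt_basic,
    per_basic, PER_TOTAL - per_basic,
)

# Critical value of chi-square for 1 degree of freedom at P = 0.001.
CRITICAL_P_001 = 10.83
significant = chi2 > CRITICAL_P_001
```

Under these assumptions the statistic lies well above the P = 0.001 threshold, in line with the significance level quoted in the Results.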

Acknowledgement

This work was supported by a grant from the Swedish Natural Sciences Research Council.

References

Bieseler,B., Prinz,H. and Beyreuther,K. (1985) Ann. N.Y. Acad. Sci., 456, 309-325.
Blobel,G. (1980) Proc. Natl. Acad. Sci. USA, 77, 1496-1500.
Bohinski,R.C. (1973) Modern Concepts in Biochemistry. Allen & Bacon, Boston, MA.
Bollinger,J., Park,C., Harayama,S. and Hazelbauer,G.L. (1984) Proc. Natl. Acad. Sci. USA, 81, 3287-3291.
Brandl,C.J. and Deber,C.M. (1986) Proc. Natl. Acad. Sci. USA, 83, 917-921.
Buchel,D.E., Gronenborn,B. and Muller-Hill,B. (1980) Nature, 283, 541-545.
Cerretti,D.P., Dean,D., Davis,G.R., Bedwell,D.M. and Nomura,M. (1983) Nucl. Acids Res., 11, 2599-2616.
Chen,L. and Tai,P.C. (1985) Proc. Natl. Acad. Sci. USA, 82, 4384-4388.
Dassa,E. and Hofnung,M. (1985) EMBO J., 4, 2287-2293.
Deisenhofer,J., Epp,O., Miki,K., Huber,R. and Michel,H. (1985) Nature, 318, 618-624.
Drews,G. (1985) Microbiol. Rev., 49, 59-70.
Dunn,R., McCoy,J., Simsek,M., Majumdar,A., Chang,S.H., RajBhandary,U.L. and Khorana,H.G. (1981) Proc. Natl. Acad. Sci. USA, 78, 6744-6748.
Engelman,D.M. and Steitz,T.A. (1981) Cell, 23, 411-422.
Engelman,D.M. and Steitz,T.A. (1984) In Wetlaufer (ed.), The Protein Folding Problem. AAAS, New York, pp. 87-113.
Flewelling,R.F. and Hubbell,W.L. (1986) Biophys. J., 49, 541-552.
Froshauer,S. and Beckwith,J. (1984) J. Biol. Chem., 259, 10896-10903.
Gay,N.J. and Walker,J.E. (1981) Nucl. Acids Res., 9, 3919-3926.
Geller,B.L. and Wickner,W. (1985) J. Biol. Chem., 260, 13281-13285.
Grundstrom,T. and Jaurin,B. (1982) Proc. Natl. Acad. Sci. USA, 79, 1111-1115.
Henderson,R. and Unwin,P.N.T. (1975) Nature, 257, 28-32.
Higgins,C.F., Haag,P.D., Nikaido,K., Ardeshir,F., Garcia,G. and Ames,G.F.L. (1982) Nature, 298, 723-727.
Icho,T., Sparrow,C.P. and Raetz,C.R.H. (1985) J. Biol. Chem., 260, 12078-12083.
Kanazawa,H., Kayano,T., Kiyasu,T. and Futai,M. (1982) Biochem. Biophys. Res. Commun., 105, 1257-1264.
Krikos,A., Mutoh,N., Boyd,A. and Simon,M.I. (1983) Cell, 33, 615-622.
Manoil,C. and Beckwith,J. (1985) Proc. Natl. Acad. Sci. USA, 82, 8129-8133.
Michel,H., Weyer,K.A., Gruenberg,H. and Lottspeich,F. (1985) EMBO J., 4, 1667-1672.
Michel,H., Weyer,K.A., Gruenberg,H., Dunger,I., Oesterhelt,D. and Lottspeich,F. (1986) EMBO J., 5, 1149-1158.
Nazos,P.M., Antonucci,T.K., Landick,R. and Oxender,D.L. (1986) J. Bacteriol., 166, 565-573.
Randall,L.L. (1983) Cell, 33, 231-240.
Seckler,R., Wright,J.K. and Overath,P. (1983) J. Biol. Chem., 258, 10817-10820.
Seckler,R., Moroy,T., Wright,J.K. and Overath,P. (1986) Biochemistry, 25, 2403-2409.
Surin,B.P., Rosenberg,H. and Cox,G.B. (1985) J. Bacteriol., 161, 189-198.
Tadros,M.H., Frank,R. and Drews,G. (1986) FEBS Lett., 196, 233-236.
van Wezenbeek,P.M.G.F., Hulsebos,T.J.M. and Schoenmakers,J.G.G. (1980) Gene, 11, 129-148.
Vogel,H., Wright,J.K. and Jahnig,F. (1985) EMBO J., 4, 3625-3631.
von Heijne,G. (1985) Current Topics in Membranes and Transport, 24, 151-179.
Wolfe,P.B., Wickner,W. and Goodman,J.M. (1983) J. Biol. Chem., 258, 12073-12080.

Received on 21 July 1986