Neural Networks 60 (2014) 96–103
Contents lists available at ScienceDirect
Neural Networks
journal homepage: www.elsevier.com/locate/neunet
Unsupervised learnable neuron model with nonlinear interaction
on dendrites
Yuki Todo a, Hiroki Tamura b, Kazuya Yamashita c,∗, Zheng Tang c
aKanazawa University, Japan
bMiyazaki University, Japan
cUniversity of Toyama, Japan
article info
Article history:
Received 23 August 2013
Received in revised form 29 May 2014
Accepted 7 July 2014
Available online 1 August 2014
Keywords:
Neuron model
Interaction
Dendrites
Unsupervised learning
Directionally selective cells
abstract
Recent research has provided strong circumstantial support to dendrites playing a key and possibly
essential role in computations. In this paper, we propose an unsupervised learnable neuron model by
including the nonlinear interactions between excitation and inhibition on dendrites. The model neuron
self-adjusts its synaptic parameters, and thus the synapses to its dendrites, according to a generalized
delta-rule-like algorithm. The model is used to simulate directionally selective cells by the unsupervised learning
algorithm. In the simulations, we initialize the interaction and dendrite of the neuron randomly and
use the generalized delta-rule-like unsupervised learning algorithm to learn the two-dimensional
multi-directional selectivity problem without external teacher signals. Simulation results show that
directionally selective cells can be formed by unsupervised learning: the neuron acquires the required
number of dendritic branches, which are enhanced if needed and eliminated if not. Further, the results
show whether a synapse exists and, if it does, where it is located and what type (excitatory or inhibitory)
it is. This leads us to believe that the proposed neuron model may be considerably more powerful in
computation than the McCulloch–Pitts model because theoretically a single neuron or a single layer of
such neurons is capable of solving any complex problem. These results may also lead to a completely new
technique for analyzing the mechanisms and principles of neurons, dendrites, and synapses.
©2014 Elsevier Ltd. All rights reserved.
1. Introduction
The human brain, built of about 10^{11} neurons and 10^{15} interconnections, is of staggering
complexity. The fundamental structure of a neuron consists of a cell body, an axon, and a dendrite.
Neurons have different functions depending on the branch patterns of their dendrites, i.e., the function
changes with differences in these structures (Cajal, 1909). The first model of a neuron was proposed
by McCulloch and Pitts in 1943 and has been widely used as a basic unit for modern research on neural
networks (McCulloch & Pitts, 1943). However, this model has been criticized as oversimplified from the
viewpoint of the properties of real neurons and the computation that they perform, because only one
nonlinear term (thresholding at the cell body) is included in the model and the nonlinear mechanisms
of dendrites are not considered (London & Häusser, 2005). Meanwhile, recent research has provided
∗Corresponding author. Tel.: +81 764456886.
E-mail address: kazuya@eng.u-toyama.ac.jp (K. Yamashita).
strong circumstantial support for dendrites playing a key role in the overall computation performed by
the neuron (Agmon-Snir, Carr, & Rinzel, 1998; Anderson, Binzegger, Kahana, Martin, & Segev, 1999;
Euler, Detwiler, & Denk, 2002; Magee, 2000; Single & Borst, 1998; Stuart, Spruston, & Häusser, 2008).
Based on these experimental findings, Larkum, Zhu, and Sakmann (2001), and Rhodes and Llinás (2001)
modeled apical dendrites as a compartment distinct from the somatic compartment, and were successful
in reproducing the diverse range of neuronal firing patterns (Kepecs, Wang, & Lisman, 2002; Mainen &
Sejnowski, 1996). After the experimental observation of localized regenerative spikes in the fine distal
dendrites (Schiller, Major, Koester, & Schiller, 2000; Schiller, Schiller, Stuart, & Sakmann, 1997; Wei et
al., 2001), Poirazi, Brannon, and Mel (2003a, 2003b) proposed a simplified two-layer neural network
model in which individual dendritic subunits perform a sigmoidal thresholding nonlinear operation on
their inputs. This model provided a useful abstraction of the spatial integrative function of a pyramidal
cell (Wang & Liu, 2010).
However, these models all require addressing the relevant synaptic input to the relevant locality in the dendrites
http://dx.doi.org/10.1016/j.neunet.2014.07.011
(Koch, Poggio, & Torre, 1983; Koch, Poggio, Torre, & Casey, 1982).
Recently, many algorithms for learning such nonlinear processing have been proposed. Instead of using
a weighted sum, Durbin and Rumelhart used a weighted product as a computational unit for feedforward
learning networks of the backpropagation type (Durbin & Rumelhart, 1989). Other models of a nonlinear
neuron and learning algorithms have also been proposed, such as the sigma–pi unit, in which the output
activation is calculated as the weighted sum of the products of independent sets, or clusters, of input
values (Mel, 1990; Rumelhart & McClelland, 1986), and the cluster model, in which each input has a
synaptic weight (the term ‘‘cluster’’ is used to refer to inputs that can affect the activation received by
a particular synapse) (Spratling & Hayes, 2000). However, these models could not solve the 3-bit parity
problem. Furthermore, they used ‘‘weights’’ to represent the degree of clustering between synapses.
Thus, all sense of locality was lost, and these models could not represent local interactions within a
fixed dendritic tree. In this sense, they are not biologically plausible models of nonlinear dendritic
processing (Spratling & Hayes, 2000).
Koch, Poggio, and Torre found that, in the dendrites of a retinal nerve cell, if an activated inhibitory
synapse is closer to the cell body than an excitatory synapse, the excitatory synapse will be intercepted.
They suggested that the interaction between synapses and the action at the turning point of a branch
be considered in terms of logical operations (Koch et al., 1983, 1982). Several experimental examples,
such as direction selectivity in retinal ganglion cells (Taylor, He, Levick, & Vaney, 2000) and coincidence
detection in the auditory system (Segev, 1998), have provided strong circumstantial support for Koch's
model. Recent theoretical and experimental studies using the NEURON simulation environment have also
suggested that such an inhibitory effect is localized to a single dendritic branch (Liu, 2004) and that
dendritic computation results from the interaction of excitatory and inhibitory synaptic inputs (Fortier
& Bray, 2013).
However, for a specific given task, particularly a complex task, it
is usually very difficult for Koch’s model to identify what type of
synapse (excitatory or inhibitory) is needed, where the synapse
should be located, which branch of the dendrite is needed, and
which one is not needed (Destexhe & Marder, 2004). Koch pointed
out that we need a learning algorithm based on the plasticity in
dendrites to answer these questions and understand how the con-
ductance of a neuron’s cell body and dendritic membrane develops
in time (Koch, 1997). Fortunately, a wide variety of plasticity mechanisms have been identified in
pyramidal neurons (Artola, Brocher, & Singer, 1990; Bi & Poo, 1998; Dringenberg, Hamze, Wilson,
Speechley, & Kuo, 2007; Gu, 2003; Losonczy, Makara, & Magee, 2008; Makara, Losonczy, Wen, & Magee,
2009; Markram, Lübke, Frotscher, & Sakmann, 1997; Ngezahayo, Schachner, & Artola, 2000; Reynolds &
Wickens, 2002; Sjöström, Rancz, Roth, & Häusser, 2008; Sjöström, Turrigiano, & Nelson, 2001).
Meanwhile, Holtmaat and Svoboda presented experimental evidence supporting structural synaptic
plasticity and learning (Holtmaat & Svoboda, 2009). In particular, recent experimental evidence has
suggested that back-propagating action potentials can provide a feedback signal to the input layers
and may be involved in the process of synaptic plasticity (Larkum, Zhu, & Sakmann, 1999; Stuart &
Häusser, 2001).
In our previous papers (Tang, Kuratu, Tamura, Ishizuka, & Tanno, 2000; Tang, Tamura, Okihiro, & Tanno,
2000), we proposed a neuron model with interaction among synapses on dendrites and successfully
trained the model to learn the directionally selective problem and the depth rotation problem (Sekiya,
Aoyama, Tamura, & Tang, 2001; Sekiya, Wang, Aoyama, & Tang, 2001; Sekiya, Zhu, Aoyama, & Tang,
2000; Takeuchi, 2010; Tamura, Tang, & Ishii, 2002; Tamura, Tang, Okihiro, & Tanno, 1999). However,
the training was in all cases performed by supervised learning, a mechanism that compares the desired
and actual outputs and feeds back the processed corrections. Such a supervised training mechanism is
biologically implausible; it is difficult to conceive of such a training mechanism in the brain. Recently,
Legenstein and Maass
provided mathematical proof that these plasticity mechanisms induced a competition between dendritic
branches, and such dendritic competition enabled a single neuron to acquire nonlinear computational
capabilities, such as the capability to bind multiple input features in a self-organized manner (Legenstein
& Maass, 2011). However, even though it used nonlinear branches, the model could not solve such
linearly non-separable problems as the simple exclusive OR (XOR) function. Spratling and Hayes
presented a model of an initially standard linear node that uses unsupervised learning to find clusters
of inputs within which inactivity at one synapse can occlude the activity at the other synapses. However,
because they used ‘‘weights’’ to represent the degree of clustering between synapses, all sense of
locality was lost, and this model failed to include local interactions within a fixed dendritic tree. In this
sense, it is not a biologically plausible model of nonlinear dendritic processing (Spratling & Hayes, 2000).
In this paper, we assume that neurons learn to compute what they compute, and we develop an
unsupervised learnable neuron model with interaction among the synapses of a dendrite. The
unsupervised learning algorithm for a single layer of such neurons requires no teaching signal for the
output, and hence there are no comparisons with predetermined ideal responses. The training set
consists solely of input vectors, and the desired output patterns are obtained from the input patterns.
We show how such an unsupervised rule enables the neurons to decide their synaptic connections and
delete unnecessary synaptic connections and dendritic branches. We also show that such an
unsupervised learning algorithm can be used to learn the two-dimensional eight-directional selectivity
problem.
2. Model and learning
2.1. Model
From measurements made using histological materials, Koch,
Poggio, and Torre found that the interactions between excitation
and inhibition can be strongly nonlinear, and shunting inhibition
can specifically veto an excitatory input if it is located on the direct
path to the soma (Koch et al., 1983, 1982). Fig. 1 shows a model that implements this idea. Here, if the
inhibitory interaction is described as an AND-NOT gate, the operation implemented in Fig. 1 can be
read as

u = \overline{x_1} \cdot x_2    (1)

where x_2 denotes an excitatory input and x_1 represents an inhibitory input. Each input is either
logical 0 or 1. Thus, the signal to the cell body (soma) becomes u = 1 if and only if x_1 = 0 and x_2 = 1.
Fig. 2(a) shows an idealized dendrite of a γ cell, receiving excitatory and inhibitory synapses
distributed from the tip to the soma. As shown in Fig. 2(a), most γ cells have a small cell body and
dendrites that usually have only one branch (Koch et al., 1982).
Koch, Poggio, and Torre showed that a given excitatory input would be effectively vetoed by the
inhibitory inputs on the direct path to the soma, whereas the remaining inputs essentially remain
unaffected by all other more distal inhibitory synapses (Destexhe & Marder, 2004; Koch et al., 1983).
Thus, the operation implemented in Fig. 2(a) can be read as

u = \overline{x_1} \cdot x_2 + \overline{x_1} \cdot \overline{x_3} \cdot x_4,    (2)

and Fig. 2(a) can also be represented by Fig. 2(b).
Compared with a γ cell, the dendrite of a δ cell has considerably more branches (Koch et al., 1982). A
δ cell is shown in Fig. 3(a). The operation implemented can be expressed as follows:

u = \overline{x_7} \cdot x_8 + \overline{x_5} \cdot x_6 + \overline{x_1} \cdot x_2 + \overline{x_1} \cdot \overline{x_3} \cdot x_4    (3)

and thus, Fig. 3(a) can also be re-drawn as Fig. 3(b).
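The idealized dendritic logic of Eqs. (1)–(3) can be sketched directly in code. The following Python sketch assumes, per the AND-NOT convention of Eq. (1), that inhibitory inputs enter negated; the input indexing follows Eqs. (2) and (3), and the function names are illustrative.

```python
# Sketch of the idealized dendritic logic of Eqs. (1)-(3): inhibitory
# inputs are negated (AND-NOT), each branch ANDs its synaptic inputs,
# and the branching point ORs the branches together.

def gamma_cell(x1, x2, x3, x4):
    """Eq. (2): a gamma cell; x1, x3 inhibitory, x2, x4 excitatory."""
    branch_a = (not x1) and x2
    branch_b = (not x1) and (not x3) and x4
    return int(branch_a or branch_b)

def delta_cell(x):
    """Eq. (3): a delta cell with more branches; x maps index -> 0/1."""
    b1 = (not x[7]) and x[8]
    b2 = (not x[5]) and x[6]
    b3 = (not x[1]) and x[2]
    b4 = (not x[1]) and (not x[3]) and x[4]
    return int(b1 or b2 or b3 or b4)

# Eq. (1): the signal reaches the soma only when the inhibitory input
# is silent (x1 = 0) and the excitatory input is active (x2 = 1).
assert gamma_cell(0, 1, 0, 0) == 1   # x2 alone drives the soma
assert gamma_cell(1, 1, 0, 1) == 0   # proximal inhibition vetoes both paths
```

The veto structure is visible in the code: the proximal inhibitory input x1 appears negated in every branch it sits on the path of, so activating it silences all of those branches at once.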
Fig. 1. Idealized dendritic model of the interaction between inhibitory inputs ()
and excitatory inputs (•).
Fig. 2. Idealized dendrite of a γ cell (a) and its different representation (b).
Fig. 3. Idealized dendrite of a δ cell (a) and its different representation (b).
As mentioned above, the interaction between the synapses on a branch can be considered a logical AND
operation, and the operation at the branching points a logical OR operation. According to Boolean
algebra, such a network of logical AND, OR, and NOT functions can produce any complex logical function
of the interactions among synapses, provided the number of AND-like branches is sufficiently large.
Thus, we model the dendritic mechanism as the structure shown in Fig. 4. Dendritic branches receive
signals at connection points called synapses (H) and perform a logical AND operation on these signals.
The branching point sums up the currents from the branches, such that its output is a logical OR
operation on its inputs. The result is then conducted to the cell body (soma), and when it exceeds the
threshold, the cell fires, sending a signal down the axon to other neurons.
Fig. 4. Model of a neuron based on the interaction within dendrites.
In order to make the model neuron learnable, we may express
it in a mathematical form as follows:
Synaptic function: A set of inputs labeled x_1, x_2, ..., x_n, each of which is either logical 0 or 1,
is applied to the m dendritic branches of the neuron (H) through one of the following synaptic
connections: the direct connection (excitatory synapse •), the inversed connection (inhibitory synapse),
the 1-constant connection (①), or the 0-constant connection (⓪), to the corresponding AND gates.
The synaptic function can be described by a one-input one-output sigmoid function, and its node
function from the ith (i = 1, 2, ..., n) input to the jth (j = 1, 2, ..., m) AND gate is given by

Y_{ij} = \frac{1}{1 + e^{-k(w_{ij} x_i - \theta_{ij})}}    (4)

where w_{ij} and θ_{ij} denote synaptic parameters and k represents a positive constant. Since the
input x_i is either 0 or 1, for a large k (e.g., k = 10), there are only four cases of different values
of the synaptic parameters w_{ij} and θ_{ij}.
Case 1a: 0 ≤ w_{ij} < θ_{ij}, for example w_{ij} = 1.0 and θ_{ij} = 1.5, as shown in Fig. 5(a),
corresponds to a 0-constant connection.
Case 1b: w_{ij} < 0 < θ_{ij}, for example w_{ij} = −1.0 and θ_{ij} = 0.5 or 1.5, as shown in
Fig. 5(b), produces the same 0-constant connection.
Case 2: w_{ij} < θ_{ij} < 0, for example w_{ij} = −1.0 and θ_{ij} = −0.5, as shown in Fig. 5(c),
leads to an inverse connection (inhibitory synapse).
Case 3: 0 < θ_{ij} ≤ w_{ij}, for example w_{ij} = 1.0 and θ_{ij} = 0.5, as shown in Fig. 5(d),
illustrates a direct connection (excitatory synapse).
Case 4a: θ_{ij} ≤ 0 ≤ w_{ij}, for example w_{ij} = 1.0 and θ_{ij} = −0.5 or −1.5, as shown in
Fig. 5(e), shows a 1-constant connection.
Case 4b: θ_{ij} ≤ w_{ij} < 0, for example w_{ij} = −1.0 and θ_{ij} = −1.5, as shown in Fig. 5(f),
illustrates a 1-constant connection.
Therefore, the synaptic states can be described by these four types of connections. The values of
w_{ij} and θ_{ij} are initialized randomly between −1.5 and 1.5. This means that the inputs are
assumed to be randomly connected to every dendritic branch with one of the four synaptic connections
mentioned above. As the values of w_{ij} and θ_{ij} change, the synaptic function varies accordingly,
thus exhibiting the various states of connection. Further, the sigmoid function is clearly
differentiable.
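The four connection types can be checked directly from Eq. (4). The following minimal Python sketch evaluates the synaptic sigmoid at the example parameter values given in Cases 1–4 above; the function name is illustrative.

```python
import math

def synapse(x, w, theta, k=10.0):
    """Eq. (4): sigmoid synaptic function from input x (0 or 1) to a
    dendritic branch, with synaptic parameters w, theta and slope k."""
    return 1.0 / (1.0 + math.exp(-k * (w * x - theta)))

# With a large k, the four parameter regimes reduce to the four
# connection types (example values taken from Cases 1-4 in the text):
for label, w, theta in [("0-constant", 1.0, 1.5),    # Case 1a
                        ("inhibitory", -1.0, -0.5),  # Case 2
                        ("excitatory", 1.0, 0.5),    # Case 3
                        ("1-constant", 1.0, -0.5)]:  # Case 4a
    y0, y1 = synapse(0, w, theta), synapse(1, w, theta)
    print(f"{label}: Y(x=0)={y0:.2f}, Y(x=1)={y1:.2f}")
```

The excitatory case passes the input through nearly unchanged, the inhibitory case inverts it, and the two constant cases ignore the input altogether, which is what lets learning effectively insert or delete synapses by moving (w, θ) between regimes.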
AND function: It corresponds to the interaction among synapses on a dendritic branch. The logical AND
operation of the input variables yields the value 1 if and only if all input variables are simultaneously
1. For any n-variable logic function, only 2^{n-1} AND gates will be necessary. In order to conduct
learning with gradient descent, derivatives of the AND function are required. However, the derivatives
of the AND function do not exist, so we use a soft-minimum operator:

f_{\min j} = \frac{\sum_i Y_{ij}\, e^{-\mu Y_{ij}}}{\sum_i e^{-\mu Y_{ij}}}    (5)
(a) 0-constant connection, 0 ≤ w_{ij} < θ_{ij} (e.g. w_{ij} = 1.0, θ_{ij} = 1.5).
(b) 0-constant connection, w_{ij} < 0 < θ_{ij} (e.g. w_{ij} = −1.0, θ_{ij} = 0.5 or 1.5).
(c) Inversed connection, w_{ij} < θ_{ij} < 0 (e.g. w_{ij} = −1.0, θ_{ij} = −0.5).
(d) Direct connection, 0 < θ_{ij} ≤ w_{ij} (e.g. w_{ij} = 1.0, θ_{ij} = 0.5).
(e) 1-constant connection, θ_{ij} ≤ 0 ≤ w_{ij} (e.g. w_{ij} = 1.0, θ_{ij} = −0.5 or −1.5).
(f) 1-constant connection, θ_{ij} ≤ w_{ij} < 0 (e.g. w_{ij} = −1.0, θ_{ij} = −1.5).
Fig. 5. Synaptic functions from input to dendrite branch.
where μ denotes a positive constant. The soft-minimum function produces the same result as the AND
operation in the limit:

\lim_{\mu \to \infty} f_{\min j} = \min(Y_{1j}, Y_{2j}, \ldots, Y_{nj}) = \mathrm{AND}(Y_{1j}, Y_{2j}, \ldots, Y_{nj}).    (6)
OR function: It corresponds to the sublinear summation operation at a branching point. The logical OR
operation of the input variables yields the value 1 whenever at least one of the variables is 1, and 0
otherwise. Similar to the soft-minimum for the logical AND operation, we use a soft-maximum operator
for the logical OR operation:

f_{\max} = \frac{\sum_j f_{\min j}\, e^{v f_{\min j}}}{\sum_j e^{v f_{\min j}}}    (7)

where v denotes a positive constant. The soft-maximum function gives the same result as the OR
operation in the limit:

\lim_{v \to \infty} f_{\max} = \max(f_{\min 1}, f_{\min 2}, \ldots, f_{\min m}) = \mathrm{OR}(f_{\min 1}, f_{\min 2}, \ldots, f_{\min m})    (8)
where m denotes the number of AND gates (branches). Note that this function is also clearly
differentiable. It is also worth noting that for a set of n input variables, each taking the value 0 or
1, the model neuron is an OR of ANDs, also known as a sum of products, having a disjunctive normal
form (DNF). Therefore, according to Boolean algebra, provided there is a sufficient number of AND gates
(≤ 2^{n-1}), the neuron is capable of computing any complex Boolean function (Cazé, Humphries, &
Gutkin, 2013; Crama & Hammer, 2011; Wegener, 1987).
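The limiting behavior of Eqs. (5) and (7) is easy to observe numerically. A minimal Python sketch of the two operators (the constants chosen here are illustrative):

```python
import math

def soft_min(ys, mu=5.0):
    """Eq. (5): soft-minimum over a branch's synaptic outputs ys."""
    w = [math.exp(-mu * y) for y in ys]
    return sum(y * wi for y, wi in zip(ys, w)) / sum(w)

def soft_max(fs, v=5.0):
    """Eq. (7): soft-maximum over the branch outputs fs."""
    w = [math.exp(v * f) for f in fs]
    return sum(f * wi for f, wi in zip(fs, w)) / sum(w)

# As mu and v grow, the operators approach logical AND / OR on {0, 1}:
ys = [1.0, 1.0, 0.0]
print(soft_min(ys, mu=50.0))   # near 0: AND fails with one 0 input
print(soft_max(ys, v=50.0))    # near 1: OR succeeds with one 1 input
```

Unlike the hard min and max, both operators are smooth everywhere, which is exactly what the gradient-based learning rule of Section 2.2 requires.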
Soma: It specifies the threshold or sigmoid operation on the product terms. The soma yields the value 1
when its input exceeds a threshold of 0.5. For learning purposes, we use a sigmoid operator:

O = \frac{1}{1 + e^{-g(f_{\max} - 0.5)}}    (9)

where g denotes a positive constant. Furthermore, this function is also clearly differentiable.
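To illustrate the complete forward model, the following sketch wires Eqs. (4), (5), (7), and (9) together and hand-sets the synaptic parameters of a two-branch neuron so that it computes XOR, the classic linearly non-separable function mentioned in the Introduction. The parameter and constant values here are illustrative choices, not taken from the paper's simulations.

```python
import math

def neuron(x, W, Th, k=10.0, mu=5.0, v=5.0, g=5.0):
    """Forward pass: synapses (Eq. (4)), soft-min per branch (Eq. (5)),
    soft-max over branches (Eq. (7)), soma sigmoid (Eq. (9))."""
    fmin = []
    for w_row, th_row in zip(W, Th):
        ys = [1 / (1 + math.exp(-k * (w * xi - th)))
              for xi, w, th in zip(x, w_row, th_row)]
        wts = [math.exp(-mu * y) for y in ys]
        fmin.append(sum(y * wt for y, wt in zip(ys, wts)) / sum(wts))
    wts = [math.exp(v * f) for f in fmin]
    fmax = sum(f * wt for f, wt in zip(fmin, wts)) / sum(wts)
    return 1 / (1 + math.exp(-g * (fmax - 0.5)))

# Branch 1: x1 direct, x2 inversed; branch 2: x1 inversed, x2 direct.
# Direct: w = 1, theta = 0.5 (Case 3); inversed: w = -1, theta = -0.5 (Case 2).
W  = [[ 1.0, -1.0], [-1.0,  1.0]]
Th = [[ 0.5, -0.5], [-0.5,  0.5]]
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, round(neuron(x, W, Th), 3))   # output above 0.5 only for XOR = 1
```

Each branch realizes one product term of the DNF for XOR, so the OR-of-ANDs structure of the model neuron handles in one unit what a McCulloch–Pitts neuron cannot.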
2.2. Back-propagation-like algorithm
The neuron model mentioned above is a feed-forward network built of continuous functions. Thus, an
error back-propagation-like algorithm is valid for the neuron model. A learning rule can readily be
derived from the condition of least squared error between the actual output O and the desired output T,
defined as

E = \frac{1}{2}(T - O)^2.    (10)

Since the minimization of the error requires the synaptic parameters w_{ij} and θ_{ij} to change in
the negative gradient direction, we take

\Delta w_{ij} = -\eta \frac{\partial E}{\partial w_{ij}}    (11)

\Delta \theta_{ij} = -\eta \frac{\partial E}{\partial \theta_{ij}}    (12)

where η denotes a positive constant called the learning constant. For the synaptic parameters w_{ij}
and θ_{ij}, we must differentiate the error E, in which w_{ij} and θ_{ij} are deeply embedded. By
using the chain rule, we obtain
\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial O} \cdot \frac{\partial O}{\partial f_{\max}} \cdot \frac{\partial f_{\max}}{\partial f_{\min j}} \cdot \frac{\partial f_{\min j}}{\partial Y_{ij}} \cdot \frac{\partial Y_{ij}}{\partial w_{ij}}    (13)

\frac{\partial E}{\partial \theta_{ij}} = \frac{\partial E}{\partial O} \cdot \frac{\partial O}{\partial f_{\max}} \cdot \frac{\partial f_{\max}}{\partial f_{\min j}} \cdot \frac{\partial f_{\min j}}{\partial Y_{ij}} \cdot \frac{\partial Y_{ij}}{\partial \theta_{ij}}    (14)

Expanding Eqs. (10)–(14) gives
\frac{\partial E}{\partial O} = O - T    (15)

\frac{\partial O}{\partial f_{\max}} = \frac{g\, e^{-g(f_{\max}-0.5)}}{\left(1 + e^{-g(f_{\max}-0.5)}\right)^2}    (16)

\frac{\partial f_{\max}}{\partial f_{\min j}} = \frac{\sum_l \left(1 + v f_{\min j} - v f_{\min l}\right) e^{v(f_{\min j} + f_{\min l})}}{\left(\sum_l e^{v f_{\min l}}\right)^2}    (17)

\frac{\partial f_{\min j}}{\partial Y_{ij}} = \frac{\sum_k \left(1 - \mu Y_{ij} + \mu Y_{ik}\right) e^{-\mu(Y_{ij} + Y_{ik})}}{\left(\sum_k e^{-\mu Y_{ik}}\right)^2}    (18)
Fig. 6. Structure of module by model neurons.
\frac{\partial Y_{ij}}{\partial w_{ij}} = \frac{k\, x_i\, e^{-k(x_i w_{ij} - \theta_{ij})}}{\left(1 + e^{-k(x_i w_{ij} - \theta_{ij})}\right)^2}    (19)

\frac{\partial Y_{ij}}{\partial \theta_{ij}} = \frac{-k\, e^{-k(x_i w_{ij} - \theta_{ij})}}{\left(1 + e^{-k(x_i w_{ij} - \theta_{ij})}\right)^2}.    (20)
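As a quick sanity check, the innermost derivatives of Eqs. (19) and (20) can be verified against central finite differences of the synaptic function of Eq. (4). The parameter values in this Python sketch are arbitrary test points:

```python
import math

def Y(x, w, theta, k=10.0):
    """Synaptic function of Eq. (4)."""
    return 1.0 / (1.0 + math.exp(-k * (w * x - theta)))

def dY_dw(x, w, theta, k=10.0):
    """Analytic derivative of Eq. (19)."""
    e = math.exp(-k * (x * w - theta))
    return k * x * e / (1.0 + e) ** 2

def dY_dtheta(x, w, theta, k=10.0):
    """Analytic derivative of Eq. (20)."""
    e = math.exp(-k * (x * w - theta))
    return -k * e / (1.0 + e) ** 2

# Central finite differences agree with the analytic expressions.
x, w, theta, eps = 1, 0.7, 0.3, 1e-6
num_w  = (Y(x, w + eps, theta) - Y(x, w - eps, theta)) / (2 * eps)
num_th = (Y(x, w, theta + eps) - Y(x, w, theta - eps)) / (2 * eps)
assert abs(num_w - dY_dw(x, w, theta)) < 1e-6
assert abs(num_th - dY_dtheta(x, w, theta)) < 1e-6
```

The same finite-difference check applies in principle to the outer factors (16)–(18), since every stage of the model is smooth.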
2.3. Unsupervised learning
Holtmaat and Svoboda showed experimental evidence to
support structural synaptic plasticity and learning (Holtmaat &
Svoboda, 2009). Furthermore, biophysical simulations of different neuronal types by Vetter, Roth, and
Häusser showed that the dendritic geometry, in concert with dendritic voltage-gated channels, plays a
crucial role in determining how action potentials (APs) propagate in dendritic trees. Back-propagating
sodium APs, like calcium spikes, can act as a ‘‘global’’ signal; back-propagation is strongly dependent
on dendrite morphology and can be modulated with high-precision timing by synaptic inputs (Vetter,
Roth, & Häusser, 2001). We have also used the back-propagation-like algorithm mentioned above for many
applications in our previous works (Sekiya, Aoyama et al., 2001; Sekiya, Wang et al., 2001; Sekiya et
al., 2000; Takeuchi, 2010; Tamura et al., 2002, 1999; Tang, Kuratu et al., 2000; Tang, Tamura et al.,
2000). However, it is difficult to explain where the desired outputs come from.
Although a single model neuron can perform certain functions, the power of neural computation comes
from connecting neurons into networks. We use a network with a group of model neurons arranged in a
layer, as shown in Fig. 6. The model neurons connect to each other with a weight, e.g., c = −0.1. Such
lateral inhibition provides a mechanism through which neurons compete to respond to the current pattern
of stimulation and attempt to ‘‘suppress’’ other neurons from generating a response to the current
stimulus. It is known that such lateral inhibition plays an important role in determining the receptive
field properties of these cells (Eysel, Shevelev, Lazareva, & Sharaev, 1998; Jagadeesh, 2000; Spratling
& Johnson, 2003). Thus, the total input to neuron q becomes

u_q = f_{\max q} + \sum_{l \neq q}^{Q} c\, O_l.    (21)
Hebb has postulated that ‘‘when an axon of cell a is sufficiently near
to excite cell b and repeatedly or persistently takes part in firing it,
some growth or metabolic change takes place in one or both cells
such that a’s efficiency, as one of the cells firing b, is increased’’
(Hebb, 1949). Recently, however, it has been shown that synapses that are activated slightly before the
cell fires are strengthened, whereas those that are activated slightly after are weakened (Gerstner,
Kempter, Hemmen, & Wagner, 1996; Markram et al., 1997). For a given input pattern, our unsupervised
learning based on the Hebbian rule will make the fired neurons fire, keep the unfired neurons unfired,
and make all neurons fire if the model neurons are all unfired. This is realized by modifying the
synaptic parameters w_{ij} and θ_{ij} of neuron q as
\delta w_{qij} = -\eta (o_q - 0.5)(o_q - 1) \frac{\partial o_q}{\partial w_{qij}}    (22)

\delta \theta_{qij} = -\eta (o_q - 0.5)(o_q - 1) \frac{\partial o_q}{\partial \theta_{qij}}    (23)

and if all neurons are unfired,

\delta w_{qij} = -\eta (o_q - 1) \frac{\partial o_q}{\partial w_{qij}}    (24)

\delta \theta_{qij} = -\eta (o_q - 1) \frac{\partial o_q}{\partial \theta_{qij}}    (25)
where q = 1, 2, ..., Q, and Q denotes the number of neurons. Furthermore, in order to prevent a fired
model neuron from firing in response to every input pattern, we use a model neuron with an absolute
refractory period (rp) during which, once the neuron has fired, it will not fire again. In symbols:

o_q = \begin{cases} \dfrac{1}{1 + e^{-g(u_q - 0.5)}} & p = 0 \\ 0 & p > 0 \end{cases}    (26)

where p is set to a constant rp (≥ 1) for a neuron once it fires and is counted down until it reaches 0
as new input patterns are applied. The unsupervised learning can be summarized as follows:
be summarized as follows:
1. Generate an initial synaptic parameter set of the model neurons randomly. The random initial
synaptic parameters are distributed uniformly within a small range.
2. Apply an input pattern to each model neuron and use Eqs. (4), (5), and (7) to calculate the outputs
of the model neurons as the initial outputs.
3. Apply an input pattern to each neuron and use Eqs. (4), (5), (7), (21), and (26) to calculate the
outputs of the model neurons with c = −0.1.
4. If the outputs of the model neurons are all less than 0.5, use Eqs. (24) and (25) to modify the
synaptic parameters w and θ of all model neurons and go to step 6.
5. Use Eqs. (22) and (23) to change the synaptic parameters w and θ of all model neurons.
6. For the fired neurons, if p = 0, set p = rp (≥ 1) and go to step 3.
7. Set p := p − 1 and go to step 3.
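The steps above can be sketched end-to-end. The following toy Python script follows steps 1–7 for two model neurons and two input patterns; the network sizes, constants, and iteration count are illustrative choices (far smaller than the paper's simulations), and finite-difference estimates of ∂o_q/∂w and ∂o_q/∂θ stand in for the analytic chain-rule derivatives of Section 2.2.

```python
import math, random

random.seed(1)
k, mu, v, g = 5.0, 5.0, 5.0, 5.0
eta, c, rp = 0.5, -0.1, 3
n, m, Q = 2, 2, 2                     # inputs, branches, neurons (toy sizes)
patterns = [[1, 0], [0, 1]]

def fmax_of(x, Wq, Thq):
    # Eqs. (4), (5), (7): synapses, soft-min per branch, soft-max
    fmin = []
    for w_row, th_row in zip(Wq, Thq):
        ys = [1 / (1 + math.exp(-k * (w * xi - th)))
              for xi, w, th in zip(x, w_row, th_row)]
        wts = [math.exp(-mu * y) for y in ys]
        fmin.append(sum(y * wt for y, wt in zip(ys, wts)) / sum(wts))
    wts = [math.exp(v * f) for f in fmin]
    return sum(f * wt for f, wt in zip(fmin, wts)) / sum(wts)

# Step 1: random initial synaptic parameters in a small range
W  = [[[random.uniform(-1.5, 1.5) for _ in range(n)] for _ in range(m)]
      for _ in range(Q)]
Th = [[[random.uniform(-1.5, 1.5) for _ in range(n)] for _ in range(m)]
      for _ in range(Q)]
p = [0] * Q                            # refractory counters

def outputs(x):
    # Steps 2-3: lateral inhibition (Eq. (21)) and refractoriness (Eq. (26))
    f = [fmax_of(x, W[q], Th[q]) for q in range(Q)]
    o0 = [1 / (1 + math.exp(-g * (f[q] - 0.5))) if p[q] == 0 else 0.0
          for q in range(Q)]
    u = [f[q] + c * sum(o0[l] for l in range(Q) if l != q) for q in range(Q)]
    return [1 / (1 + math.exp(-g * (u[q] - 0.5))) if p[q] == 0 else 0.0
            for q in range(Q)]

for _ in range(200):
    x = random.choice(patterns)
    o = outputs(x)
    all_unfired = all(oq < 0.5 for oq in o)
    for q in range(Q):
        # Steps 4-5: update factors of Eqs. (22)-(25)
        factor = (o[q] - 1.0) if all_unfired else (o[q] - 0.5) * (o[q] - 1.0)
        for j in range(m):
            for i in range(n):
                for P in (W, Th):
                    base = outputs(x)[q]
                    P[q][j][i] += 1e-4
                    do = (outputs(x)[q] - base) / 1e-4   # finite difference
                    P[q][j][i] -= 1e-4
                    P[q][j][i] -= eta * factor * do
    # Steps 6-7: refractory period bookkeeping
    for q in range(Q):
        if o[q] >= 0.5 and p[q] == 0:
            p[q] = rp
        elif p[q] > 0:
            p[q] -= 1
```

Note the absence of any target output anywhere in the loop: the only training signals are the input patterns themselves, the lateral competition through c, and the refractory counters.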
3. Simulations
To validate the neuron model and its unsupervised learning rule, we applied the model neuron to the
multi-directional selectivity problem using the unsupervised learning algorithm. Fig. 7 shows the
two-dimensional input patterns in eight directions (→, ←, ↓, ↑, ↗, ↙, ↘, and ↖). We used the eight
patterns as the input patterns in learning, in which eight model neurons were used to discriminate the
eight input patterns. Each neuron had 10 branches containing 18 synapses each: nine from fast pathway
inputs (x_1, x_2, ..., x_9) and nine from slow pathway inputs (x'_1, x'_2, ..., x'_9) with a delay
time τ, as shown in Fig. 8. All input patterns were randomly drawn from the eight patterns and
(a) Pattern 1 (→) and Pattern 2 (←). (b) Pattern 3 (↓) and Pattern 4 (↑).
(c) Pattern 5 (↗) and Pattern 6 (↙). (d) Pattern 7 (↘) and Pattern 8 (↖).
Fig. 7. Input patterns (eight directions: →,←(a); ↓,↑(b); ↗,↙(c) and ↘,↖(d)).
Fig. 8. Network for two-dimensional movement detection, where τ denotes the external time delay.
assigned to the model neurons. The learning was repeated many times, each time with a new randomly
drawn input pattern. In the simulations, the learning parameters were set to k = 5.0, μ = 5.0, ν = 5.0,
η = 0.5, and rp = 3. We randomly initialized the synaptic parameters w and θ of the model neurons
within −1.5 < w, θ < 1.5. Fig. 9(a) shows an example of the initialized morphology of a model neuron.
An input was connected to a branch by a direct connection (•), an inverted connection, a constant-0
connection (⓪), or a constant-1 connection (①). Fig. 10 serves to indicate how the eight model neurons
learned to respond to the specific input patterns. As predicted, each model neuron learned to fire to
its specific input pattern and achieved satisfactory performance. Fig. 11 shows an example of the output
states of the model neurons in response to the input patterns after learning. The neuron began to
respond to input pattern 7 after about 8000 learning epochs. Fig. 9(b) shows the neuron's dendrite after
learning. It is very interesting to note that dendritic branches 1 and 5 had at least one constant-0
connection. Since a dendritic branch performs a logical AND operation, branches 1 and 5 can be
eliminated, corresponding to the degeneration of a dendritic branch. Further, observing branches 2 and 7
carefully, we find that they have exactly the same dendritic tree structure. This can be viewed as
redundancy among dendritic branches. Also, because a dendritic branch performs a logical AND operation,
any constant-1 connection to a dendritic branch will not affect its output. Therefore, removing all
1-constant connections to the branches, we can re-draw Fig. 9(b) as Fig. 9(c). Fig. 12(a)
Fig. 9. Example of a model neuron’s dendrite: (a) before learning, (b) after learning,
and (c) an idealized dendrite.
Fig. 10. Output states of the module for the input patterns after learning (example 1).
Fig. 11. Learning to input pattern 7 (neuron 2).
Fig. 12. Example of a neuron’s dendrite (a) and its idealized dendrite (b).
shows an example of a neuron's dendrite after learning. It can also be idealized into a dendrite as
shown in Fig. 12(b), predicting a possible dendritic tree structure. We performed simulations over
multiple trials by using different randomly generated training sequences and initial synaptic
parameters on several combinations of the input patterns. We found that two neurons learned to
discriminate any two patterns, for example one firing only to Pattern 1 (→) and the other firing only
to Pattern 2 (←), with an over 85% success rate. Even for four input patterns, such as Pattern 1 (→),
Pattern 2 (←), Pattern 3 (↓), and Pattern 4 (↑), or Pattern 5 (↗), Pattern 6 (↙), Pattern 7 (↘), and
Pattern 8 (↖), the neurons were capable of successful discrimination in more than 65% of the trials.
However, as the number of input patterns increased, the rate of successful discrimination dropped
dramatically: only about 3%–5% of the trials learned to discriminate all eight patterns successfully.
This is because our algorithm employs a type of gradient descent; that is, it can get trapped in a local
minimum. Statistical training methods may help avoid this trap, but they tend to be slow. Multiple
trials with reset training sequences or initial synaptic parameters can also help avoid this trap,
because our algorithm itself is very fast. It is worth pointing out that even in cases of failed
learning, where not all eight patterns were discriminated successfully, some neurons still discriminated
their patterns correctly. From a biological point of view, such partial success (discrimination of only
several, or a few, patterns) might be meaningful.
4. Conclusion
In this paper, we proposed an unsupervised learnable neuron model by including the nonlinear
interactions between excitation and inhibition on dendrites. The model neuron adjusts its synaptic
parameters, and thus the synapses to its dendrites, according to a generalized delta-rule-like algorithm
without any teacher signals.
The model was used to simulate directionally selective cells by unsupervised learning. In the
simulations, we randomly initialized the interaction and dendrite of the neuron and used the generalized
delta-rule-like unsupervised learning algorithm to learn the two-dimensional multidirectional
selectivity problem without external teacher signals. The simulation results showed that directionally
selective cells could be formed by unsupervised learning. In the simulations, we did not assign any
specific inputs to any specific branches, but rather randomly initialized all inputs to all branches,
and the neurons self-organized to decide how many dendritic branches were needed; branches were
enhanced if needed and eliminated if not. The simulation results also showed whether a synapse existed
and, if it did, where it was located and what type (excitatory or inhibitory) it took. Furthermore, a
single neuron or a single layer of such neurons is capable of solving complex problems such as the
binding problem and any Boolean function, particularly a linearly non-separable Boolean function. This
leads us to believe that the proposed neuron model might be considerably more powerful in computation
than the McCulloch–Pitts model, because a single unit of the McCulloch–Pitts linear model, the sigma–pi
model, or any other such nonlinear model is incapable of solving even the simple 3-bit parity problem.
Therefore, this study provides evidence that complex nonlinear functions can be acquired by the proposed
single neuron in an unsupervised learning manner through biologically plausible plasticity mechanisms.
In this study, we have also shown that synaptic plasticity mechanisms enable the nonlinear dendritic
neuron to acquire complex functionality in a back-propagation-like manner. These findings might also
lead to the development of a completely new technique for analyzing the mechanisms and principles of
neurons, dendrites, and synapses.
References
Agmon-Snir, H., Carr, C. E., & Rinzel, J. (1998). The role of dendrites in auditory
coincidence detection. Nature,393(6682), 268–272.
Anderson, J. C., Binzegger, T., Kahana, O., Martin, K. A. C., & Segev, I. (1999).
Dendritic asymmetry cannot account for directional responses of neurons in
visual cortex. Nature Neuroscience,2(9), 820–824.
Artola, A., Brocher, S., & Singer, W. (1990). Different voltage-dependent thresholds
for inducing long-term depression and long-term potentiation in slices of rat
visual cortex. Nature,347(6288), 69–72.
Bi, G. Q., & Poo, M. M. (1998). Synaptic modifications in cultured hippocampal
neurons: dependence on spike timing, synaptic strength, and postsynaptic cell
type. Journal of Neuroscience,18, 10464–10472.
Cajal, S. R. (1909). Histologie du système nerveux de l'homme et des vertébrés, Vol. 1.
Paris: Maloine.
Cazé, R. D., Humphries, M., & Gutkin, B. (2013). Passive dendrites enable single
neurons to compute linearly non-separable functions. PLoS Computational
Biology,9(2).
Crama, Y., & Hammer, P. L. (2011). Encyclopedia of mathematics and its applications:
Vol. 142. Boolean functions: theory, algorithms, and applications. Cambridge
University Press.
Destexhe, A., & Marder, E. (2004). Plasticity in single neuron and circuit
computations. Nature,431(7010), 789–795.
Dringenberg, H. C., Hamze, B., Wilson, A., Speechley, W., & Kuo, M. C. (2007).
Heterosynaptic facilitation of in vivo thalamocortical long-term potentiation
in the adult rat visual cortex by acetylcholine. Cerebral Cortex,17(4), 839–848.
Durbin, R., & Rumelhart, D. E. (1989). Product units: a computationally powerful
and biologically plausible extension to backpropagation networks. Neural
Computation,1(1), 133–142.
Euler, T., Detwiler, P. B., & Denk, W. (2002). Directionally selective calcium signals
in dendrites of starburst amacrine cells. Nature,418(6900), 845–852.
Eysel, U. T., Shevelev, I. A., Lazareva, N. A., & Sharaev, G. A. (1998). Orientation tuning
and receptive field structure in cat striate neurons during local blockade of
intracortical inhibition. Neuroscience,84(1), 25–36.
Fortier, P. A., & Bray, C. (2013). Influence of asymmetric attenuation of single and
paired dendritic inputs on summation of synaptic potentials and initiation of
action potentials. Neuroscience,236(16), 195–209.
Gerstner, W., Kempter, R., Hemmen, J. V., & Wagner, H. (1996). A neuronal learning
rule for sub-millisecond temporal coding. Nature,383(6595), 76–78.
Gu, Q. (2003). Contribution of acetylcholine to visual cortex plasticity. Neurobiology
of Learning and Memory,80(3), 291–301.
Hebb, D. O. (1949). The organization of behavior: a neuropsychological theory. New
York: Wiley.
Holtmaat, A., & Svoboda, K. (2009). Experience-dependent structural synaptic
plasticity in the mammalian brain. Nature Reviews Neuroscience,10(9),
647–658.
Jagadeesh, B. (2000). Inhibition in inferotemporal cortex: generating selectivity for
object features. Nature Neuroscience,3(8), 749–750.
Kepecs, A., Wang, X.-J., & Lisman, J. (2002). Bursting neurons signal input slope. The
Journal of Neuroscience,22(20), 9053–9062.
Koch, C. (1997). Computation and the single neuron. Nature,385(6613), 207–210.
Koch, C., Poggio, T., & Torre, V. (1983). Nonlinear interactions in a dendritic tree:
localization, timing, and role in information processing. Proceedings of the
National Academy of Sciences,80(9), 2799–2802.
Koch, C., Poggio, T., Torre, V., & Casey, H. (1982). Retinal ganglion cells: a functional
interpretation of dendritic morphology. Philosophical Transactions of the Royal
Society of London: Biological Sciences.
Larkum, M. E., Zhu, J. J., & Sakmann, B. (1999). A new cellular mechanism for
coupling inputs arriving at different cortical layers. Nature,398(6725), 338–341.
Larkum, M. E., Zhu, J. J., & Sakmann, B. (2001). Dendritic mechanisms underlying
the coupling of the dendritic with the axonal action potential initiation zone of
adult rat layer 5 pyramidal neurons. The Journal of Physiology,533(2), 447–466.
Legenstein, R., & Maass, W. (2011). Branch-specific plasticity enables self-
organization of nonlinear computation in single neurons. The Journal of
Neuroscience,31(30), 10787–10802.
Liu, G. (2004). Local structural balance and functional interaction of excitatory
and inhibitory synapses in hippocampal dendrites. Nature Neuroscience,7(4),
373–379.
London, M., & Häusser, M. (2005). Dendritic computation. Annual Review of
Neuroscience,28(1), 503–532.
Losonczy, A., Makara, J. K., & Magee, J. C. (2008). Compartmentalized dendritic
plasticity and input feature storage in neurons. Nature,452(7186), 436–441.
Magee, J. C. (2000). Dendritic integration of excitatory synaptic input. Nature
Reviews Neuroscience,1(3), 181–190.
Mainen, Z. F., & Sejnowski, T. J. (1996). Influence of dendritic structure on firing
pattern in model neocortical neurons. Nature,382(6589), 363–366.
Makara, J. K., Losonczy, A., Wen, Q., & Magee, J. C. (2009). Experience-dependent
compartmentalized dendritic plasticity in rat hippocampal CA1 pyramidal
neurons. Nature Neuroscience,12(12), 1485–1487.
Markram, H., Lübke, J., Frotscher, M., & Sakmann, B. (1997). Regulation of synaptic
efficacy by coincidence of postsynaptic APs and EPSPs. Science,275, 213–215.
McCulloch, W., & Pitts, W. (1943). A logical calculus of the ideas immanent in
nervous activity. The Bulletin of Mathematical Biophysics,5(4), 115–133.
Mel, B. W. (1990). The sigma–pi column: a model of associative learning in cerebral
neocortex. Technical Report CNS Memo 6, Computation and Neural Systems
Program, California Institute of Technology.
Ngezahayo, A., Schachner, M., & Artola, A. (2000). Synaptic activity modulates
the induction of bidirectional synaptic changes in adult mouse hippocampus.
Journal of Neuroscience,20, 2451–2458.
Poirazi, P., Brannon, T., & Mel, B. W. (2003a). Arithmetic of subthreshold synaptic
summation in a model CA1 pyramidal cell. Neuron,37(6), 977–987.
Poirazi, P., Brannon, T., & Mel, B. W. (2003b). Pyramidal neuron as two-layer neural
network. Neuron,37(6), 989–999.
Reynolds, J. N., & Wickens, J. R. (2002). Dopamine-dependent plasticity of
corticostriatal synapses. Neural Networks,15, 507–521.
Rhodes, P. A., & Llinás, R. R. (2001). Apical tuft input efficacy in layer 5 pyramidal
cells from rat visual cortex. The Journal of Physiology,536(1), 167–187.
Rumelhart, D. E., & McClelland, J. L. (1986). Parallel distributed processing:
explorations in the microstructure of cognition: foundations. MIT Press.
Schiller, J., Major, G., Koester, H. J., & Schiller, Y. (2000). NMDA spikes in basal
dendrites of cortical pyramidal neurons. Nature,404(6775), 285–289.
Schiller, J., Schiller, Y., Stuart, G., & Sakmann, B. (1997). Calcium action potentials
restricted to distal apical dendrites of rat neocortical pyramidal neurons. The
Journal of Physiology,505(3), 605–616.
Segev, I. (1998). Sound grounds for computing dendrites. Nature,393(6682),
207–208.
Sekiya, Y., Aoyama, T., Tamura, H., & Tang, Z. (2001). A neuron model that a moving
object can recognize in the planer region. In Proc. of 1st international conference
on control automation and systems (p. 149).
Sekiya, Y., Wang, Q., Aoyama, T., & Tang, Z. (2001). Learning-possibility that neu-
ron model can recognize depth-rotation in three-dimension. In Proc. of 6th
international symposium on artificial life and robotics (pp. 486–489).
Sekiya, Y., Zhu, H., Aoyama, T., & Tang, Z. (2000). Learning-possibility for neuron
model in medical superior temporal area. In Proc. of 15th Korea automatic
control conference (pp. 517–520).
Single, S., & Borst, A. (1998). Dendritic integration and its role in computing image
velocity. Science,281(5384), 1848–1850.
Sjöström, P. J., Rancz, E., Roth, A., & Häusser, M. (2008). Dendritic excitability and
synaptic plasticity. Physiological Reviews,88(2), 769–840.
Sjöström, P. J., Turrigiano, G. G., & Nelson, S. B. (2001). Rate, timing, and
cooperativity jointly determine cortical synaptic plasticity. Neuron,32(6),
1149–1164.
Spratling, M. W., & Hayes, G. M. (2000). Learning synaptic clusters for non-linear
dendritic processing. Neural Processing Letters,11(1), 17–27.
Spratling, M. W., & Johnson, M. H. (2003). Exploring the functional significance
of dendritic inhibition in cortical pyramidal cells. Neurocomputing,52–54,
389–395.
Stuart, G. J., & Häusser, M. (2001). Dendritic coincidence detection of EPSPs and
action potentials. Nature Neuroscience,4(1), 63–71.
Stuart, G., Spruston, N., & Häusser, M. (Eds.) (2008). Dendritic voltage-gated ion
channels. Oxford University Press.
Takeuchi, K. (2010). Calculate dendrites of the neuron to perceive a slope in the depth
direction (Master’s thesis). University of Toyama.
Tamura, H., Tang, Z., & Ishii, M. (2002). The neuron model consisting difference of
time of inputs and its movement direction selection function. Transactions of
the Institute of Electrical Engineers of Japan C,122-C (7), 1094–1103.
Tamura, H., Tang, Z., Okihiro, I., & Tanno, K. (1999). Directionally selective cells
have a δ-like morphology. In International symposium on nonlinear theory and
its applications (pp. 215–218).
Tang, Z., Kuratu, M., Tamura, H., Ishizuka, O., & Tanno, K. (2000). A neuron model
based on dendritic mechanism. IEICE,83, 486–498.
Tang, Z., Tamura, H., Okihiro, I., & Tanno, K. (2000). A neuron model with interaction
among synapses. Transactions of the Institute of Electrical Engineers of Japan C,
120-C(7), 1012–1019.
Taylor, W. R., He, S., Levick, W. R., & Vaney, D. I. (2000). Dendritic computation of
direction selectivity by retinal ganglion cells. Science,289(5488), 2347–2350.
Vetter, P., Roth, A., & Häusser, M. (2001). Propagation of action potentials in
dendrites depends on dendritic morphology. Journal of Neurophysiology,85(2),
926–937.
Wang, Y., & Liu, S. C. (2010). Multilayer processing of spatiotemporal spike patterns
in a neuron with active dendrites. Neural Computation,22(8), 2086–2112.
Wegener, I. (1987). The complexity of Boolean functions. Wiley–Teubner.
Wei, D. S., Mei, Y. A., Bagal, A., Kao, J. P. Y., Thompson, S. M., & Tang, C.-M. (2001).
Compartmentalized and binary behavior of terminal dendrites in hippocampal
pyramidal neurons. Science,293(5538), 2272–2275.