INVITED ARTICLE

Probabilistic dynamic logic of cognition

Evgenii E. Vityaev (a,b,*), Leonid I. Perlovsky (c), Boris Ya. Kovalerchuk (d), Stanislav O. Speransky (b)

a Sobolev Institute of Mathematics, 630090 Novosibirsk, Russia
b Novosibirsk State University, 630090 Novosibirsk, Russia
c Harvard University, Air Force Research Laboratory, USA
d Central Washington University, Ellensburg, WA 98926-7520, USA

Received 15 March 2013; received in revised form 25 June 2013; accepted 28 June 2013
KEYWORDS: Artificial intelligence; Probabilistic dynamic logic; Probabilistic inference; Neuron model; Breast cancer
Abstract

We develop an original approach to cognition based on the previously developed theory of neural modeling fields and dynamic logic. The approach rests on a detailed analysis and solution of two problems of artificial intelligence: combinatorial complexity and the synthesis of logic and probability. In this paper we interpret the theory of neural modeling fields and dynamic logic in terms of logic and probability and obtain a Probabilistic Dynamic Logic of Cognition (PDLC), which we then interpret at the neural level. As an application we consider the approximation of an expert decision-making model for breast cancer diagnosis. First we extracted this model from the expert, using an original procedure based on monotone Boolean functions. Then we applied the PDLC to learn this model from data. Because this model can be interpreted at the neural level, it may be regarded as a result of the expert's brain learning. In the last section we demonstrate that the model extracted from the expert and the model obtained by learning are in good correspondence. This shows that the PDLC may be considered a model of the cognitive learning process.

© 2013 Elsevier B.V. All rights reserved.
1. Introduction

Previously, an original approach to the simulation of cognition was developed, based on the theory of neural modeling fields (NMF) and dynamic logic (Kovalerchuk & Perlovsky, 2008; Perlovsky, 1998; Perlovsky, 2006; Perlovsky, 2007). On the one hand, this approach rests on a detailed analysis of two cognition problems of artificial intelligence: combinatorial complexity and the synthesis of logic and probability. On the other hand, it draws on psychological, philosophical and cognitive science data about the basic mechanisms of cognition. The main idea behind the success of NMF is matching the level of uncertainty of the problem/model with the level of uncertainty of the evaluation criterion used to identify the model. When a model becomes more certain,
the evaluation criterion is adjusted dynamically to match the adjusted model. This process is called the dynamic logic of model construction; it mimics processes of the mind and of natural evolution.
The analysis of these cognition problems has, in fact, a broader meaning, and overcoming them can lead to other formal characterizations of the cognition process. With this purpose, Kovalerchuk and Perlovsky (2008) generalized the theory of neural modeling fields and dynamic logic into a dynamic logic of cognition and a cognitive dynamic logic. These logics are formulated in rather general terms: relations of generality, uncertainty and simplicity; maximization of the similarity with empirical content; and a training method.

In this paper, we interpret these concepts in terms of logic and probability: uncertainty is interpreted as probability, and the process of training as a semantic probabilistic inference (Smerdov & Vityaev, 2009; Vityaev, 2006a; Vityaev, 2006b; Vityaev & Smerdov, 2009). The resulting Probabilistic Dynamic Logic of Cognition (PDLC) belongs to the field of probabilistic models of cognition (Probabilistic Inductive Logic Programming, 2008; Probabilistic models of cognition, 2006; The Probabilistic Mind, 2008). We show that this logic also solves the cognition problems (combinatorial complexity and logic and probability synthesis). Thus, via the generalization obtained in Kovalerchuk and Perlovsky (2008), we extend the interpretation of the theory of neural modeling fields and dynamic logic to probabilistic models of cognition. Probabilistic dynamic logic had previously been used for the simulation of brain activity and cognitive processes in Demin and Vityaev (2008).
2. The cognition problem from the probabilistic point of view

We now repeat and extend the description of the cognition problem for artificial intelligence stated in Perlovsky (1998) and Kovalerchuk and Perlovsky (2008). The founders of artificial intelligence in the 1950s and 1960s believed that, by relying on the rules of logic, they would soon create computers whose intelligence would far exceed human intelligence. But the application of logic to artificial intelligence did not lead to the expected results. We need to clearly distinguish the theoretical and empirical attitudes. In theories using idealized knowledge, for example in physics, geometry, chemistry and other sciences, logic and logical inference are justified and work perfectly. But intelligent systems are based on an empirical learning process, and the knowledge obtained as its result is inductive. For inductively derived knowledge, logical inference does not work well.

The brain is not a logical machine but a predictive one. However, a suitable definition of prediction for inductively derived knowledge is a problem.
The generally accepted definition of prediction belongs to Karl Popper and is based on the idea that to predict some fact it is necessary to infer it from the available facts and laws. But this definition does not work for inductively derived knowledge that comes with estimates of probability, confirmation, etc. In the logical inference of predictions it is necessary to deduce the estimates of probability, confirmation, etc. for the obtained prediction. For probability estimates there are probabilistic logic (Gabbay, Johnson, Ohlbach, & Woods, 2002) and probabilistic inductive logic programming (Raedt et al., 2008) to deal with this. But it is well known that prediction estimates may fall during logical inference, even down to zero, and predictions with zero estimates cannot be regarded as predictions. This problem is now regarded as the problem of logic and probability synthesis (Cozman et al., 2009). Five symposia were held between 2002 and 2011 under the common title Progic (Probability + Logic).
We have introduced a new concept of prediction (Smerdov & Vityaev, 2009; Vityaev, 2006b; Vityaev & Smerdov, 2009) which does not use logical inference and replaces the "true" and "false" values by probability. Instead of logical inference we introduced semantic probabilistic inference. The new definition of prediction is fundamentally different from Karl Popper's one: the prediction of some fact occurs not as a logical inference of the particular fact from the existing ones, but as a direct inductive inference of the rule that predicts the fact we are interested in. Estimates of probability strictly increase in the process of semantic probabilistic inference.
Another problem of cognition in artificial intelligence is the combinatorial complexity (CC) problem (Kovalerchuk & Perlovsky, 2008; Perlovsky, 1998). Perception associates a subset of signals corresponding to external objects with representations of these objects. This process of association-recognition-understanding turned out to be not at all easy and is connected with the notion of combinatorial complexity. Subsequent studies found a connection between CC and logic in a variety of algorithms. Logic treats every small change in data or models as a new proposition. The attribution of the truth values "true" and "false" does not allow comparing statements, and this leads to CC. In Hyafil and Rivest (1976) it is proved that even the simplest task of finding the set of propositions describing a decision tree is NP-hard.

Following Kovalerchuk and Perlovsky (2008), we introduce two order relations on propositions, relations of generality and of comparison, which are used in semantic probabilistic inference. This essentially reduces the search and, along with the use of statistical estimates, makes the search feasible and solves the CC problem.
We now recall and extend the basic definitions related to cognition (Kovalerchuk & Perlovsky, 2008; Perlovsky, 2006). We assume that the basic mechanisms of cognition include instincts, concepts, emotions and behavior. Below we explain how semantic probabilistic inference may be used in the formalization of these concepts.

Ray Jackendoff (2002) believes that the most appropriate term for the mechanism of concepts is a model, or an internal model of cognition. Concepts are models in a literal sense: within our cognitive process, they construct world objects and situations. Cognition involves a multi-leveled hierarchy of concept models: from the simplest elements of perception (lines, points) to concept models of objects, of relations between objects, and of complex situations.
The fundamental role of emotions in cognition is that they are connected with the instinct for knowledge, which maximizes a measure of similarity between the concept models and the world (Kovalerchuk & Perlovsky, 2008). This emotional mechanism turned out to be fundamental in breaking the vicious circle of combinatorial complexity. In the process of learning and understanding the input signals, the models are adapted to represent the input signals better and to increase the similarity between them. This increase of similarity satisfies the instinct for knowledge and is felt as an aesthetic emotion.

Experiments confirming the relation between emotions and the instinct for knowledge can be found in P.V. Simonov's information theory of emotions (Simonov, 1981): "Summing up the results of our experiments and the literature, we came to the conclusion... that emotion is a reflection by the human and animal brain of some actual need (its quality and magnitude) and of the probability (possibility) of its satisfaction, which the brain evaluates on the basis of genetic and previously acquired individual experience...".
The following experiment shows that the instinct for knowledge causes positive emotions (Simonov, 1981): "In our experiments we projected rows of five digits, ones and zeros, on a screen placed in front of the test subject. The test subject was warned that some of the frames contain a common feature (e.g., two zeros in a row, 00) and would be accompanied by a beep. The task was to find this feature out... Until the first hypothesis about the corroborated feature emerged (usually an erroneous one, e.g. 01), neither new frames nor beeping elicited a GSR (galvanic skin response, an emotional detector)... The emergence of a hypothesis was accompanied by a GSR... After the formation of a hypothesis two situations are possible, which we regard as the experimental models of negative and positive emotional responses... The hypothesis is wrong, and the frame... containing the corroborated feature (00 and, consequently, not confirming the hypothesis 01) does not cause a GSR. When the beeping indicates that the test subject made a mistake, a GSR is registered as a result of a mismatch between the hypothesis and the present stimulus... The test subject changes their hypothesis several times and at some point it becomes accurate. Now the very appearance of the corroborated frame causes a GSR, and its corroboration by a beep leads to even stronger galvanic skin shifts. How can we understand this effect? Indeed, in this case we see a complete match of the hypothesis... with the present stimulus. The absence of a mismatch should entail the absence of a GSR... In fact, in the latter case we also encounter a mismatch, but a mismatch of another sort than when testing the false hypothesis. The prediction formed in the process of repeated combinations contains not only the afferent model of the goal... but also the probability of achieving this goal. At the moment of the frame corroboration... predicted by a beep, the probability of solving the problem (that the hypothesis is correct) has increased, and this mismatch between the prediction and the received information led to a strong GSR".
Thus, the confirmation of a hypothesis, which causes a positive emotion, increases its credibility and, consequently, the closeness of the concept model to our world (a demonstration of the instinct for knowledge). The whole process of learning, in which a person achieves more accurate and correct actions in the real world, is supported by the emotions: positive emotions reinforce correct actions (and the corresponding correct predictions, increasing their probability), while negative emotions correct mismatches between the model and the real world (and the corresponding wrong predictions, reducing their probability).

In this view the similarity of the concept models to our world, controlled by emotions, is measured by the probability of predictions. Semantic probabilistic inference, underlying the training operator, implements a directed search for more and more probable rules by adding to the premises of the rules additional properties of the world that increase the conditional probability of prediction and, therefore, provide greater value and similarity to the world. This directed search eliminates the CC problem. The main definitions of the PDLC are presented in the next section.

In the fourth section we argue that semantic probabilistic inference may also be considered a formalization of Hebb's rule (Caporale & Dan, 2008; Hebb, 1949) and a formal model of the neuron. This gives us the possibility to interpret the PDLC at the neuronal level. We have applied the PDLC to the approximation of an expert decision-making model for the diagnosis of breast cancer (Kovalerchuk, Vityaev, & Ruiz, 2001).
3. The data, models, relation of generality, and similarity between the model and the data

We define the basic concepts of the PDLC in the propositional case. Expanded definitions in the language of first-order logic (in several different versions) can be found in Smerdov and Vityaev (2009), Vityaev (2006b) and Vityaev and Smerdov (2009).
By data we mean the standard object-feature matrix, in which the features x_1(a), ..., x_n(a) are given over the set of objects A = {a_1, ..., a_m}, where a stands for the variable over objects. For each value of a feature x_i we define an atomic proposition P_j^i(a) = (x_i(a) = x_ij), j = 1, ..., n_i, where x_i1, ..., x_in_i are all values of the feature x_i, i = 1, ..., n. We denote the set of all atomic propositions (atoms) by At and denote these atoms by Boolean variables a, b, c, .... A literal is an atomic proposition or its negation; literals are also denoted by Boolean variables a, b, c, ... (possibly with indices). The set of all literals is denoted by L.
We assume that the data are presented by an empirical (algebraic) system

Data = < A; {P_1^1, ..., P_{n_1}^1, P_1^2, ..., P_{n_2}^2, ..., P_1^k, ..., P_{n_k}^k} >,

in which the truth values of all atomic propositions over the set of objects A are defined.
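To make this representation concrete, the following sketch (a hypothetical encoding of ours, not code from the Discovery system) shows the object-feature matrix and the atoms At in Python; the feature names and values are invented.

```python
# Object-feature matrix: each object a is a mapping {feature: value}.
objects = [
    {"x1": 0, "x2": 1},
    {"x1": 1, "x2": 1},
    {"x1": 0, "x2": 0},
]

# The set of atoms At: one proposition P^i_j(a) = (x_i(a) = x_ij)
# per (feature, value) pair occurring in the data.
atoms = {(f, v) for obj in objects for f, v in obj.items()}

def holds(atom, obj):
    # Truth value of the atomic proposition on a given object.
    feature, value = atom
    return obj[feature] == value
```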
By a model we mean a Boolean function Φ with atoms from At substituted for the Boolean variables. It is known that any Boolean function can be represented by a set of rules {R} of the form

(a_1 & ... & a_k ⇒ b), a_1, ..., a_k, b ∈ L.   (1)

Therefore by a model Φ we shall mean a set of rules {R}. We denote the set of all rules of type (1) by Pr. We say that rule (1) is applicable to the data if the premise a_1, ..., a_k of the rule is true in the Data, and that the model Φ = {R} is applicable to the data if every rule of the model is applicable to the data.

For models defined as sets of rules, the combinatorial complexity problem emerges. To avoid this problem, we define a partial order on the sets of rules and models, as well as a measure of similarity between the model and the data.
Definition 1. We call the rule R_1 = (a_1^1 & a_2^1 & ... & a_{k1}^1 ⇒ c) more general than the rule R_2 = (b_1^2 & b_2^2 & ... & b_{k2}^2 ⇒ c), and denote it R_1 ⊐ R_2, if and only if {a_1^1, a_2^1, ..., a_{k1}^1} ⊂ {b_1^2, b_2^2, ..., b_{k2}^2}, k1 < k2; and no less general, R_1 ⊒ R_2, if {a_1^1, ..., a_{k1}^1} ⊆ {b_1^2, ..., b_{k2}^2}, k1 ≤ k2.
Corollary 1. R_1 ⊒ R_2 ⇒ R_1 ⊢ R_2 and R_1 ⊐ R_2 ⇒ R_1 ⊢ R_2, where ⊢ is logical inference in the propositional calculus (atoms being perceived as propositional characters).

Thus, no less general (and more general) statements are logically stronger. Moreover, more general rules are simpler, because they contain a smaller number of literals in the premise, so this relation can also be interpreted as a relation of simplicity in the sense of Kovalerchuk and Perlovsky (2008).
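A minimal sketch of the generality relation of Definition 1, under a hypothetical representation of ours in which a rule is a pair (premise, conclusion) with the premise a frozenset of literals:

```python
def more_general(r1, r2):
    # R1 more general than R2: same conclusion and a strictly
    # smaller premise set (Definition 1).
    (prem1, concl1), (prem2, concl2) = r1, r2
    return concl1 == concl2 and prem1 < prem2   # proper subset

def no_less_general(r1, r2):
    (prem1, concl1), (prem2, concl2) = r1, r2
    return concl1 == concl2 and prem1 <= prem2  # subset or equal
```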
Definition 2. We call the model U_1 = {R_i^1} no less general than the model U_2 = {R_j^2}, written U_1 ⊒ U_2, iff for every rule R_2 ∈ U_2 there exists a no less general rule R_1 ∈ U_1, R_1 ⊒ R_2; and more general, U_1 ⊐ U_2, if in addition for at least one rule R_2 ∈ U_2 there exists a more general rule R_1 ⊐ R_2, R_1 ∈ U_1\U_2.
Corollary 2. U_1 ⊒ U_2 ⇒ U_1 ⊢ U_2.

Corollary 2 implies that a more general model is logically stronger and at the same time simpler.

We define the set of propositions F as the set of propositions obtained from the atoms At by closure under the logical operations ¬, &, ∨.
Definition 3. We call a mapping μ: F -> [0, 1] a probability on the set of propositions F if it satisfies the following conditions (Halpern, 1990):

1. if ⊢ φ, then μ(φ) = 1;
2. if ⊢ ¬(φ & ψ), then μ(φ ∨ ψ) = μ(φ) + μ(ψ).

We define the conditional probability of the rule R = (a_1 & ... & a_k ⇒ c) as

μ(R) = μ(c / a_1 & ... & a_k) = μ(a_1 & ... & a_k & c) / μ(a_1 & ... & a_k), if μ(a_1 & ... & a_k) > 0.

For the case μ(a_1 & ... & a_k) = 0 the conditional probability is not defined. We denote by Pr0 the set of all rules of Pr for which the conditional probability is defined. We assume that the probability μ gives the probabilities of the events represented by propositions over the empirical system Data.
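On finite data, μ(R) can be estimated as a relative frequency over the objects. The sketch below (ours; it reuses the hypothetical object encoding above and represents a literal as a (feature, value, positive) triple) illustrates this:

```python
def lit_holds(lit, obj):
    # A literal is a (feature, value, positive) triple: the atom
    # P(a) = (x_feature(a) = value), or its negation when positive is False.
    feature, value, positive = lit
    return (obj[feature] == value) == positive

def mu(rule, objects):
    # mu(c / a1 & ... & ak): relative frequency of the conclusion among
    # the objects satisfying the premise; None when mu(premise) = 0.
    premise, conclusion = rule
    sat = [o for o in objects if all(lit_holds(a, o) for a in premise)]
    if not sat:
        return None
    return sum(lit_holds(conclusion, o) for o in sat) / len(sat)
```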
Definition 4. A probability law is a rule R ∈ Pr0 that cannot be generalized (logically strengthened) without reducing its conditional probability, i.e. for any R' ∈ Pr0, if R' ⊐ R then μ(R') < μ(R).

Probability laws are the most general, simplest and logically strongest rules among the rules with lesser or equal conditional probability. We denote the set of all probability laws by PL (Probabilistic Laws). Any rule can be generalized (simplified and logically strengthened) to a probability law without reducing its conditional probability.

Lemma 1. Every rule R ∈ Pr0 is either a probability law or there is a probability law R' ∈ PL such that R' ⊐ R and μ(R') ≥ μ(R).

Definition 5. By a probabilistic regularity model we mean a model U = {R}, R ∈ PL.

Lemma 2. Any model U is either a probabilistic regularity model or there is a no less general probabilistic regularity model U' ⊒ U.

We now define an order relation on the set of probability laws PL.
Definition 6. By the relation of probabilistic inference R_1 ⊑ R_2, R_1, R_2 ∈ PL, we mean the simultaneous fulfillment of the two inequalities R_1 ⊒ R_2 and μ(R_1) ≤ μ(R_2). If both inequalities are strict, the relation is called strict probabilistic inference: R_1 ⊏ R_2 <=> R_1 ⊐ R_2 & μ(R_1) < μ(R_2).
Definition 7. By a semantic probabilistic inference (Vityaev, 2006a; Vityaev, 2006b) we mean a maximal (non-extendable) sequence of probability laws in the relation of strict probabilistic inference, R_1 ⊏ R_2 ⊏ ... ⊏ R_k. The last probability law R_k of this inference is called the most specific.
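For illustration, with hypothetical numbers not taken from the paper: if (a_1 ⇒ c), (a_1 & a_2 ⇒ c) and (a_1 & a_2 & a_3 ⇒ c) are all probability laws with μ(c / a_1) = 0.6, μ(c / a_1 & a_2) = 0.75 and μ(c / a_1 & a_2 & a_3) = 0.9, and no further literal strictly increases the conditional probability, then (a_1 ⇒ c) ⊏ (a_1 & a_2 ⇒ c) ⊏ (a_1 & a_2 & a_3 ⇒ c) is a semantic probabilistic inference, and its last rule is the most specific law.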
We now extend the definition of semantic probabilistic inference to models and define a relation of similarity on probabilistic regularity models.

Definition 8. A probabilistic regularity model U_2 = {R_j^2} is closer to the data than a probabilistic regularity model U_1 = {R_i^1} (denoted U_1 ◁ U_2) if and only if U_1 ⊐ U_2, for every probability law R_2 ∈ U_2 there exists a probability law R_1 ∈ U_1 such that R_1 ⊑ R_2, and for at least one probability law R_2 ∈ U_2 there is a probability law R_1 ∈ U_1\U_2 with the strict relation of probabilistic inference R_1 ⊏ R_2.
This definition means that in passing from the probabilistic regularity model U_1 to the model U_2 the premises of the rules are built up in a way that (strictly) increases the conditional probabilities of these rules. The increase of the conditional probabilities of the model's rules means an increase of the model's predictive ability and of its similarity to the data.
As mentioned in the introduction, the instinct for knowledge is to "maximize a measure of similarity between the concept models and the world". In our definition the measure of similarity is given by the set of conditional probabilities of the rules of the model, i.e. by the total accuracy of the predictions of the model.

Definition 9. We call the set {μ(R), R ∈ U} of conditional probabilities of the model rules the measure of similarity between the probabilistic regularity model U = {R} and the data.
Corollary 3. If U_1 ◁ U_2, U_1 = {R_i^1}, U_2 = {R_j^2}, then the measure of similarity {μ(R), R ∈ U_2} of the model U_2 approximates the measure of similarity {μ(R), R ∈ U_1} of the model U_1 in the sense that for any probability μ(R_2) ∈ {μ(R_2), R_2 ∈ U_2} there is some μ(R_1) ≤ μ(R_2) (or a strictly smaller probability μ(R_1) < μ(R_2)), μ(R_1) ∈ {μ(R_1), R_1 ∈ U_1}.
The instinct for knowledge is a process that develops dynamically by successive approximation to the data.

Definition 10. The training operator L: U_1 -> U_2 is a transformation of a model U_1 into a model U_2 such that both models are applicable to the data, the similarity of the model to the data becomes higher, U_1 ◁ U_2, and all the maximally specific laws of the model U_1 are transferred to the model U_2.
We have developed a software system named Discovery (Kovalerchuk & Vityaev, 2000; Vityaev, 2006a) which implements precisely this training operator, by the following steps:

(1) the initial set of rules U_0 = {R} consists of the rules of type (1) whose number k of literals in the premise is not greater than some number N defined by the user;
(2) only those rules that are probability laws, U_1 = {R | R ∈ U_0, R ∈ PL}, are selected from the set U_0;
(3) thereafter the training operator L: U_i -> U_{i+1}, i = 1, 2, ..., is applied as long as possible; when no further strengthening of the model is possible, the learning process stops. The strict increase of the probability during probabilistic inference is verified by Fisher's exact test for contingency tables.

The book (Kovalerchuk & Vityaev, 2000) contains pseudo-code for this algorithm. The program has been successfully used for a wide range of applications in various domains (Kovalerchuk & Vityaev, 1998; Kovalerchuk & Vityaev, 2000; Kovalerchuk et al., 2001; Scientific Discovery website; Vityaev, 2006a).
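The following sketch (ours, not the published Discovery pseudo-code) illustrates one refinement step of such a training operator: a rule is extended by one literal at a time, and an extension is kept only when the conditional probability strictly grows and the growth passes Fisher's exact test. It reuses the hypothetical mu and lit_holds helpers sketched above and assumes scipy is available.

```python
from scipy.stats import fisher_exact

def significant_increase(premise, lit, conclusion, objects, alpha=0.05):
    # One-sided Fisher's exact test: among objects satisfying the old
    # premise, does adding `lit` raise the rate of the conclusion?
    base = [o for o in objects if all(lit_holds(a, o) for a in premise)]
    with_lit = [o for o in base if lit_holds(lit, o)]
    without = [o for o in base if not lit_holds(lit, o)]
    if not with_lit or not without:
        return False
    a = sum(lit_holds(conclusion, o) for o in with_lit)
    c = sum(lit_holds(conclusion, o) for o in without)
    table = [[a, len(with_lit) - a], [c, len(without) - c]]
    _, p = fisher_exact(table, alternative="greater")
    return p <= alpha

def refine(rule, literals, objects):
    # One training step: return the best one-literal extension whose
    # conditional probability strictly and significantly grows, or None.
    premise, conclusion = rule
    best, best_mu = None, mu(rule, objects)
    if best_mu is None:
        return None
    for lit in literals:
        if lit in premise or lit == conclusion:
            continue
        cand = (premise | {lit}, conclusion)
        m = mu(cand, objects)
        if m is not None and m > best_mu and \
           significant_increase(premise, lit, conclusion, objects):
            best, best_mu = cand, m
    return best
```

Iterating refine until it returns None for every rule of the model mimics step (3) above: the chain of accepted extensions is exactly a semantic probabilistic inference in the sense of Definition 7.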
4. Neural organization of probabilistic dynamic logic

Semantic probabilistic inference may be regarded as a formal model of the neuron (Vityaev, 2006a). Here we briefly present this formal model, so that we can illustrate the definitions introduced in the previous section at the neuronal level.
By the information coming to the "entrance" of the brain we mean all afferentation perceived by the brain. We describe the afferentation transmitted through the neuron's dendrites by monadic predicates P_j^i(a) = (x_i(a) = x_ij), j = 1, ..., n_i, where x_i(a) is some characteristic and x_ij is a value of this characteristic in the situation (on the object) a. If this afferentation arrives at an excitatory synapse, the neuron perceives it as information about the truth of the predicate P_j^i(a); if it arrives at an inhibitory synapse, the neuron perceives it as the negation ¬P_j^i(a) of the predicate.

We describe the excitation of the neuron in the situation (on the object) a, and the transfer of this excitation along its axon, by a monadic predicate P_0(a). If the neuron is inhibited in the situation a and does not transmit excitation along its axon, we describe this situation as the negation of the predicate P_0(a). It is known that each neuron has a receptive field, stimulation of which excites it unconditionally. The initial (pre-training) semantics of the predicate P_0 is the information that the neuron receives from its receptive field. In the process of learning this information is enriched.
We assume that the formation of conditional rules (relations) at the neuronal level proceeds according to Hebb's rule (Caporale & Dan, 2008; Gerstner & Kistler, 2002; Hebb, 1949) and can be formalized accurately enough by semantic probabilistic inference.

Predicates and their negations are literals and, as in the previous section, we denote them by Boolean variables a, b, c, ... ∈ L.

We define a neuron as a set {R} of rules of type (1):

(a_1 & ... & a_k ⇒ b), a_1, ..., a_k, b ∈ L,

where a_1, ..., a_k are the values (excitatory/inhibitory) of the predicates coming to the input of the neuron, and b is the value of the predicate P_0(a) that describes the output of the neuron.
We now define a method for calculating the conditional probabilities of the neuron rules (a_1 & ... & a_k ⇒ b). We count the number of cases n(a_1, ..., a_k, b) in which the event <a_1, ..., a_k, b> occurred as a simultaneous excitation/inhibition of the neuron inputs <a_1, ..., a_k> and of the neuron itself just before the reinforcement. The reinforcement can be either positive or negative. Among the cases n(a_1, ..., a_k, b) we count the number n+(a_1, ..., a_k, b) of cases in which the reinforcement was positive and the number n-(a_1, ..., a_k, b) of cases in which it was negative. Then the conditional (empirical) probability of the neuron rule (a_1 & ... & a_k ⇒ b) can be computed as

μ(b / a_1, ..., a_k) = (n+(a_1, ..., a_k, b) - n-(a_1, ..., a_k, b)) / n(a_1, ..., a_k, b).
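A minimal sketch (ours) of this estimate: reinforcement counts are accumulated per rule, and the conditional probability is computed by the formula above. The counter keys and function names are hypothetical.

```python
from collections import Counter

n_total, n_pos, n_neg = Counter(), Counter(), Counter()

def observe(inputs, output, reinforcement):
    # Record one co-activation event <a1, ..., ak, b> together with the
    # sign of the reinforcement that followed it.
    key = (frozenset(inputs), output)
    n_total[key] += 1
    (n_pos if reinforcement > 0 else n_neg)[key] += 1

def rule_probability(inputs, output):
    # mu(b / a1, ..., ak) = (n+ - n-) / n, as in the formula above.
    key = (frozenset(inputs), output)
    n = n_total[key]
    return (n_pos[key] - n_neg[key]) / n if n else None
```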
In the process of elaborating classical conditioning, the conditional signals become associated with the result. At the neuronal level classical conditioning appears in the form of neuronal plasticity (Caporale & Dan, 2008; Pfister, Barber, & Gerstner, 2003).

Furthermore, if new stimuli emerge that allow predicting the neuron's excitation even better (with higher probability), they are attached to this conditional relation. The formal attachment of new stimuli to an existing relation is determined by probabilistic inference (Definition 6), which actually means that new stimuli are added to the conditional relation if they increase the conditional probability of the prediction of the neuron's excitation.

The formalization of the process of forming conditional connections at the neuronal level is given by semantic probabilistic inference (Definition 7). Fig. 1 schematically shows a few semantic probabilistic inferences implemented by a neuron:
1. (b ⇐ a_1^1 & a_2^1) ⊏ (b ⇐ a_1^1 & a_2^1 & a_3^1 & a_4^1) ⊏ (b ⇐ a_1^1 & a_2^1 & a_3^1 & a_4^1 & a_5^1 & ... & a_k^1);
2. (b ⇐ a_1^2 & a_2^2 & a_3^2) ⊏ (b ⇐ a_1^2 & a_2^2 & a_3^2 & a_4^2 & ... & a_m^2);
3. (b ⇐ a_1^3) ⊏ (b ⇐ a_1^3 & a_2^3 & a_3^3) ⊏ (b ⇐ a_1^3 & a_2^3 & a_3^3 & a_4^3 & ... & a_n^3).
According to the definition of semantic probabilistic inference, the rules must be probability laws (Definition 4). This means that the rules include no stimuli that are not signal ones, i.e. each stimulus must increase the probability of the neuron excitation needed for the conditional relation.

Another feature of semantic probabilistic inference is the requirement that the probability of the rules increase during probabilistic inference R_1 ⊏ R_2 (Definition 6), e.g. (b ⇐ a_1^1 & a_2^1) ⊏ (b ⇐ a_1^1 & a_2^1 & a_3^1 & a_4^1). This means that the conditional relation (b ⇐ a_1^1 & a_2^1) is strengthened by the new stimuli a_3^1 & a_4^1 into the relation (b ⇐ a_1^1 & a_2^1 & a_3^1 & a_4^1) if they increase the conditional probability of the excitation of neuron b.
The set of semantic probabilistic inferences that a neuron detects in the process of learning forms its probabilistic regularity model (Definition 5), predicting the neuron excitation P_0(a).

The probabilistic regularity model of the neuron becomes closer to the data (Definition 8) if at least one of its conditional relations (one of its probability laws) becomes stronger: new stimuli were added that increase the conditional probability (a probabilistic inference was realized, Definition 6). The data in this case represent a record of the learning experience.

Similarity to the data means that the neuron responds more accurately to conditioned stimuli, i.e. with higher conditional probability. The measure of similarity between the probabilistic regularity model and the data (Definition 9) is the set of conditional probabilities of the neuron excitation over all its conditional relations.

Neuron training (by the training operator, Definition 10) is a transformation of the neuron's regularity model in which at least one of its conditional relations strictly increases. Thus, the training operator constantly strengthens the neuron's conditional relations in virtue of experience.

It is known that in the process of elaborating conditional relations the speed of impulse conduction from the conditional stimulus to the neuron's axon becomes higher, i.e. the higher the probability of a conditional relation, the higher the speed of the neuron's response to the conditional signal. This confirms that the brain responds primarily to high-probability predictions, and neurons are excited by the strongest patterns with the highest conditional probabilities.

A model of a group of neurons that expresses the expert's intuition, as in the problem of approximating the expert decision-making model considered in the following sections, is a set of such neuron models.
If we consider the task of modeling the Boolean functions of the expert decision-making model presented in the next sections, it is necessary to consider all regularity models of the neurons predicting the Boolean variables of these functions.

The theory of neural modeling fields and dynamic logic can also be represented at the neuronal level. To determine the strength of a conditional relation at the neuronal level, it uses weights attributed to the conditional stimuli, determining the strength of their impact on the neuron's excitation. During training these weights, like the probabilities, are modified.

The main similarity between dynamic logic and probabilistic dynamic logic is in their dynamics:

- in the theory of neural modeling fields, the measure of similarity changes gradually: at first approximate solutions and models are found, then the measure of similarity is refined and more accurate models are found;
- in the case of semantic probabilistic inference, at first simple rules with one or two conditional stimuli in the premise are used for prediction, which do not give a very good conditional probability (measure of similarity), and then the rules are built up by adding new specifying conditions that strengthen this conditional probability.
Fig. 1. Semantic probabilistic inference implemented by a neuron.

5. Extraction of expert decision-making models for the diagnosis of breast cancer

We have applied the developed training operator to the approximation of expert decision-making models for the diagnosis of breast cancer (Kovalerchuk et al., 2001). First, we extracted this model from the expert radiologist J. Ruiz, using a special procedure for the extraction of expert knowledge (Kovalerchuk, Triantaphyllou, Deshpande, & Vityaev, 1996; Kovalerchuk et al., 2001) based on monotone Boolean functions. Then we applied the 'Discovery' system (Kovalerchuk & Vityaev, 2000; Vityaev, 2006a), which implements the training operator, to the approximation of this model.
1. Hierarchical approach. First, we asked the expert to describe specific cases using binary features. Then we asked the expert to make a decision on each case. A typical request for the expert had the following form: "If feature 1 has value v_1, feature 2 has value v_2, ..., feature n has value v_n, is this case suspicious for cancer or not?" Each set of values v_1, v_2, ..., v_n is a possible clinical case.
It is almost impossible to ask a radiologist to provide diagnoses for thousands of possible cases. We used a hierarchical approach together with the property of monotonicity to reduce the number of questions for the expert to several tens. First, we constructed a hierarchy of clinical features. At the bottom of the hierarchy 11 binary clinical features w_1, w_2, w_3, y_1, y_2, y_3, y_4, y_5, x_3, x_4, x_5 were set (with values: 1 "suspicious for cancer", 0 "non-cancerous case"). The expert found that these 11 features could be organized into a hierarchy by introducing two generalized features: x_1, "the number and amount of calcifications", depending on the features w_1, w_2, w_3:

w_1: the number of calcifications per cm^3,
w_2: the amount of calcifications per cm^3,
w_3: the total number of calcifications;

and x_2, "the shape and density of calcifications", depending on the features y_1, y_2, y_3, y_4, y_5:

y_1: irregularity in the form of individual calcifications,
y_2: variations in the form of calcifications,
y_3: variations in the size of calcifications,
y_4: variations in the density of calcifications,
y_5: the density of calcifications.
We consider the feature x_1 as a function x_1 = g(w_1, w_2, w_3) and the feature x_2 as a function x_2 = h(y_1, y_2, y_3, y_4, y_5), both of which are to be found. The result is a decomposition of the problem in the form of the following function:

f(x_1, x_2, x_3, x_4, x_5) = f(g(w_1, w_2, w_3), h(y_1, y_2, y_3, y_4, y_5), x_3, x_4, x_5).
2. Monotonicity. In terms of the introduced features and functions we can present clinical cases as vectors of five generalized variables (x_1, x_2, x_3, x_4, x_5). Consider two clinical cases represented by the vectors (10110) and (10100). If the radiologist correctly diagnoses the case (10100) as suspicious for cancer, then by the property of monotonicity we can conclude that the case (10110) should also be suspicious for cancer, as it has more features with the value 1 conducing to cancer. The expert agreed with the assumption of monotonicity of the functions f(x_1, x_2, x_3, x_4, x_5) and h(y_1, y_2, y_3, y_4, y_5). A sketch of this monotone expansion is given below.
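A minimal sketch (ours) of the monotone expansion: once the expert answers for one case, the answer propagates to every comparable case, so no question needs to be asked about those cases.

```python
from itertools import product

def dominates(u, v):
    # u >= v componentwise: u has at least the '1' features of v.
    return all(a >= b for a, b in zip(u, v))

def expand(case, value, n=5):
    # All cases whose answer follows by monotonicity from f(case) = value:
    # a 1 propagates upward, a 0 propagates downward.
    cube = product((0, 1), repeat=n)
    if value == 1:
        return {v for v in cube if dominates(v, case)}
    return {v for v in cube if dominates(case, v)}

# The example from the text: f(10100) = 1 forces f(10110) = 1.
assert (1, 0, 1, 1, 0) in expand((1, 0, 1, 0, 0), 1)
```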
6. Extraction of expert decision-making rules

We describe an interview with the expert that uses a minimal sequence of questions to fully restore the functions f and h. These sequences are based on the fundamental Hansel lemma (Hansel, 1966; Kovalerchuk et al., 1996). We omit the mathematical details, which can be found in Kovalerchuk et al. (2001).

Consider Table 1. Columns 2 and 3 represent the values of the functions f and h that need to be restored. We omit the restoration of the function g(w_1, w_2, w_3), because it takes only a few questions. All 32 possible cases for the five Boolean variables <x_1, x_2, x_3, x_4, x_5> are presented in column 1. These cases are grouped into Hansel chains (Hansel, 1966; Kovalerchuk et al., 1996). The sequence of chains begins with the short chain #1: (01100) < (11100). The largest chain, #10, includes 6 ordered cases: (00000) < (00001) < (00011) < (00111) < (01111) < (11111). The chains are numbered from 1 to 10, and each case has its own number within the chain; for example, 1.2 is the second case in the first chain. Asterisks in columns 2 and 3 mark the responses obtained from the expert; for example, 1* for the case (01100) in column 3 means that the expert said 'Yes' ("suspicious for cancer") about this case. Responses for the other cases in columns 2 and 3 are derived automatically in virtue of the property of monotonicity. The value f(01100) = 1 for case 1.1 is extended to the cases 1.2, 6.3 and 7.3 due to monotonicity. Values of the monotone Boolean function h are computed similarly; only the attributes, for example in the sequence (10010), are interpreted as the features y_1, y_2, y_3, y_4, y_5. Hansel chains remain the same if the number of variables does not change.
Columns 4 and 5 list the cases to which the values of the functions f and h can be expanded without consulting the expert. Column 4 gives the expansion of the value 1 of the functions, and column 5 the expansion of the value 0. If the expert gives the answer f(01100) = 0 for the case (01100), this value must be expanded in column 2 to the cases 7.1 (00100) and 8.1 (01000) written in column 5; so we should not ask the expert about the cases 7.1 and 8.1, as they follow from monotonicity. The answer f(01100) = 0 cannot be extended to the case 1.2, so the expert should be asked about f(11100). If the answer is f(11100) = 0, this value is extended to the cases 5.1 and 3.1 written in column 5.

The total number of questions, marked with an asterisk (*) in columns 2 and 3, is 13 and 12 respectively. The table shows that 13 questions are needed to restore the function f(x_1, x_2, x_3, x_4, x_5) and 12 questions to restore the function h(y_1, y_2, y_3, y_4, y_5). This is only 37.5% of the 32 possible questions. The total number of questions required to restore these functions without the monotonicity condition and the hierarchy is 2 * 2^11 = 4096.
7. Decision rules (model) derived from the expert

We can find the Boolean functions f(x_1, x_2, x_3, x_4, x_5) and h(y_1, y_2, y_3, y_4, y_5) from Table 1 as follows:

1. find the lowest 'ones' (the minimal vectors with value 1) of all chains and present them in the form of conjunctions;
2. take the disjunction of the obtained conjunctions;
3. eliminate the unnecessary conjunctions (those inferable from the others).

From columns 1 and 3 we find

x_2 = h(y_1, y_2, y_3, y_4, y_5) = y_2y_3 ∨ y_2y_4 ∨ y_1y_2 ∨ y_1y_4 ∨ y_1y_3 ∨ y_2y_3y_5 ∨ y_2 ∨ y_1 ∨ y_3y_4y_5 ≡ y_1 ∨ y_2 ∨ y_3y_4y_5.
The function g(w_1, w_2, w_3) = w_2 ∨ w_1w_3 can be obtained by asking the expert 2^3 = 8 questions. From columns 1 and 2 we find

f(x) = x_2x_3 ∨ x_1x_2x_4 ∨ x_1x_2 ∨ x_1x_3x_4 ∨ x_1x_3 ∨ x_3x_4 ∨ x_3 ∨ x_2x_5 ∨ x_1x_5 ∨ x_4x_5 ≡ x_1x_2 ∨ x_3 ∨ (x_2 ∨ x_1 ∨ x_4)x_5 = (w_2 ∨ w_1w_3)(y_1 ∨ y_2 ∨ y_3y_4y_5) ∨ x_3 ∨ ((y_1 ∨ y_2 ∨ y_3y_4y_5) ∨ (w_2 ∨ w_1w_3) ∨ x_4)x_5.
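As an illustration (ours), the restored functions can be encoded directly; this assumes the reconstructed formulas above, with 0/1 used as truth values.

```python
def g(w1, w2, w3):
    return w2 or (w1 and w3)

def h(y1, y2, y3, y4, y5):
    return y1 or y2 or (y3 and y4 and y5)

def f(x1, x2, x3, x4, x5):
    return (x1 and x2) or x3 or ((x2 or x1 or x4) and x5)

def diagnosis(w1, w2, w3, y1, y2, y3, y4, y5, x3, x4, x5):
    # The full hierarchical decomposition f(g(w), h(y), x3, x4, x5).
    return f(g(w1, w2, w3), h(y1, y2, y3, y4, y5), x3, x4, x5)
```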
8. Approximation of the expert model with the training operator

To approximate the expert decision-making model we used the software system 'Discovery', which implements the training operator. Several tens of diagnostic rules approximating this expert model were discovered. They were statistically significant with respect to the statistical criterion for the selection of rules at the levels 0.01, 0.05 and 0.1. The rules were found on 156 cases (73 malignant, 77 benign, 2 suspicious for cancer and 4 with a mixed diagnosis) (Kovalerchuk et al., 2001).
The set of rules was tested by the cross-validation method. A diagnosis was obtained for 134 cases (in 22 cases no diagnosis was made). The accuracy of the diagnosis was 86%. An incorrect diagnosis was obtained in 19 cases (14% of all diagnosed cases). The type I error was 5.2% (7 malignant cases were diagnosed as benign) and the type II error was 8.9% (12 benign cases were diagnosed as malignant). Some of these rules are given in Table 2, along with their statistical significance by Fisher's test. In this table: 'NUM' is the number of calcifications per cm^3; 'VOL' is the volume in cm^3; 'TOT' is the total number of calcifications; 'DEN' is the density of calcifications; 'VAR' is the variation in the form of calcifications; 'SIZE' is the variation in the size of calcifications; 'IRR' is the irregularity in the form of calcifications; 'SHAPE' is the form of calcifications.
We considered three levels of the similarity measure for the training operator, 0.75, 0.85 and 0.95, meaning that the conditional probabilities of all probability laws of the probabilistic regularity model are greater than or equal to these values.
Table 1. The dynamic sequence of questions to the expert.

Case     f    h    Expansion of 1 (col. 4)   Expansion of 0 (col. 5)   Chain      No.
(01100)  1*   1*   1.2, 6.3, 7.3             7.1, 8.1                  Chain 1    1.1
(11100)  1    1    6.4, 7.4                  5.1, 3.1                             1.2
(01010)  0*   1*   2.2, 6.3, 8.3             6.1, 8.1                  Chain 2    2.1
(11010)  1*   1    6.4, 8.4                  3.1, 6.1                             2.2
(11000)  1*   1*   3.2                       8.1, 9.1                  Chain 3    3.1
(11001)  1    1    7.4, 8.4                  8.2, 9.2                             3.2
(10010)  0*   1*   4.2, 9.3                  6.1, 9.1                  Chain 4    4.1
(10110)  1*   1    6.4, 9.4                  6.2, 5.1                             4.2
(10100)  1*   1*   5.2                       7.1, 9.1                  Chain 5    5.1
(10101)  1    1    7.4, 9.4                  7.2, 9.2                             5.2
(00010)  0    0*   6.2, 10.3                 10.1                      Chain 6    6.1
(00110)  1*   0*   6.3, 10.4                 7.1                                  6.2
(01110)  1    1    6.4, 10.5                                                      6.3
(11110)  1    1    10.6                                                           6.4
(00100)  1*   0*   7.2, 10.4                 10.1                      Chain 7    7.1
(00101)  1    0*   7.3, 10.4                 10.2                                 7.2
(01101)  1    1*   7.4, 10.5                 8.2, 10.2                            7.3
(11101)  1    1    10.6                                                           7.4
(01000)  0    1*   8.2                       10.1                      Chain 8    8.1
(01001)  1*   1    8.3                       10.2                                 8.2
(01011)  1    1    8.4                       10.3                                 8.3
(11011)  1    1    10.6                      9.3                                  8.4
(10000)  0    1*   9.2                       10.1                      Chain 9    9.1
(10001)  1*   1    9.3                       10.2                                 9.2
(10011)  1    1    9.4                       10.3                                 9.3
(10111)  1    1    10.6                      10.4                                 9.4
(00000)  0    0    10.2                                                Chain 10   10.1
(00001)  0*   0    10.3                                                           10.2
(00011)  1*   0    10.4                                                           10.3
(00111)  1    1*   10.5                                                           10.4
(01111)  1    1    10.6                                                           10.5
(11111)  1    1                                                                   10.6

Questions: 13 (f), 12 (h)
Higher levels of the conditional probability reduce the number of rules and of diagnosed cases, but increase the accuracy of diagnosis and the similarity to the data. Forty-four statistically significant diagnostic rules were found at the criterion level (Fisher test) 0.05 with conditional probability not less than 0.75; 30 rules with conditional probability not less than 0.85; and 18 rules with conditional probability not less than 0.95. Of these, the 30 rules gave an accuracy of 90%, and the 18 rules gave an accuracy of 96.6% with only 3 cases of type II errors (3.4%).

As these results show, the required similarity to the data (0.75, 0.85 and 0.95) is less than the 86%, 90% and 96.6% accuracy obtained with the cross-validation method. This is due to Fisher's exact test, which the Discovery system uses to examine the statistical significance of the increase of the conditional probabilities in semantic probabilistic inference. This prevents overfitting of the Discovery system. Other experiments (Kovalerchuk & Vityaev, 1998; Kovalerchuk & Vityaev, 2000) also show rather high resistance of Discovery to overfitting.

As a result, the training operator managed to approximate the reported data accurately enough. This result turned out to be better than that of neural networks (Brainmaker), which gave 100% accuracy on training but only 66% on cross-validation, and of decision trees (the SIPINA software), which gave 76-82% accuracy on training.
9. Comparison of the expert model and its approximation by the training operator

To compare the model (rules) obtained with the Discovery system and the model (rules) extracted from the expert, we asked the expert to evaluate the former rules. Here are some of the rules detected by the Discovery system, with the radiologist's commentary on their compliance with his decision-making model.

IF "the total number of calcifications" is more than 30 and "volume" is greater than 5 cm^3 and "density" of calcifications is medium, THEN "suspicious for cancer". The significance of the Fisher test is 0.05; cross-validation accuracy is 100%. Radiologist's commentary: "this rule is promising but might be risky".

IF "variations in the form of calcifications" are detectable and "the number of calcifications" is between 10 and 20 and "the irregularity in the form of calcifications" is medium, THEN "suspicious for cancer". The significance of the Fisher test is 0.05; cross-validation accuracy is 100%. Radiologist's commentary: "I would trust this rule".

IF "variations in the size of calcifications" are medium and "variations in the form of calcifications" are weak and "the irregularity in the form of calcifications" is weak, THEN "benign". The significance of the Fisher test is 0.05; cross-validation accuracy is 92.86%. Radiologist's commentary: "I would trust this rule".

Thus, the training operator found rules corresponding well enough to the expert's intuition. A more detailed comparison of the discovered rules and the expert rules is given in Kovalerchuk et al. (2001).
10. Conclusion

We have interpreted the theory of neural modeling fields and dynamic logic in logical and probabilistic terms and proposed the PDLC for modeling cognitive processes. We demonstrated that the PDLC also solves the artificial intelligence problems of combinatorial complexity and of logic and probability synthesis, and we interpreted the corresponding cognitive processes at the neural level.

We applied the PDLC to the approximation of an expert decision-making model for breast cancer diagnosis. First, we extracted this model from the expert, using a special procedure based on monotone Boolean functions.
Table 2. Examples of detected diagnostic rules.

Diagnostic rule                                  Feature  f-test value  0.01  0.05  0.1  Cross-validation accuracy (%)
If 10 < NUM < 20 and VOL > 5                     NUM      0.0029        +     +     +    93.3
then malignant                                   VOL      0.0040        +     +     +
If TOT > 30 and VOL > 5 and DEN is medium        TOT      0.0229              +     +    100
then malignant                                   VOL      0.0124              +     +
                                                 DEN      0.0325              +     +
If VAR is detectable and 10 < NUM < 20           VAR      0.0044        +     +     +    100
and IRR is medium then malignant                 NUM      0.0039        +     +     +
                                                 IRR      0.0254              +     +
If SIZE is medium and SHAPE is weak              SIZE     0.0150              +     +    92.86
and IRR is weak then benign                      SHAPE    0.0114              +     +
                                                 IRR      0.0878                    +
Then we applied the PDLC, the training operator and the Discovery system to learn the decision-making model from data. Because the training operator may be interpreted at the neural level, this model may be considered a result of the expert's brain learning. In the last section we demonstrated that the model extracted from the expert (using Table 1) and the model obtained by this learning are in good correspondence. This demonstrates that the presented PDLC is in good correspondence with the cognitive learning process.
Acknowledgements

This work was supported by RFBR Grant 11-07-00560-a; by integration projects 3, 87 and 136 of the Siberian Branch of the Russian Academy of Sciences; and by the grants of the Council of the President of the Russian Federation for state support of leading scientific schools, SS-3606.2010.1.
References

Caporale, N., & Dan, Y. (2008). Spike timing-dependent plasticity: A Hebbian learning rule. Annual Review of Neuroscience, 31, 25–46.
Cozman, F. G., Haenni, R., Romeijn, J.-W., Russo, F., Wheeler, G. R., et al. (2009). Combining probability and logic. Journal of Applied Logic, 7, 131–135.
Demin, A. V., & Vityaev, E. E. (2008). Logical model of adaptive control system. Neuroinformatics, 3(1), 79–107 (in Russian).
Gabbay, D., Johnson, R., Ohlbach, H. J., & Woods, J. (Eds.). (2002). Handbook of the logic of inference and argument: The turn toward the practical. Studies in Logic and Practical Reasoning (Vol. 1). Elsevier.
Gerstner, W., & Kistler, W. M. (2002). Mathematical formulations of Hebbian learning. Biological Cybernetics, 87, 404–415.
Halpern, J. Y. (1990). An analysis of first-order logics of probability. Artificial Intelligence, 46, 311–350.
Hansel, G. (1966). Sur le nombre des fonctions Booleennes monotones de n variables. Comptes Rendus Hebdomadaires des Seances de l'Academie des Sciences, 262(20), 1088–1090 (in French).
Hebb, D. O. (1949). The organization of behavior: A neuropsychological theory. New York.
Hyafil, L., & Rivest, R. L. (1976). Constructing optimal binary decision trees is NP-complete. Information Processing Letters, 5(1), 15–17.
Jackendoff, R. (2002). Foundations of language: Brain, meaning, grammar, evolution. New York: Oxford University Press.
Kovalerchuk, B., Triantaphyllou, E., Deshpande, A., & Vityaev, E. (1996). Interactive learning of monotone Boolean functions. Information Sciences, 94(1–4), 87–118.
Kovalerchuk, B., & Vityaev, E. (1998). Discovering lawlike regularities in financial time series. Journal of Computational Intelligence in Finance, 6(3), 12–26.
Kovalerchuk, B. Y., & Vityaev, E. E. (2000). Data mining in finance: Advances in relational and hybrid methods. Kluwer Academic Publishers.
Kovalerchuk, B. Y., Vityaev, E. E., & Ruiz, J. F. (2001). Consistent and complete data and "expert" mining in medicine. In Medical data mining and knowledge discovery (pp. 238–280). Springer.
Kovalerchuk, B. Y., & Perlovsky, L. I. (2008). Dynamic logic of phenomena and cognition. IJCNN, 3530–3537.
Perlovsky, L. I. (1998). Conundrum of combinatorial complexity. IEEE Transactions on PAMI, 20(6), 666–670.
Perlovsky, L. I. (2006). Toward physics of the mind: Concepts, emotions, consciousness, and symbols. Physics of Life Reviews, 3, 23–55.
Perlovsky, L. I. (2007). Neural networks, fuzzy models and dynamic logic. In R. Kohler & A. Mehler (Eds.), Aspects of automatic text analysis (Festschrift in honor of Burghard Rieger) (pp. 363–386). Germany: Springer.
Pfister, J. P., Barber, D., & Gerstner, W. (2003). Optimal Hebbian learning: A probabilistic point of view. In Kaynak et al. (Eds.), ICANN/ICONIP, LNCS (Vol. 2714, pp. 92–98). Berlin, Heidelberg: Springer-Verlag.
Probabilistic models of cognition (2006). Special issue of Trends in Cognitive Sciences, 10(7), 287–344.
Raedt, L., Frasconi, P., Kersting, K., & Muggleton, S. H. (Eds.). (2008). Probabilistic inductive logic programming. LNAI (Vol. 4911). Berlin: Springer-Verlag.
Scientific Discovery website: <http://math.nsc.ru/AP/ScientificDiscovery>.
Simonov, P. V. (1981). Emotional brain. Moscow: Nauka (in Russian).
Smerdov, S. O., & Vityaev, E. E. (2009). Logic, probability and learning synthesis: Formalization of prediction. Siberian Mathematical Electronic Reports (Vol. 6, pp. 340–365). Novosibirsk: Sobolev Institute of Mathematics (in Russian).
The probabilistic mind: Prospects for Bayesian cognitive science (2008). N. Chater & M. Oaksford (Eds.). Oxford University Press.
Vityaev, E. E. (2006a). Knowledge discovery. Computational cognition. Cognitive process models. Novosibirsk: Novosibirsk State University Press (in Russian).
Vityaev, E. E. (2006b). The logic of prediction. In S. S. Goncharov, R. Downey, & H. Ono (Eds.), Mathematical logic in Asia 2005 (pp. 263–276). World Scientific.
Vityaev, E. E., & Smerdov, S. O. (2009). New definition of prediction without logical inference. In B. Kovalerchuk (Ed.), Proceedings of the IASTED international conference on computational intelligence (CI 2009) (pp. 48–54). Honolulu, Hawaii.
168 E.E. Vityaev et al.
... Нами разработан специальный семантический вероятностный вывод ( Vityaev 2006), который выводит максимально специфические правила. Таким образом, решается проблема статистической двусмысленности и получается подходящая формализация причинности, которая позволила получить нужные математические результаты: 1. доказано, что семантический вероятностный вывод обнаруживает причинные связи в виде максимально специфических правил, которые учитывают всю доступную информацию и предсказывают без противоречий, что решает проблему статистической двусмысленности (Vityaev 2006); 2. доказано, что неподвижные точки по максимально специфическим правилам непротиворечивы (Vityaev Martinovich 2014); 3. доказано, что неподвижные точки по максимально специфическим правилам являются вероятностным обобщением формальных понятий (Vityaev Martinovich 2014), исследуемых в анализе формальных понятий; 4. семантический вероятностный вывод может быть рассмотрен как формальная модель нейрона, а неподвижные точки как клеточные ансамбли ( Vityaev 2013Vityaev , 2015). ...
... С нашей точки зрения смысл деятельности нейронов состоит в обнаружении причинных связей. Приведем формальную модель нейрона, обнаруживающую максимально специфичные условные связи ( Vityaev et al 2013). Под информацией поступающей на «вход» мозга будем понимать всю воспринимаемую мозгом стимуляцию: мотивационную, обстановочную, пусковую, санкционирующую, обратную афферентацию о произведенных действиях, поступающую по коллатералям на «вход» и т. д. ...
Conference Paper
Full-text available
В работах П.К. Анохина показано, что принцип опережающего отражения действительности является основой всего живого. Оно основано на причинности внешнего мира. В работе показывается, что сознание есть отражение мозгом причинности внешнего мира, которое может быть пред- ставлено в виде логически непротиворечивой прогностической модели ре- альности, которая непрерывно во времени и в пространстве проверяет себя на адекватность этой реальности. Показывается связь этой модели со сле- дующими теориями: восприятия, интегрированной информации, функциональных систем и других. Приводятся компьютерные эксперименты, демонстрирующие эффективность данной модели.
... The complexity arises when one attempts to represent the task as a functor that extends across a category bound with neural activities and channels it towards a poset category, ensuring the task's categorical essence harmonizes with another category designated as the conceptual space. Associating each concept with a stochastic process, as a specific kind of random field, is consistent with the approach of dynamic logic theory (Vityaev et al., 2013). Dynamic logic attempts to connect structural knowledge as a property of logic with dynamics that can effectively model the learning process. ...
Conference Paper
Full-text available
The gap between the Mind and Brain remains a deep-rooted enigma. The mind deals with semantic information, focusing on concept formation and transfer. Conversely, the brain processes Shannon Information, emphasizing the statistical shifts in neural activities. Bridging this divide requires identifying common structures in both forms of information processing. The challenge lies in integrating mathematical models like Concept Space, used for mind-level processing, with models like Stochastic Differential Equations, employed at the brain-level. Category Theory might offer a solution, serving as a mathematical linguistic bridge between these disparate information processing realms.
... Fixed points adequately model the process of perception (Vityaev & Neupokoev, 2014). A set of causal relationships models expert knowledge (Vityaev, Perlovsky Kovalerchuk, Speransky, 2013). Therefore, the verification of this formal model for compliance with the actual processes of the brain seems to be an important task. ...
Article
The work demonstrates that the brain might reflect the causal relationships of the external world in the form of a logically consistent and prognostic model of reality, which shows up as consciousness. The paper analyses and solves the problem of statistical ambiguity and provides a formal model of causal relationships as probabilistic maximally specific rules. We suppose that the brain makes all possible inferences from causal relationships. We prove that the suggested formal model has the property of unambiguous inference: from consistent premises we infer a consistent conclusion. This enables the set of all inferences to form a consistent model of the perceived world. Causal relationships may create fixed points of cyclically inter-predictable properties. We consider the “natural” classification introduced by John Stuart Mill and demonstrate that the variety of fixed points of the objects’ attributes forms a “natural” classification of the external world. Then we consider the notions of “natural” categories and causal models of categories, introduced by Eleanor Rosch and Bob Rehder, and demonstrate that fixed points of causal relationships between the object attributes we perceive formalize these notions. If the “natural” classification describes the objects of the external world, and “natural” concepts describe the perception of these objects, then the theory of integrated information, introduced by G. Tononi, describes the information processes of the brain that form “natural” concepts reflecting the “natural” classification. We argue that integrated information provides high accuracy of object identification. A computer-based experiment is provided that illustrates the formation of fixed points for coded digits.
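As a concrete reading of the fixed-point construction, the sketch below computes the closure of a set of perceived literals under causal rules and checks it for contradictions. The rule encoding is our assumption for illustration; the paper's consistency result says precisely that closures built from maximally specific rules never reach the contradiction branch.

```python
# A sketch of the fixed-point construction: close a set of perceived
# literals under causal rules, then verify consistency.  A literal is a
# string p or a pair ('not', p); the encoding is our assumption.

def fixed_point(perceived, rules):
    """rules: iterable of (premise_set, conclusion_literal).  Returns the
    closure; raises if the closure contains a literal and its negation
    (which, per the paper, maximally specific rules provably avoid)."""
    state = set(perceived)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise <= state and conclusion not in state:
                state.add(conclusion)
                changed = True
    for lit in state:
        neg = lit[1] if isinstance(lit, tuple) else ("not", lit)
        if neg in state:
            raise ValueError(f"contradictory closure at {lit!r}")
    return state

# Cyclically inter-predictable attributes form a fixed point:
rules = [({"a"}, "b"), ({"b"}, "a"), ({"a", "b"}, "c")]
print(fixed_point({"a"}, rules))  # {'a', 'b', 'c'} (set order may vary)
```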
... Fixed points adequately model perception [19]. A set of causal relationships models expert knowledge [22]. Therefore, the verification of this formal model for compliance with the actual processes of the brain seems to be an important task. ...
Chapter
Full-text available
In previous works we analyzed and solved the problem of statistical ambiguity in the causal reflection of the outer world. We defined maximally specific causal relationships that have the property of unambiguous inference: from consistent premises we infer consistent conclusions. We suppose that the brain makes all possible inferences from causal relationships, producing a consistent model of the perceived world that shows up as consciousness. For the discovery of maximally specific causal relationships by the brain, a formal model of the neuron in line with the Hebb rule was suggested. Causal relationships may create fixed points of cyclically inter-predictable attributes. We argue that, if we consider the attributes of the outer world’s objects regardless of how we perceive them, the variety of fixed points of the objects’ attributes forms a “natural” classification of the outer world’s objects. If we consider fixed points of causal relationships between the stimuli of the objects we perceive, they form the “natural” concepts described in the cognitive sciences. And if we consider the information processes of the brain when the system of causal relationships between object stimuli produces maximum integrated information, then this system may be considered a fixed point with maximum consistency in the same sense as the entropic measure of integrated information. It was shown in other works that this model of consciousness explains purposeful behavior and perception.
... The conceptual framework above does not address the actual mechanisms of the inferences implementing the inferential processes discussed in the previous chapter. We anticipate that any of the probabilistic or fuzzy-logic frameworks [7,8,9] may be used for this, as well as novel suggestions. Among the latter, one promising approach would be to explore the applicability of "disjuncts" (as events and appearances) and "connectors" (as roles), suggested for the unsupervised learning of linguistic structures [10], to a wider scope of domains. ...
Preprint
Full-text available
In the following writing we discuss a conceptual framework for representing events and scenarios from the perspective of a novel form of causal analysis. This causal analysis is applied to the events and scenarios so as to determine measures that could be used to manage the development of the processes that they are a part of in real time. An overall terminological framework and entity-relationship model are suggested along with a specification of the functional sets involved in both reasoning and analytics. The model is considered to be a specific case of the generic problem of finding sequential series in disparate data. The specific inference and reasoning processes are identified for future implementation.
Chapter
Issues concerning the unity of minds, bodies and the world have often recurred in the history of philosophy and, more recently, in scientific models. Taking into account both the philosophical and scientific knowledge about consciousness, this book presents and discusses some theoretical guiding ideas for the science of consciousness. The authors argue that, within this interdisciplinary context, a consensus appears to be emerging assuming that the conscious mind and the functioning brain are two aspects of a complex system that interacts with the world. How can this concept of reality - one that includes the existence of consciousness - be approached both philosophically and scientifically? The Unity of Mind, Brain and World is the result of a three-year online discussion between the authors who present a diversity of perspectives, tending towards a theoretical synthesis, aimed to contribute to the insertion of this field of knowledge in the academic curriculum.
Article
The paper presents models describing higher emotions and cognition. It argues that the contents of models at the top of the mental hierarchy are higher meanings, which in a simplified way could be called meanings of life. These models are vague-fuzzy and inaccessible to consciousness. Improving these models and making steps toward even tentatively conscious forms of them relates to the highest aesthetic emotions, the emotions of the beautiful. Experimental data confirming these relations between the beautiful and higher meanings are presented.
Article
Full-text available
What is common among Newtonian mechanics, statistical physics, thermodynamics, quantum physics, the theory of relativity, astrophysics and the theory of superstrings? All these areas of physics have in common a methodology, which is discussed in the first few lines of the review. Is a physics of the mind possible? Is it possible to describe how a mind adapts in real time to changes in the physical world through a theory based on a few basic laws? From perception and elementary cognition to emotions and abstract ideas allowing high-level cognition and executive functioning, at nearly all levels of study, the mind shows variability and uncertainties. Is it possible to turn psychology and neuroscience into so-called "hard" sciences? This review discusses several established first principles for the description of mind and their mathematical formulations. A mathematical model of mind is derived from these principles. This model includes mechanisms of instincts, emotions, behavior, cognition, concepts, language, intuitions, and imagination. We clarify fundamental notions such as the opposition between the conscious and the unconscious, the knowledge instinct and aesthetic emotions, as well as humans' universal abilities for symbols and meaning. In particular, the review discusses at length the evolutionary and cognitive functions of aesthetic emotions and musical emotions. Several theoretical predictions are derived from the model, some of which have been experimentally confirmed. These empirical results are summarized and we introduce new theoretical developments. Several unsolved theoretical problems are proposed, as well as new experimental challenges for future research.
Conference Paper
The concept of the scientific paradox and the possibility of revealing and resolving such paradoxes by means of artificial intelligence are discussed. The cognitive architecture designed under the Natural-Constructive Approach for modeling the cognitive process is presented. This approach aims to interpret and reproduce human-like cognitive features, including uncertainty, individuality, intuitive and logical thinking, and the role of emotions in the cognitive process. It is shown that this architecture involves, in particular, the high-level symbolic information that could be associated with the concept of “science”. A scientific paradox is treated as the impossibility of merging different representations of the same object. It is shown that such paradoxes can be resolved within the proposed architecture by decomposing the high-level symbols into the corresponding low-level “images”, with subsequent revision of the object’s memorization procedure. This process should be accompanied by the manifestation of positive emotion (Eureka!).
Chapter
Full-text available
Fundamental mechanisms of the mind are discussed as steps to understanding music: concepts, instincts, and emotions. Aesthetic emotions are related to the knowledge instinct. The top of the mind hierarchy is analyzed: emotions of the beautiful are related to the understanding of the highest meaning and purpose.
Article
Full-text available
Predictions are very important for many Artificial Intelligence tasks and systems, such as expert systems, decision support systems, control systems and robotics. But the notion of prediction encounters some deep problems that are yet to be solved. We consider Deductive-Nomological (DN) and Inductive-Statistical (IS) explanations/predictions. DN explanations/predictions are treated as predictions in accordance with 'The Logic of Scientific Discovery' by K. Popper [1]. According to this work, we cannot apply DN explanations/predictions to inductively obtained knowledge. We argue that the logical inference of predictions from inductively obtained knowledge induces problems relating to the synthesis of probability and logic. To avoid these complications we propose an inductive inference of predictions without logical inference. We define an inductive inference of predictions (Semantic Probabilistic Inference (SPI)) and a p-prediction. For any literal A, p-prediction inductively infers a rule that predicts this literal with an estimation no less than the corresponding estimations obtained by probabilistic logic or probabilistic logic programming. Moreover, we prove that inductively inferred rules possess many important properties: for example, predictions based on these rules are free from the problem of statistical ambiguity. Finally, we mention the program system 'Discovery', implementing SPI, which was successfully applied to the solution of many practical tasks (see www.math.nsc.ru/AP/ScientificDiscovery).
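The statistical ambiguity mentioned in the abstract is easy to reproduce on toy data. The sketch below is our illustration (not the paper's data or code): two well-supported rules predict contradictory conclusions for the same case, and only the maximally specific rule, which uses all available information, resolves the conflict.

```python
# Our toy reproduction of statistical ambiguity (not the paper's data):
# two well-supported rules predict contradictory outcomes for the same case.

def p(data, premise, target):
    """Conditional relative frequency of target given the premise."""
    rows = [r for r in data if all(r[k] == v for k, v in premise.items())]
    return sum(r[target[0]] == target[1] for r in rows) / len(rows) if rows else 0.0

data = ([{"strep": True, "resistant": False, "cured": True}]  * 15
      + [{"strep": True, "resistant": False, "cured": False}] * 1
      + [{"strep": True, "resistant": True,  "cured": False}] * 3
      + [{"strep": True, "resistant": True,  "cured": True}]  * 1)

print(p(data, {"strep": True}, ("cured", True)))                      # 0.8
print(p(data, {"strep": True, "resistant": True}, ("cured", False)))  # 0.75
# For a resistant patient both rules apply but predict opposite outcomes;
# the maximally specific rule (the second one) takes the resistance into
# account, so only its prediction is licensed -- no ambiguity remains.
```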
Article
Full-text available
The presented paper is devoted to the question of prediction formalized in probabilistic and logical terms. The aim of the investigation is to examine different methods, such as those based on SLD inference, and an alternative semantic approach. Prediction is introduced as a statement of an abductive sort attained by inductive schemes. One of the significant problems concerns the uncontrolled decrease of confidence estimations for regularities obtained during an inference process organized by analogy with syntactic logical systems. The suggested semantic approach generalizes the notion of inference and reveals essential advantages in many respects without assuming rather strong constraints. In particular, a special set of probabilistic laws is synthesized inductively; this collection has an optimal ability to predict (in the context of the available data). The semantic definition of prediction leads us to a new paradigm, where deduction is replaced with a computability concept: it raises the conditional probability during the steps of inference (in contrast to SLD) and also maximally specifies the resulting prediction rule. Moreover, we prove that the probabilistic estimations obtained by semantic predictions are greater than or equal to those obtained by the corresponding SLD-analogous systems. In conclusion, practical applications are discussed.
Article
Full-text available
The ultimate purpose of many medical data mining systems is to create formalized knowledge for a computer-aided diagnostic system, which can, in turn, provide a second diagnostic opinion. Such systems should be as consistent and complete as possible. The system is consistent if it is free of contradictions (between rules in the computer-aided diagnostic system, rules used by an experienced medical expert, and a database of pathologically confirmed cases). The system is complete if it is able to cover (classify) all (or the largest possible number of) combinations of the used attributes. A method for discovering a consistent and complete set of diagnostic rules is presented in this chapter. Advantages of the method are shown for the development of a breast cancer computer-aided diagnostic system.
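A minimal sketch of the two checks named in the abstract is given below, under the assumption that rules are predicates over Boolean attribute vectors; the encoding and names are ours, not those of the actual computer-aided diagnostic system.

```python
# Our sketch of the consistency/completeness audit for a rule set over n
# Boolean attributes; rule encoding and names are assumptions, not the
# actual CAD system.

from itertools import product

def audit(rules, n_attrs):
    """rules: list of (predicate, diagnosis) pairs, where a predicate maps a
    tuple of n_attrs Booleans to True/False.  Returns the attribute
    combinations on which the rules contradict each other or stay silent."""
    contradictions, uncovered = [], []
    for case in product((False, True), repeat=n_attrs):
        fired = {dx for pred, dx in rules if pred(case)}
        if len(fired) > 1:
            contradictions.append(case)  # two rules give different diagnoses
        elif not fired:
            uncovered.append(case)       # no rule classifies this case
    return contradictions, uncovered

rules = [(lambda c: c[0] and c[1], "suspicious"),
         (lambda c: not c[0], "benign")]
bad, silent = audit(rules, 2)
print(bad)     # [] -- the two rules never disagree
print(silent)  # [(True, False)] -- one combination is left unclassified
```

Enumerating all 2^n attribute combinations is feasible only for small n; a practical audit would restrict the enumeration to observed or clinically plausible cases.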
Book
The rational analysis method, first proposed by John R. Anderson, has been enormously influential in helping us understand high-level cognitive processes. The Probabilistic Mind is a follow-up to the influential and highly cited 'Rational Models of Cognition' (OUP, 1998). It brings together developments in understanding how, and how far, high-level cognitive processes can be understood in rational terms, particularly using probabilistic Bayesian methods. It synthesizes and evaluates the progress of the past decade, taking into account developments in Bayesian statistics, statistical analysis of the cognitive 'environment', and a variety of theoretical and experimental lines of research. The scope of the book is broad, covering important recent work in reasoning, decision making, categorization, and memory. Including chapters from many of the leading figures in this field, The Probabilistic Mind will be valuable for psychologists and philosophers interested in cognition.