
Ungrounding symbols in language development: implications for modeling emergent symbolic communication in artificial systems

Authors:
Joanna Rączaszek-Leonardi
Faculty of Psychology
University of Warsaw
Warsaw, Poland
raczasze@psych.uw.edu.pl
Terrence W. Deacon
Department of Anthropology
University of California at Berkeley
Berkeley, USA
deacon@berkeley.edu
Abstract: The relation of symbolic cognition to embodied
and situated bodily dynamics remains one of the hardest
problems in the contemporary cognitive sciences. In this paper
we show that one of the possible factors contributing to this
difficulty is the way the problem is posed. Drawing on the
theoretical frameworks of cognitive semiotics, ecological
psychology and dynamical systems we point to an alternative
way of formulating the problem and show how it suggests
possible novel solutions. We illustrate the usefulness of this
theoretical change in the domain of language development and
draw conclusions for computational models of the emergence of
symbols in natural cognition and communication as well as in
artificial systems.
Keywords: Human-human and human-robot interaction and
communication; Language acquisition; Epistemological
Foundations and Philosophical Issues
I. INTRODUCTION
Despite years of theoretical, experimental and modeling
work, the relation of symbolic cognition to embodied and
situated bodily dynamics remains one of the hardest problems
of the contemporary cognitive sciences. The problem
permeates numerous cognitive domains, from most obvious
ones such as explaining human natural language processing,
interpretation of language in various domains of NLP to
automatic linguistic description of visual scenes, linguistic
control of robots’ behavior and human-robot interaction. In all
those domains, and many more, the gist of the problem is how
to relate the seemingly abstract, conventional and formal
entities, called symbols, to real physical, continuous dynamics
of action, interaction and/or interpretation.
For many years, since at least the 1970s and early 80s, the
problem was considered mainly from one specific perspective.
Namely, it was asked how symbols are endowed with
meaning, how they can refer to something other than other
symbols. In other words, the problem was posed as a “symbol
grounding problem” [1, 2, 3, 4, 5]. This manner of posing the
question brings forward some aspects of the problem while
obfuscating others, which may make important principles and
processes difficult to recognize.
First, the symbol grounding problem assumes that symbols
exist and, fashioning them on the alleged properties of
linguistic or mathematical symbols, accepts their nature as
abstract, conventional and formally related to each other.
Therefore, the very question presupposes them as ungrounded
entities. It becomes thus more urgent to ask about their
grounding than to ask about how the symbolic properties could
have come about in the first place and which principles of
cognition and which processes could have produced them.
Thus, in the “grounding problem”, the very nature of the
processes of abstraction, conventionalization and formalization
is not inquired into.
Second, the search for the meaning of signs in most
domains of the cognitive sciences seems to be limited to
“symbolic meaning”. This approach tends to overlook the fact
that non-symbolic informational structures also play a vital
role in regulating the relationships of the organism to its
environment, relationships among organisms, and, crucially,
the emergence of the symbols themselves. So the problem is
formulated as a “symbol grounding problem” and not a “sign
grounding problem.” This obfuscates the possible mediation of
non-symbolic meaningful forms in the emergence of symbolic
systems.
The goal of this paper is to change the way we pose the
problem of the relation between symbols and dynamics. This
way of formulating the question is inspired by a theoretical
framework developed by Deacon (1997) and by research on
language development, where it is particularly clear that
initially all meaningful behaviors, linguistic forms included,
appear in rich interactive, dynamical contexts. Just as any
other stimuli or gestures, they are fully and causally embedded
in rich meaningful dynamics, and, from very early on, have
power of controlling these dynamics. In such contexts it
becomes evident that the real problem is not the grounding of
symbols but rather explaining the mystery of how such an
embedded, embodied and situated use of signs can ever (at
least partially) become liberated from the immediate reliance
on the on-line events, thus, how they become, at least partially,
“ungrounded”. How can everyday interactions ever give
rise to the apparently abstract, conventional and formal
symbols? And in the process, how do they maintain their
fundamental grounding and remain causal controls on
interactive dynamics?
In what follows, Section II presents the theoretical
framework for symbol ungrounding, which uses the well-
known model of symbol emergence by Deacon [6]. In Section
III, using the domain of language development, we
demonstrate the usefulness of the model in 1) identifying the
processes that make signs meaningful in development and 2)
identifying the possible paths to ungrounding symbols. We
provide multiple examples of both processes based on
microanalyses of real parent-infant interactions [7, 8, 9],
paying particular attention to the structures present in the
social environment of the child that scaffold development.
Section IV briefly reviews some of the attempts at modeling
language evolution with grounded symbols, points out how the
models could be enriched by the microanalyses of early social
interaction and postulates general desiderata for such models.
In the Conclusions we summarize general guidelines for such
models and note that the ‘symbol ungrounding’ approach and
the reliance on microanalyses of language development data
could be particularly valuable for the community of epigenetic
robotics, which appreciates the ontogenetic processes shaping
the experience of intelligent agents, and which often
demonstrates ecological sensitivity to structures provided by
both physical and social environment.
This work was funded by the Beethoven UMO-2014/15/G/HS1/04536
grant to the first author.
2018 Joint IEEE International Conference on Development and Learning and
Epigenetic Robotics (ICDL-EpiRob)
Tokyo, Japan, September 16-20, 2018
© 2018 IEEE
II. SEMIOTIC INFRASTRUCTURE OF SYMBOLIC
REFERENCE
Deacon’s 1997 model originated as a theory of the
semiotic hierarchy that underlies symbolic reference.
Following Peirce’s semiotic theory, symbols are seen as
requiring an infrastructure of simpler semiotic relationships.
These more basic relationships include iconic signs, like
pictures, in which the sign vehicle and its referent share formal
properties; and indexical signs, like symptoms, in which the
sign vehicle and its referent are physically or habitually linked.
Because icons and indices share properties with what they
refer to, they are in this sense “grounded” signs. In contrast,
symbolic sign vehicles typically lack properties shared with
their referents and by virtue of this lack of grounding are able
to be combined and manipulated in ways that make possible
nearly unrestricted referential relationships. According to
Peirce, “Symbols grow. They come into being by development
out of other signs” [10]. Therefore, it should be possible to
trace the emergence of symbolic forms of reference from prior
icons and indices, which have more obvious isomorphic or
causal relationships with the social-pragmatic dynamics in
which they are immersed.
Deacon initially analyzed this process as it was
exemplified in a study involving two chimpanzees and a
highly simplified 6-lexigram (computer keys with arbitrary
marks on them) symbol system. In this well-known study by
Savage-Rumbaugh and colleagues [11, 12] chimps had to
learn to combine lexigrams for two food and two drink items
in specific combinations with the appropriate delivery
lexigram (glossed as ‘pour’ and ‘give’). The chimps easily
learned the indexical relation between a lexigram and the
correlated food reward, but it was particularly difficult to get
them to shift to using specific lexigram compositionality to
refer to a specific food-action relation (e.g. pour juice). Only
by foregrounding the lexigram-lexigram “agreement” rule and
systematically extinguishing all other combinations was it
possible to get the chimps to abandon simple indexicality and
pay attention to the implicit abstract iconicity between
lexigram-lexigram and food-action relations. Lexigrams that
were initially grounded indexically to individual items in the
chimps’ world, thus became (also indexically) grounded in
relation to other lexigrams. In this respect they became
“doubly grounded” [13]. See Figure 1 for details of the stages
of this process. This double grounding not only allows
single lexigrams to be used indexically, but also allows the
relations among them to be referential, i.e., meaningfully
related to the structured events of which these symbols are a part.
In this way their syntactic relationships also became endowed
with meaning. Although the original grounding of the
lexigrams was not lost, their referential function was
significantly transformed by the iconic and indexical relations
between lexigrams. This system of relations allowed them to
become partially “ungrounded” from these primary indexical
relationships so that the abstract relationships between them
could provide relevant referential clues.
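The two layers of grounding described above can be illustrated with a minimal sketch. It is a toy rendering of the six-lexigram setup, not the procedure used in [11, 12]: the item names other than "juice" and all function names are assumptions introduced here for illustration.

```python
# Toy illustration only. Item names other than "juice" are hypothetical,
# and nothing here reproduces the actual training procedure of [11, 12].
FOODS = {"banana", "bread"}
DRINKS = {"juice", "milk"}

# Each delivery lexigram "agrees" with one class of item lexigrams.
DELIVERY = {"give": FOODS, "pour": DRINKS}


def indexical_referent(lexigram):
    """First grounding: a single item lexigram points at its correlated reward."""
    return lexigram if lexigram in FOODS | DRINKS else None


def combination_is_valid(delivery, item):
    """Second grounding: a pair is acceptable only if the two lexigrams agree."""
    return item in DELIVERY.get(delivery, set())


# Enumerate every delivery-item pair and keep the ones that agree.
valid_pairs = sorted(
    (delivery, item)
    for delivery in DELIVERY
    for item in FOODS | DRINKS
    if combination_is_valid(delivery, item)
)
print(valid_pairs)
```

In this sketch each item lexigram keeps its indexical link to a reward, while the validity of a pair depends only on the relation between the two lexigrams, which mirrors the relation between the action and the item.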
This model is analogous to the situation of language
development, in that the child initially interacts with caretakers
by virtue of pragmatically grounded iconic and indexical
means and eventually uses these signs as a scaffold in the
acquisition of her first language. So by observing everyday
pre-linguistic infant-caretaker interactions, and attending to
their iconic and indexical functions, it should be possible to
discern how the infant learns to communicate with the
ungrounded sign vehicles of language. But the ungrounding
process should be generic. So the study of how an infant’s
non-symbolic use of gestures and words provides the scaffolding
upon which ungrounded symbolic communication is built
should also inform the design of systems capable of
meaningful symbolic communication.
III. SEMIOTIC INFRASTRUCTURE IN LANGUAGE
DEVELOPMENT
Language development gives us a particularly good
opportunity to study the emergence of symbols [14, 15]. Early
interactions provide especially vivid illustrations of the initial
grounding of informational forms (signs), readily available for
observations. We can thus appreciate the richness of the
interactive context, i.e. the structuring provided by the
caregiver and the infant for the events in which the first words
appear. We can thus ask about the processes going on in the
infants’ heads and, adopting the focus of ecological
psychology, we can also ask “what their heads are inside of.”
Unlike the evolution of symbolic systems, lost in the past and
difficult to study because of the lack of fossils, symbol
ungrounding in development “happens all the time”. This
gives researchers ample opportunity to study the processes
behind this semiotic development: extending from utterances
used as indices and icons controlling on-line interaction to
symbolic communications and social conventions mediated by
language.
Figure 1. Stages of emergence of symbolic reference. From [9]
A. Utterances as indices and icons in social physics
Posing the problem as symbol ungrounding suggests a
reformulation of the questions of language acquisition: instead
of asking how children ground the words they hear from adults
or how they map utterances to objects, events or states of the
world, we ask how utterances that initially function as icons
and indices come to acquire the properties of a symbolic
system. Thus in the present approach utterances do not have
(initially) the status of symbols. In fact, they do not have any
status that would make them privileged with respect to any
other actions that influence early interactions: gazes, gestures,
smiles, non-linguistic vocalizations. The characteristics of
symbols (arbitrariness, conventionality, formal structuring) are
not taken as given; rather, the genesis of those properties is
what requires explanation.
Let’s illustrate this difference in the analysis with an
example of conventional sign use by children, given by
Elizabeth Bates and her colleagues [15]. When Carlotta, a
9-month-old infant (dressed in a red sweater), raises a fist in a
combative gesture after an adult’s utterance “Compagni!”, a
bystander (as well as the researcher) is prone to interpret this
as the child’s ability to recognize and use conventions: “once a
child begins using arbitrary signals, signals that he could not
possibly have discovered without observing them in the social
world, we have particularly clear evidence that he recognizes
and uses conventions.”
A child, presumably, has formed an association between
the utterance and the gesture and the association is arbitrary,
because nothing in the sound ‘Compagni!’ naturally triggers or
resembles the gesture that follows. This, however, is a
bystander perspective, taking into account, at best, a slice of
the process of how a sign effectively changes the infant’s
behavior. If we look at how the relation came about, we note
that for a child neither the utterance nor the gesture is
arbitrary. They are causally intertwined in a social routine, in
which some actions enable predicting (and triggering) others
because they are reenacted with a particular sequencing and
timing. This is how the social world, constructed around the
infant, works: in the “social physics” reenacted for a child,
raising the fist after somebody shouted “Compagni!” is
followed by a cascade of positive events, such as smiles,
praises and being in the center of attention. And only because
of this the gesture is performed.
Careful observation of early language development, in
which language is granted a controlling and not merely a
descriptive (mapping) role, makes it more amenable to
pragmatic and ecologically valid analysis. A child “tunes-in”
to the utterances as affordances, which control individual and
collective behavior. This is how ‘words’ can become
‘messages’ in the first place, i.e., as signs sustained in their
causal roles due to reenacted social routines [16]. Tuning-in,
congruently with the tenets of ecological psychology, consists
in changes in the way utterances are perceived, as specifying
action and co-action in a social world. Before we turn to the
ungrounding process, let’s consider how early utterances are
embedded within the reenacted social environment.
Utterances may play the role of indices for specific
interactional behaviors. Examples abound in early interactions.
Mothers use vocalizations to draw the attention of a child, to
forecast events and behaviors, and to evoke particular
responses. A greeting (‘hello!’), a child’s name, an imperative
(e.g., ‘look!’), a farewell (‘bye-bye!’) are each typical examples
of utterances that first function as simple indices of subsequent
moves in interaction. It is important to note that they cannot be
usefully described as indices for ‘referents out there’ but are
rather elements of coordinative events: it is more important
that they evoke actions than images in the head of the child.
An interpretation of a sign “Grego!” is not “ah, mother is
referring to me” but rather “I have to look at my mom”. Such a
sign functions as an index in a series of events that allows
prediction of what is likely to happen next or what is expected
from the child. Soon it becomes used by the child as well, not
merely to ‘refer’ in the sense of indicating or describing the
world but rather to control the flow of interaction.
Utterances can also serve as icons: the context and the time
at which they are produced are aligned in an isomorphic way
with important dimensions of interactive events. Parents use
specific prosody, length of utterances and amplitude modulation
that are coordinated with, and thus help predict, properties of events.
A mother picks up a child with a long “oooooopppallaaa”,
coordinated with the length of the upward movement; says
‘peek-a-boo’ coordinated with the surprising suddenness of
appearance; says ‘roll, roll, roll’ in a ‘rolling’ way when turning
the infant [9]; says ‘tap, tap, tap’ when tapping on his belly.
Semiotic analysis of such interactive uses of utterances
makes it clear that linguistic forms need not be engaged in
communication only as symbols. Long before they are used as
a full-blown symbolic system (i.e. natural language) they
function as interaction coordinators and controls, managing
attention and joint attention, establishing rhythms, aiding the
partition of events, and synchronizing emotions. Two things
are crucial to note. First, because of the immersion in
multimodal on-line interactions, utterances can be quite
precisely grounded, i.e., infants are quickly tuned-in to them as
interaction controls. As Jerome Bruner says, in contrast to
corrections of grammar, “speech acts, on the contrary, get not
only immediate feedback but also correction.” [17, pp. 37-38].
Second, even though later the linguistic forms enter in
complex relations with other linguistic forms, the initial
grounding does not vanish but continues to provide for the
language’s pragmatic coordinative role, albeit transformed by
the possibility of more complex utterances.
B. Ungrounding through relations to other signs
The observation that a child’s first “words” are grounded in
this manner in the on-going interactions makes the task of
explaining how they gain symbolic properties seemingly more
difficult. The model presented above suggests that a possible
aid to “ungrounding” is through grounding in other signs.
Caregivers rarely talk in mono-words, rather providing
structured utterances. Soon the (indexical and iconic) relations
of words to other words become apparent to a child,
constituting a second kind of grounding: within the vocal
modality. This allows for reliable predictions about what
follows within an utterance besides the predictions of what
follows within a coordinative event.
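The idea that words come to predict other words within an utterance can be sketched with simple successor counts. This is only an illustration of within-modality prediction under strong assumptions: the utterances are invented, and counting adjacent pairs is a stand-in chosen here, not a claim about how infants actually learn.

```python
from collections import Counter, defaultdict

# Hypothetical caregiver utterances, invented for illustration.
UTTERANCES = [
    "look at the ball",
    "look at the dog",
    "give me the ball",
    "where is the ball",
]


def next_word_counts(utterances):
    """Count, for each word, which words follow it within an utterance."""
    counts = defaultdict(Counter)
    for utterance in utterances:
        words = utterance.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it has none."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


counts = next_word_counts(UTTERANCES)
print(predict_next(counts, "look"))  # at
print(predict_next(counts, "the"))   # ball
```

Here a word functions as an index for what follows it in the vocal stream, in addition to whatever it predicts about the coordinative event itself.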
We observe several processes that lead to privileging the
vocal modality in communication and to its emergence from
the multimodal stream of events. The structure of turn taking
(present in actions in general) is seen in vocal interactions
from very early on [18, 19]. Additionally, research shows that
mothers’ responses to infant’s vocalizations differ depending
on the quality of the vocalizations: i.e., the more an infant’s
vocalization resembles language (e.g. in its syllabic form) the
greater the probability that a) the vocalization will be
responded to and b) that the response will be language-like.
Importantly, if the mother responds with speech it is also more
probable that the next vocalization of the child will be
language-like [20]. It was also noted that a greater propensity
in mothers to respond to language-like vocalizations of infants
with verbal responses is correlated with better language
development as measured several months later [21]. These
observations demonstrate that the infant is embedded in a
highly structured behavioral and social niche, enacted by
adults, which provides semiotically grounded scaffolding
for the emergence of symbolic language.
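The mutual dependence between infant vocalizations and caregiver responses can be made concrete with a small expected-value sketch. All four probabilities below are invented for illustration and are not estimates from [20, 21]; the function name and the two-step update are likewise assumptions of this sketch.

```python
def language_like_rate(p_ll_after_speech, p_ll_after_other,
                       p_speech_given_ll, p_speech_given_other, steps=200):
    """Iterate the expected caregiver-infant loop.

    Returns the long-run probability that an infant vocalization is
    language-like, given how caregiver and infant condition on each other.
    """
    p_speech = 0.0  # probability that the previous caregiver response was speech
    p_ll = p_ll_after_other
    for _ in range(steps):
        # Infant: more likely to be language-like after a speech response.
        p_ll = p_ll_after_speech * p_speech + p_ll_after_other * (1.0 - p_speech)
        # Caregiver: more likely to answer with speech after a language-like sound.
        p_speech = p_speech_given_ll * p_ll + p_speech_given_other * (1.0 - p_ll)
    return p_ll


# Hypothetical parameters: a contingent caregiver versus one who answers
# with speech at the same low rate whatever the infant produces.
with_loop = language_like_rate(0.6, 0.3, 0.8, 0.2)
without_loop = language_like_rate(0.6, 0.3, 0.2, 0.2)
print(round(with_loop, 3), round(without_loop, 3))
```

Under these made-up numbers the contingent loop settles at a higher rate of language-like vocalizations than the non-contingent baseline, which is the qualitative pattern the cited observations describe.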
Because of caretakers’ differential responsiveness to an
infant’s language-like vocalizations, the relations among
words within utterances become more salient to the infant. At
the same time, utterances remain functional in their pragmatic
contexts. In this way, intensive multimodal interactions
continue to provide the embedding context for higher-order
relations. This is crucial: the relations themselves become
meaningful by controlling the interactive events, which also
can become more complex. As Bruner points out [17], the first
context of the use of complex two-word affordances by
children is often the request format, where the roles of actors
and objects in the real world are quite evident for a child. The
socially-causal relations between these actors and objects in
the interactive situations may aid in understanding the
relations (semantic bootstrapping), while the perception and
control of the events might, with time, be ordered by these
inter-sign relationships (syntactic bootstrapping).
Apart from regular everyday interactions, playing with
infants is also structured in a way that may aid understanding
how the relations among signs, in turn, relate to the relations
and events in interaction. Types of games, in which language
is in a pre-specified way connected to a series of events and
movements (or motions, like motions in a game) performed
with a child are particularly good examples: e.g. enumerating
while touching fingers, enacting simple narratives, when
touching and moving the baby (such as in the games “This
little finger went….”, “Questo è l’occhio bello…”, and so on).
These are events in which linguistic structures do not “map
onto” interactive structures but rather help to control/predict
them. While relations among grounded signs may lead to
simple associations resulting in generalizations [5], it is
important to note that the grounding is more comprehensive
and complex with respect to symbols since symbol-symbol
relations are themselves grounded. At the same time,
perhaps paradoxically, grounding in other utterances across
situations provides a mechanism for liberation from the
immediate context. Relations among elements of utterances
can bring attention to dimensions that might not be
immediately perceptually present. Thus, it is no wonder that the
critique of early symbolic models for their solipsism was
based on pointing out that one cannot get semantics by
grounding symbols just in other symbols [2, 3]. Grounding
symbols in other symbols cannot provide semantic grounding
because it leads to the (always partial) ungrounding of
language from the immediate context. According to Deacon’s
model, the systemic property constituted by the relations
among signs makes those signs symbols. In development,
patterns of word co-occurrences (systemicity) are provided by
the adult’s utterances and by enactments of early dialogue
around the vocalization of the child. Grounded first in the
relations among controlling events, the structures transfer the
control to novel situations.
IV. IMPLICATIONS FOR MODELING THE
EMERGENCE OF SYMBOLIC REFERENCE
The aim of this section is to formulate guidelines for
computational models of the emergence of symbols from non-
symbolic meaningful forms, i.e. for simulating the
ungrounding process. Many of the current ingenious and
successful models for clarifying aspects of symbolic
functioning have nevertheless usually been concerned with
grounding the symbolic forms. Additionally, they have often
been concerned primarily with an evolutionary timescale. This
perspective renders many critical aspects of the process of
language emergence inaccessible to observation. In the work
presented above, Deacon’s model has been applied to
understanding language emergence in development. This fills
in the theoretical frame with real-life examples. The
developmental timescale for the emergence of symbolic
communication undoubtedly differs from the evolutionary one
(the most prominent difference is, of course, the co-occurrence
of symbolic sign use in linguistic structures provided by the
adults). However, the developmental time scale makes some
elements of the model more amenable for study, and the
semiotic principles, as noted earlier, should be generic and
relevant to research on other timescales.
There is not space to review the extensive modeling work
on the emergence of symbolic communication, even though
some of it is directly relevant to the present work [4, 5, 13,
22-29]. These models are based on a variety of architectures and
diverse learning algorithms, and they aim at explaining various
aspects of the emergence of structured communication
systems. However, in general most of the models of symbol
acquisition by cognitive systems take the prior existence of
symbols for granted.
Some models remain at the purely symbolic level, without
any concern for the grounding problem. This does not mean
that they aren’t informative. Consider, for example, the study
by Smith, Brighton and Kirby who demonstrated that
compositional systems are more stable in the face of
bottlenecks in cultural transmission [30]. Other models that
include semantic aspects explain the necessity of grammar by
invoking the semantic complexity of the content conveyed by
symbols [25, 31]. Yet others ground symbolic reference more
thoroughly in the actions of agents in the environment, by
coupling symbolic functioning to evolutionary fitness [3, 4,
32] or success in on-line interactions [22, 23]. However, even
in these pragmatically oriented models, grounding is assumed
to be a mapping relation, either a simple one, from objects in
the agent’s environment to signs [32] or more complex,
mediated by generalized conceptual representations [4, 5] or
by internal structured representations of the environment or
action plans [23]. Reformulating the problem as ungrounding,
along the lines presented above, as well as capitalizing on the
controlling role of signs (including symbols) provide two
general tenets guiding the future modeling work. Glimpses of
similar approaches can already be discerned in existing studies
and will be very helpful in the elaboration of our models.
For example, in a recent model [29], agents use the
signaling of other agents to directly control their actions.
Importantly, in some scenarios, agents, besides using the
signals of others to compute their movement trajectories,
include them in the computing of their own signals “in
response”. This results in a dialogical, communicative
behavior, which may lead to cooperation. This is a very
promising direction; however, it is not clear how a signal from
the other agent differs from other aspects of the environment
that are used to compute trajectories and further signals (i.e., it
is not clear why they are called “symbols” and not “indices”),
nor how syntactic structures may emerge as used by one
agent, not only in a turn-taking mode (but see e.g. [33] on the
role of dialogicity in the emergence of grammatical structures).
Also very helpful in this context is research that capitalizes
on the relations among the signs in modeling the emergence of
symbolic reference. An example is ‘symbolic theft’ in which
grounding of abstract dimensions can be achieved by
associating the names for abstract categories with already
grounded ones [5]. This can explain the enlargement of a
symbol system, though not its emergence. But it also
demonstrates how the more concrete (in our framework iconic
and indexical) controls can become ungrounded through
selecting and creating important dimensions not obviously
present in the input, but only in other signs.
The above strands of modeling work can benefit both from
the change of perspective we propose here and from the
emphasis on language development, which makes steps in the
ungrounding process more obvious. This aids in recognizing that
being immersed in co-action with others provides the complex
semiotic infrastructure on which symbolic systems rely. The
indexical and iconic involvement of signs in the control of
interactive situations constitutes a vital part of the model.
Without accounting for the direct, Gibsonian-like involvement
of signs grounded in the social physics as controls, or enabling
constraints, symbolic reference appears unattainable.
Recognizing the “double grounding”, i.e. indexical and
iconic grounding of signs both in coactions in the world and in
other linguistic forms is another key requirement of the model.
As noted above, it is the relation between/among signs that
provides a novel form of control in pragmatic social
interaction. The fact that grammar can reduce the
computational complexity of semantic interpretation [23]
stems from the fact that grammar imposes constraints on the
relations between referents. This realization might be helpful
in the development of the models such as [29].
Epigenetic robotics seems to be a particularly good
environment for developing models of the emergence of
symbols, as guided by these principles. Robots are immersed
in some kind of structured physics, in which signs may
function as icons and indices, thus events can be predictably,
informationally connected for them. Relations among signs
reflect also these informational connections and generalize
them to other relations. Agents are immersed in their
environments as actors, therefore their primary attitude
towards reality is not the description or representation but
control. Most importantly, the environment is constituted by
other actors, thus the criteria for this control are primarily
pragmatic and coordinative. Symbolic systems emerge in
dialogical scenarios of mutual control and coordination within
joint activities. Congruently with Vygotsky: “A sign is always
originally a means used for social purposes, a means of
influencing others, and only later becomes a means of
influencing oneself.” [34, p. 157].
V. CONCLUSION
Instead of following the usual approaches to symbol
grounding (i.e., starting from ungrounded symbols and
trying to link them to dynamic events), we frame the
problem differently. We ask: how does an infant learn to
communicate with ungrounded sign vehicles (symbols) that
are amenable to conventionalization and formal relations,
beginning with only initially grounded signs? We think that an
answer to the problem of the emergence of symbols requires
answering questions about how events in interaction become
understood as icons and indices and how these become
symbols.
We employed a model proposed by Deacon [9], which
shows that one important path to developing ungrounded
symbols relies on their systemicity, i.e. grounding of signs not
only in events but also in other signs. We showed the
developmental realization of such a process, where linguistic
signs are first icons and indices in the infant’s “social physics”,
making it predictable and controllable. Subsequently, through
establishing relations to other signs the control can become
qualitatively different, guided by transmittable relations among
linguistic forms. Finally, we described what features of our
computational models are likely necessary to model the
ungrounding process. Exploring this will be the next step in
our work. Summarizing the features of such models, they
should:
Be informed by developmental processes, where the data
on coordinative processes constituting the meaning of the
utterances are readily available.
Pay attention to available patterns created by social physics
of the agents, i.e. their active involvement in complex
events. In simulated environments this could be achieved
by immersion of agents in collaborative tasks.
Allow not only for agents’ action (pragmatic goals) but
social-coordinative action, allowing for symbolic systems
to emerge the „Vygotskyan way”.
Capitalize on the physicality of signs: signs must be
physical entities with physical structure, present publicly
in the environment, and amenable to re-presentation by
the agents to each other. In this way, they can remain
causal in “social physics”.
236
Allow for signs to be predictably linked among
themselves. Symbolic signs do not just co-exist: 1)
they usually occur in systematic sequences, which, for
example, make one sign an index for another, and 2) they
co-exist as controls, grounded in events, transferring relations
between episodes of control.
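The last requirement, that signs be predictably linked among themselves, can be illustrated with a minimal sketch. It is not part of the model described in this paper: the function name, the toy utterances and the use of simple transition frequencies are all assumptions made for illustration. The sketch estimates how strongly one sign predicts the next in observed sequences, which is one simple way to quantify a sign serving as an index for another.

```python
from collections import Counter, defaultdict


def sign_to_sign_predictability(sequences):
    """Estimate P(next sign | current sign) from observed sign sequences.

    Hypothetical helper: relative frequencies of adjacent pairs stand in
    for the "systematic sequences" mentioned in the text.
    """
    pair_counts = defaultdict(Counter)
    for seq in sequences:
        # Count each adjacent (current, following) pair in the sequence.
        for current, following in zip(seq, seq[1:]):
            pair_counts[current][following] += 1
    # Normalize the counts into conditional probabilities.
    return {
        current: {
            nxt: count / sum(followers.values())
            for nxt, count in followers.items()
        }
        for current, followers in pair_counts.items()
    }


# Toy data (assumed, not taken from any corpus): short sign sequences.
utterances = [
    ["give", "ball"],
    ["give", "cup"],
    ["take", "ball"],
    ["give", "ball"],
]

probs = sign_to_sign_predictability(utterances)
print({sign: round(p, 2) for sign, p in probs["give"].items()})
# prints {'ball': 0.67, 'cup': 0.33}
```

In a fuller model these sign-to-sign statistics would have to be combined with the grounding of each sign in events. The sketch covers only the first of the two relations listed above.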
The field of epigenetic robotics (if indeed epigenetic and
indeed robotic, i.e., developmentally and pragmatically related
to the environment) seems a promising setting for
exploring the emergence and evolution of symbolic
communication. However, the field could benefit from including the
“ungrounding” process in the design of artificial systems.
Cybernetic relations between meaningful forms and the
behavioral interactive dynamics in an environment help to
demonstrate how symbols relate to dynamics and to
foreground the pragmatic aspects, which are transparent for the
participants and thus often taken for granted and difficult to
study.
REFERENCES
[1] Dreyfus, H. (1972). What Computers Can’t Do. New York: Harper and
Row.
[2] Searle, J. R. (1980). Minds, brains and programs. Behavioral and Brain
Sciences 3, 417-424.
[3] Harnad S. (1990). The Symbol Grounding Problem. Physica D 42: 335-
346
[4] Cangelosi A. (1999). Modeling the evolution of communication:
From stimulus associations to grounded symbolic associations. In
D. Floreano et al. (Eds.), Proceedings of ECAL99 European
Conference on Artificial Life, Berlin: Springer-Verlag, 654-663
[5] Cangelosi A., Greco A., & Harnad S. (2000). From robotic toil to
symbolic theft: Grounding transfer from entry-level to higher-level
categories. Connection Science, 12(2), 143-162
[6] Deacon, T. W. (1997). The Symbolic Species: The Co-evolution of
Language and the Brain. New York: W.W. Norton & Company.
[7] Nomikou, I., & Rohlfing, K. J. (2011). Language does something: Body
action and language in maternal input to three-month-olds. IEEE
Transactions on Autonomous Mental Development 3(2):113 - 128
[8] Szufnarowska J., Rohlfing K. J. (2014). Enfolding interaction with two-
month-olds. In: Proceedings of the 16th European Conference on
Developmental Psychology, Lausanne, Switzerland. Bologna: Monduzzi
Editore, 213-218.
[9] Rączaszek-Leonardi, J., Nomikou, I., Rohlfing, K. J. & Deacon, T. W.
(2018). Language Development From an Ecological Perspective:
Ecologically Valid Ways to Abstract Symbols. Ecological Psychology,
30:1, 39-73, DOI: 10.1080/10407413.2017.1410387
[10] Peirce, Charles Sanders (1931). Collected Papers of Charles Sanders
Peirce. Vol. II: Elements of Logic. C. Hartshorne and P. Weiss (Eds.).
Cambridge, MA: Harvard University Press.
[11] Savage-Rumbaugh S. & Rumbaugh D.M. (1978). Symbolization,
language, and Chimpanzees: A theoretical reevaluation on Initial
language acquisition processes in four Young Pan troglodytes. Brain
and Language, 6: 265-300.
[12] Savage-Rumbaugh, E. S., Rumbaugh, D. M., Smith, S. T., & Lawson, J.
(1980). Reference: The linguistic essential. Science, 210 (4472), 922-
925.
[13] Cangelosi, A. (2001). Evolution of communication and language using
signals, symbols and words. IEEE Transactions on Evolutionary
Computation, 5(2), 93-101.
[14] Piaget, J. (1945/1962). Play, Dreams, and Imitation in Childhood. New
York: W. W. Norton & Company. (orig: "La formation du symbole
chez l'enfant: Imitation, jeu et rêve, Image et représentation")
[15] Bates, E., with L. Benigni, I. Bretherton, L. Camaioni, & V. Volterra.
(1979). The emergence of symbols: Cognition and communication in
infancy. New York: Academic Press.
[16] Rączaszek-Leonardi, J. (2016). How does a word become a message?
An illustration on a developmental time-scale. New Ideas in
Psychology, 42, 46-55. doi:10.1016/j.newideapsych.2015.08.001
[17] Bruner, J. S. (with Watson, R.). (1983). Child's talk: Learning to use
language. Oxford, UK: Oxford University Press.
[18] Trevarthen C. (1979). Communication and cooperation in early infancy:
a description of primary intersubjectivity. In M. Bullowa (Ed.), Before
Speech: The Beginning of Interpersonal Communication.
Cambridge: Cambridge University Press, 321-347.
[19] Leonardi, G., Nomikou, I., Rohlfing, K. J. & Rączaszek-Leonardi, J.
(2016). Vocal interactions at the dawn of communication: The
emergence of mutuality and complementarity in mother-infant
interaction. In: Proceedings of the IEEE ICDL-EpiRob, Cergy-
Pontoise, pp. 288-293.
[20] Warlaumont, A. S., Richards, J. A., Gilkerson, J., & Oller, D. K. (2014).
A social feedback loop for speech development and its reduction in
autism. Psychological Science, 25(7), 1314-1324.
doi:10.1177/0956797614531023
[21] Radkowska, A., Nomikou, I., Leonardi, G., Rohlfing, K. J. &
Rączaszek-Leonardi J. (2017). Scaffolding vocal development: maternal
responsiveness to early speechlike vocalizations in three, six and eight
month olds. Poster presented at IASCL.
[22] Steels, L. (2000) The Emergence of Grammar in Communicating
Autonomous Robotic Agents. In Horn, Werner, editor, ECAI2000,
pages 764-769.
[23] Steels, L. (2005) What Triggers the Emergence of Grammar? In
AISB'05: Proceedings of the Second International Symposium on the
Emergence and Evolution of Linguistic Communication (EELC'05),
pages 143-150.
[24] Batali J. (1994). Innate biases and critical periods: Combining evolution
and learning in the acquisition of syntax. In R. Brooks & P. Maes (eds),
Artificial Life IV, Cambridge, MA: MIT Press, 160-171.
[25] Batali, J. (1998). Computational simulation of the emergence of
grammar. In J. R. Hurford, M. Studdert-Kennedy, & C. Knight (Eds.),
Approaches to the evolution of language. Cambridge, UK: Cambridge
University Press.
[26] Hutchins E. & Hazelhurst B. (1995). How to invent a lexicon: The
development of shared symbols in interaction. In N. Gilbert & R. Conte
(Eds.), Artificial societies: The computer simulation of social life.
London: UCL Press.
[27] Hashimoto, T. and T. Ikegami (1996) Emergence of net-grammar in
communicating agents. BioSystems 38 (1996) 1-14.
[28] Leijnen, S. (2012). Emerging symbols. In T. Schilhab, F. Stjernfelt & T.
Deacon (Eds.) The Symbolic Species Evolved. Biosemiotics 6. Dordrecht:
Springer, pp. 253-262.
[29] Grouchy, P., D'Eleuterio, G.M., Christiansen, M.H., & Lipson, H. (2016).
On The Evolutionary Origin of Symbolic Communication. Scientific
Reports, 6, 34615.
[30] Smith, K., Brighton, H., & Kirby, S. (2003). Complex systems in
language evolution: The cultural emergence of compositional structure.
Advances in Complex Systems, 6(4), 537-558.
[31] Schönemann, P. T. (1999). Syntax as emergent characteristic of the
evolution of semantic complexity. Minds and Machines, 9, 309-346.
[32] Cangelosi, A., Parisi D. (1998). The emergence of a "language" in an
evolving population of neural networks. Connection Science, 10(2), 83-
97.
[33] Jennings, R.E., & Thompson, J.J. (2012). The Biology of Language and the
Epigenesis of Recursive Embedding. Interaction Studies, 13(1), 80-102.
[34] Vygotsky, L. (1931/1981). The Genesis of Higher Mental Functions. In
James Wertsch, The Concept of Activity in Soviet Psychology.
Armonk, NY: M.E. Sharpe.