FIGURE 3 - available via license: CC BY
The expansion of a multimodal categorizer from personal to interpersonal: (A) shows a generative model of a personal multimodal categorizer between haptics and vision, and (B) shows a generative model of an interpersonal multimodal categorizer between the agents. Dashed lines in (B) show communication between agents. The parameters of these models are simplified.


Source publication
Article
This study focuses on category formation for individual agents and the dynamics of symbol emergence in a multi-agent system through semiotic communication. In this study, semiotic communication refers to exchanging signs composed of a signifier (i.e., words) and a signified (i.e., categories). We define the generation and interpretation of...

Contexts in source publication

Context 1
... modeled the symbol emergence in a multi-agent system and the category formation in individual agents as a generative model by expanding a personal multimodal categorizer (see Figure 3A) to an interpersonal multimodal categorizer (see Figure 3B). First, (A) shows a personal multimodal categorizer, which is a generative model with an integrated category c as a latent variable and sensor information from haptics and vision as observations o_h and o_v. ...
Context 3
... (B) shows an interpersonal multimodal categorizer in which two agents are modeled as a collective intelligence, with word w as a latent variable, and sensor information from agents A and B as observations o_A and o_B. As shown in Figure 3, the model generating observations through categories on each sensor from an integrated concept in an agent can be extended to a model generating observations through categories on each agent from a word in a multi-agent system. Figure 3A represents a graphical model for multimodal categorization based on a probabilistic generative model (e.g., Nakamura et al., 2014). ...
Context 4
... shown in Figure 3, the model generating observations through categories on each sensor from an integrated concept in an agent can be extended to a model generating observations through categories on each agent from a word in a multi-agent system. Figure 3A represents a graphical model for multimodal categorization based on a probabilistic generative model (e.g., Nakamura et al., 2014). It can integrate multimodal information, e.g., haptic and visual information, and form categories. The category index is represented by c in this figure. ...
Context 5
... 4 shows the graphical model is a single graphical model. However, following the SERKET framework (see Figure 3), it can be owned by two different agents separately. The right and left parts indicated with a dashed line in Figure 4 show the parts owned by agents A and B, respectively. ...
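
As a rough illustration of the two generative models in Figure 3 described in the contexts above, the following Python sketch samples from (A) a personal categorizer, in which an integrated category c generates a haptic and a visual observation, and (B) an interpersonal categorizer, in which a word w generates each agent's category and then that agent's observation. Everything concrete here is an assumption made for illustration: the discrete observations, the probability tables, and the function names are hypothetical and are not taken from the source publication, whose caption itself notes that the model parameters are simplified.

```python
import random


def sample_index(probs, rng):
    """Draw an index from a categorical distribution given as a list of probabilities."""
    return rng.choices(range(len(probs)), weights=probs)[0]


def sample_personal(p_c, p_oh_given_c, p_ov_given_c, rng):
    """Figure 3A (simplified): category c -> haptic o_h and visual o_v."""
    c = sample_index(p_c, rng)
    o_h = sample_index(p_oh_given_c[c], rng)  # haptic observation given c
    o_v = sample_index(p_ov_given_c[c], rng)  # visual observation given c
    return c, o_h, o_v


def sample_interpersonal(p_w, p_c_given_w, p_o_given_c, rng):
    """Figure 3B (simplified): word w -> per-agent category -> per-agent observation.

    p_c_given_w and p_o_given_c are dicts keyed by the agent names "A" and "B"
    (hypothetical layout), each holding that agent's conditional tables.
    """
    w = sample_index(p_w, rng)
    sample = {"w": w}
    for agent in ("A", "B"):
        c = sample_index(p_c_given_w[agent][w], rng)  # agent's category given w
        o = sample_index(p_o_given_c[agent][c], rng)  # agent's observation given c
        sample[agent] = (c, o)
    return sample
```

The structural parallel is the point of the sketch: replacing "sensor" with "agent" and "integrated category" with "word" turns the first sampler into the second.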

Similar publications

Article
Natural languages vary in their quantity expressions, but the variation seems to be constrained by general properties, so-called universals. Their explanations have been sought among constraints of human cognition, communication, complexity, and pragmatics. In this article, we apply a state-of-the-art language coordination model to the semantic dom...

Citations

... Several models, such as the referential signaling game (Lewis, 2008) and naming game (Steels and Loetzsch, 2012), have explored EmCom, utilizing feedback mechanisms to refine coordination and vocabulary. In contrast, a recent approach called the Metropolis-Hastings (MH) naming game does not rely on explicit feedback, but rather on a principle of joint attention where both agents focus on the same observation (Hagiwara et al., 2019). This principle is hypothesized to be critical in the developmental stages of human infants around nine to 15 months and is theorized to facilitate significant advancements in lexical acquisition and language development (Tomasello and Farrar, 1986; Carpenter et al., 1998). ...
... In the context of symbol emergence systems that employ DGM with multimodal data, three critical questions emerge that are yet to be addressed in previous works (Hagiwara et al., 2019; 2022; Taniguchi et al., 2023): ...
Article
Deep generative models (DGM) are increasingly employed in emergent communication systems. However, their application in multimodal data contexts is limited. This study proposes a novel model that combines multimodal DGM with the Metropolis-Hastings (MH) naming game, enabling two agents to focus jointly on a shared subject and develop common vocabularies. The model is shown to handle multimodal data, even in cases of missing modalities. Integrating the MH naming game with multimodal variational autoencoders (VAE) allows agents to form perceptual categories and exchange signs within multimodal contexts. Moreover, fine-tuning the weight ratio to favor a modality that the model could learn and categorize more readily improved communication. Our evaluation of three multimodal approaches, namely mixture-of-experts (MoE), product-of-experts (PoE), and mixture-of-product-of-experts (MoPoE), suggests an impact on the creation of latent spaces, the internal representations of agents. Our results from experiments with the MNIST + SVHN and Multimodal165 datasets indicate that combining the Gaussian mixture model (GMM), PoE multimodal VAE, and MH naming game substantially improved information sharing, knowledge formation, and data reconstruction.
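
To make the difference between the expert-combination schemes named above more concrete, here is a minimal sketch for one-dimensional Gaussian experts. It is not the model from this abstract: the function names are hypothetical, and actual multimodal VAEs apply such rules per latent dimension to encoder outputs, often with an additional prior expert. The sketch relies only on two standard facts: a product of Gaussian densities is Gaussian with summed precisions, and a uniform mixture keeps every component, so it is summarized here only by its overall mean and variance.

```python
def product_of_gaussian_experts(means, variances):
    """PoE: multiply Gaussian densities; precisions add, means are precision-weighted."""
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    mean = sum(m * p for m, p in zip(means, precisions)) / total_precision
    return mean, 1.0 / total_precision


def mixture_of_gaussian_experts_moments(means, variances):
    """MoE with uniform weights: return the mean and variance of the mixture.

    The mixture itself is generally not Gaussian; these are only its first two moments.
    """
    n = len(means)
    mean = sum(means) / n
    second_moment = sum(v + m * m for m, v in zip(means, variances)) / n
    return mean, second_moment - mean * mean
```

With two unit-variance experts centered at 0 and 2, the product is sharper than either expert, while the mixture is broader than either; this is one intuition for why the choice of scheme could shape the latent space.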
... In this study, our objective is to investigate whether the MHNG, which models symbol emergence as a decentralized Bayesian inference (Hagiwara et al., 2019; Taniguchi et al., 2023), can serve as a valid explanatory principle of symbol emergence between human individuals. MHNG involves computational agents playing the JA-NG, where agents independently form categories of objects and name them while assuming joint attention. ...
... We performed a communication experiment with human participants. The communication structure in the experiment resembled that of the JA-NG in a simulation experiment conducted by Hagiwara et al. (2019). We observed the acceptance or rejection assessments of participants and tested whether they used the acceptance probability calculated by the MHNG theory to a certain extent. ...
... A PGM can be decomposed into two parts corresponding to the two agents using the Neuro-SERKET framework (Taniguchi et al., 2020) in the inference process. Hagiwara et al. (2019) found that a certain type of language game can be regarded as a decentralized inference process for an Inter-PGM, and Taniguchi et al. (2023) formulated this idea as the MHNG. ...
Article
We explore the emergence of symbols during interactions between individuals through an experimental semiotic study. Previous studies have investigated how humans organize symbol systems through communication using artificially designed subjective experiments. In this study, we focused on a joint-attention-naming game (JA-NG) in which participants independently categorized objects and assigned names while assuming their joint attention. In the Metropolis-Hastings naming game (MHNG) theory, listeners accept provided names according to the acceptance probability computed using the Metropolis-Hastings (MH) algorithm. The MHNG theory suggests that symbols emerge as an approximate decentralized Bayesian inference of signs, which is represented as a shared prior variable if the conditions of the MHNG are satisfied. This study examines whether human participants exhibit behavior consistent with the MHNG theory when playing the JA-NG. By comparing human acceptance decisions of a partner's naming with acceptance probabilities computed in the MHNG, we tested whether human behavior is consistent with the MHNG theory. The main contributions of this study are twofold. First, we reject the null hypothesis that humans make acceptance judgments with a constant probability, regardless of the acceptance probability calculated by the MH algorithm. The results of this study show that the model with acceptance probability computed by the MH algorithm predicts human behavior significantly better than the model with a constant probability of acceptance. Second, the MH-based model predicted human acceptance/rejection behavior more accurately than four other models (i.e., Constant, Numerator, Subtraction, Binary). Among the models compared, the model using the MH algorithm, which is the only model with the mathematical support of decentralized Bayesian inference, predicted human behavior most accurately, suggesting that symbol emergence in the JA-NG can be explained by the MHNG.
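
The acceptance rule described above can be illustrated with a toy discrete model. The sketch below assumes, for illustration only, a small set of candidate names, a shared prior P(w), and a fixed likelihood table P(x | w) for each agent; all function names and numbers are hypothetical. If the speaker proposes a name from its own posterior P(w | x_Sp), the MH acceptance ratio for the joint posterior P(w | x_Sp, x_Li) reduces to a ratio of the listener's own likelihoods, r = min(1, P(x_Li | w_proposed) / P(x_Li | w_current)), so the listener never needs access to the speaker's internal state.

```python
import random


def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def acceptance_probability(listener_lik, proposed, current):
    """r = min(1, P(x_Li | proposed) / P(x_Li | current))."""
    if listener_lik[current] == 0.0:
        return 1.0  # current name is impossible for the listener: always move
    return min(1.0, listener_lik[proposed] / listener_lik[current])


def listener_name_frequencies(prior, speaker_lik, listener_lik, n_rounds, seed=0):
    """Simulate one speaker and one listener with fixed roles.

    Returns the fraction of rounds in which the listener held each name.
    """
    rng = random.Random(seed)
    names = range(len(prior))
    # Speaker's posterior P(w | x_Sp), used as the proposal distribution.
    speaker_post = normalize([p * l for p, l in zip(prior, speaker_lik)])
    current = 0  # listener's initial name (arbitrary)
    counts = [0] * len(prior)
    for _ in range(n_rounds):
        proposed = rng.choices(names, weights=speaker_post)[0]
        if rng.random() < acceptance_probability(listener_lik, proposed, current):
            current = proposed  # listener accepts the speaker's name
        counts[current] += 1
    return [c / n_rounds for c in counts]


def joint_posterior(prior, speaker_lik, listener_lik):
    """Exact P(w | x_Sp, x_Li) for comparison with the simulated frequencies."""
    return normalize([p * a * b for p, a, b in zip(prior, speaker_lik, listener_lik)])
```

Comparing `listener_name_frequencies(...)` against `joint_posterior(...)` for the same tables gives a quick check that the listener's names approximate the posterior conditioned on both agents' observations. The sketch keeps the roles fixed for brevity, whereas in the naming game the agents take turns as speaker and listener.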
... The acceptance probability r in MH-receiving is equivalent to that in the MH algorithm for P(w_d | x_d^Sp, x_d^Li, θ^Sp, θ^Li) in the case that P(w | x^Sp, θ^Sp) is a proposal distribution. This result is a generalization of (Hagiwara et al., 2019, 2022) and a special case of (Taniguchi et al., 2023). For the details of the proof, please refer to the original papers. ...
... This is based on the Gaussian mixture model (GMM) and is a special case of the multi-agent Inter-PGM. Hagiwara et al. (2019, 2022) proposed the Inter-Dirichlet mixture (Inter-DM) which combines two Dirichlet mixtures (DMs), p(x_d^n | w_d) and p(o_d^n | x_d^n), represented as categorical distributions in Figure 2A. Taniguchi et al. (2023) proposed Inter-GMM + VAE which combines two GMM + VAEs, i.e., p(x_d^n | w_d) and p(o_d^n | x_d^n) represented as a categorical distribution as a part of GMM and a VAE respectively. ...
Article
In the studies on symbol emergence and emergent communication in a population of agents, a computational model was employed in which agents participate in various language games. Among these, the Metropolis-Hastings naming game (MHNG) possesses a notable mathematical property: symbol emergence through MHNG is proven to be a decentralized Bayesian inference of representations shared by the agents. However, the previously proposed MHNG is limited to a two-agent scenario. This paper extends MHNG to an N-agent scenario. The main contributions of this paper are twofold: (1) we propose the recursive Metropolis-Hastings naming game (RMHNG) as an N-agent version of MHNG and demonstrate that RMHNG is an approximate Bayesian inference method for the posterior distribution over a latent variable shared by agents, similar to MHNG; and (2) we empirically evaluate the performance of RMHNG on synthetic and real image data, i.e., YCB object dataset, enabling multiple agents to develop and share a symbol system. Furthermore, we introduce two types of approximations (one-sample and limited-length) to reduce computational complexity while maintaining the ability to explain communication in a population of agents. The experimental findings showcased the efficacy of RMHNG as a decentralized Bayesian inference for approximating the posterior distribution concerning latent variables, which are jointly shared among agents, akin to MHNG, although the improvement in ARI and κ coefficient is smaller in the real image dataset condition. Moreover, the utilization of RMHNG elucidated the agents' capacity to exchange symbols. Furthermore, the study discovered that even the computationally simplified version of RMHNG could enable symbols to emerge among the agents.
... Developing didactic-mathematical knowledge and competence can be done by identifying mathematical elements that are relevant to the learning strategy or media used by the teacher (Bianchini et al., 2019;Mariotti, 2013). The elements referred to are related to mathematical terms, symbols, and symbols (Fatimah et al., 2020;Hagiwara et al., 2019). ...
Article
Mathematics learning for autistic students needs to be done concretely and interestingly. If this is implemented, it is hoped that autistic students will be able to understand and even use mathematics in everyday life. This study aimed to analyze semiotic objects in the bead maze media for learning mathematics for autistic students in elementary schools. The research was conducted in a descriptive qualitative manner. Methods of data collection through observation, documentation, and interviews with elementary school mathematics teachers and assistants for autistic students. Observations were made by observing the bead maze adaptive media. Objects that can be observed are the beads in it, such as different shapes and colors. Interviews were conducted with one third-grade autistic student, an elementary school math teacher, and an assistant teacher. Data obtained from observation, documentation, and interviews were then analyzed using triangulation. The triangulation methodology is carried out by comparing the information obtained from observation, documentation, and interviews. The study results found that six primary semiotic objects, namely language, problem situations, concepts, procedures, properties, and arguments, have been identified based on basic mathematical concepts (numbers, algebra, geometry, measurement) being studied in the third grade. Each semiotic object in bead maze media can potentially increase students' mathematical activities, which are contextual, interesting, and meaningful for autistic students in elementary schools.
... The acceptance probability r in MH-receiving is equivalent to that in the MH algorithm for P(w_d | x_d^Sp, x_d^Li, θ^Sp, θ^Li) in the case that P(w | x^Sp, θ^Sp) is a proposal distribution. This result is a generalization of (Hagiwara et al., 2019, 2022) and a special case of (Taniguchi et al., 2023). For the details of the proof, please refer to the original papers. ...
... This is based on the Gaussian mixture model (GMM) and is a special case of the multi-agent Inter-PGM. Hagiwara et al. (2019, 2022) proposed the Inter-Dirichlet mixture (Inter-DM) which combines two Dirichlet mixtures (DMs), p(x_d^n | w_d) and p(o_d^n | x_d^n), represented as categorical distributions in Figure 2 (A). Taniguchi et al. (2023) proposed Inter-GMM+VAE which combines two GMM+VAEs, i.e., p(x_d^n | w_d) and p(o_d^n | x_d^n) represented as a categorical distribution as a part of GMM and a VAE respectively. ...
Preprint
In the studies on symbol emergence and emergent communication in a population of agents, a computational model was employed in which agents participate in various language games. Among these, the Metropolis-Hastings naming game (MHNG) possesses a notable mathematical property: symbol emergence through MHNG is proven to be a decentralized Bayesian inference of representations shared by the agents. However, the previously proposed MHNG is limited to a two-agent scenario. This paper extends MHNG to an N-agent scenario. The main contributions of this paper are twofold: (1) we propose the recursive Metropolis-Hastings naming game (RMHNG) as an N-agent version of MHNG and demonstrate that RMHNG is an approximate Bayesian inference method for the posterior distribution over a latent variable shared by agents, similar to MHNG; and (2) we empirically evaluate the performance of RMHNG on synthetic and real image data, enabling multiple agents to develop and share a symbol system. Furthermore, we introduce two types of approximations (one-sample and limited-length) to reduce computational complexity while maintaining the ability to explain communication in a population of agents. The experimental findings showcased the efficacy of RMHNG as a decentralized Bayesian inference for approximating the posterior distribution concerning latent variables, which are jointly shared among agents, akin to MHNG. Moreover, the utilization of RMHNG elucidated the agents' capacity to exchange symbols. Furthermore, the study discovered that even the computationally simplified version of RMHNG could enable symbols to emerge among the agents.
... However, in the developmental process of human infants, joint attention, which is acquired at around nine months of age, is well known to precede tremendous progress in lexical acquisition and language development. Another notable idea is the naming game based on joint attention and the associated theoretical basis, called MHNG, in which each agent independently forms categories and shares signs associated with those categories through communication in the joint attention naming game (JA-NG) [27]. This theory suggests that symbol emergence can be viewed as the approximate decentralized Bayesian inference of a posterior distribution over a shared latent variable conditioned on the observations of all agents participating in the communication. ...
... In this study, our objective is to investigate whether the MHNG, which models symbol emergence as a decentralized Bayesian inference [14,27], can serve as a valid explanatory principle of symbol emergence between human individuals. The MHNG involves computational agents playing a JA-NG, where agents independently form categories of objects and name them while assuming joint attention. ...
... To achieve this, we conducted a communication experiment with human participants. The communication structure in the experiment resembled that of the JA-NG in the simulation experiment conducted by Hagiwara et al. [27]. We observed the acceptance or rejection assessments of participants and tested whether they utilized the acceptance probability calculated by MHNG theory to a certain extent. ...
Preprint
In this study, we explore the emergence of symbols during interactions between individuals through an experimental semiotic study. Previous studies investigate how humans organize symbol systems through communication using artificially designed subjective experiments. In this study, we have focused on a joint attention-naming game (JA-NG) in which participants independently categorize objects and assign names while assuming their joint attention. In the theory of the Metropolis-Hastings naming game (MHNG), listeners accept provided names according to the acceptance probability computed using the Metropolis-Hastings (MH) algorithm. The theory of MHNG suggests that symbols emerge as an approximate decentralized Bayesian inference of signs, which is represented as a shared prior variable if the conditions of MHNG are satisfied. This study examines whether human participants exhibit behavior consistent with MHNG theory when playing JA-NG. By comparing human acceptance decisions of a partner's naming with acceptance probabilities computed in the MHNG, we tested whether human behavior is consistent with the MHNG theory. The main contributions of this study are twofold. First, we reject the null hypothesis that humans make acceptance judgments with a constant probability, regardless of the acceptance probability calculated by the MH algorithm. This result suggests that people followed the acceptance probability computed by the MH algorithm to some extent. Second, the MH-based model predicted human acceptance/rejection behavior more accurately than the other four models: Constant, Numerator, Subtraction, and Binary. This result indicates that symbol emergence in JA-NG can be explained using MHNG and is considered an approximate decentralized Bayesian inference.
... Hagiwara et al. proposed a naming game based on a probabilistic generative model (PGM) and the Metropolis-Hastings method, which is a type of Markov chain Monte Carlo algorithm [17]. We call this naming game the Metropolis-Hastings method-based (MH-based) naming game. ...
... The Inter-MDM is proposed to model the symbol emergence between two agents [18]. It is based on a model proposed by Hagiwara et al. [17], in which each agent forms categories from the information of a single modality. Figure 1 shows the probabilistic graphical model of the T2T-type Inter-MDM. ...
Preprint
In this study, we propose a head-to-head type (H2H-type) inter-personal multimodal Dirichlet mixture (Inter-MDM) by modifying the original Inter-MDM, which is a probabilistic generative model that represents the symbol emergence between two agents as multiagent multimodal categorization. A Metropolis-Hastings method-based naming game based on the Inter-MDM enables two agents to collaboratively perform multimodal categorization and share signs with a solid mathematical foundation of convergence. However, the conventional Inter-MDM presumes a tail-to-tail connection across a latent word variable, which makes it inflexible to extend Inter-MDM further to model more complex symbol emergence. Therefore, we propose herein an H2H-type Inter-MDM that treats a latent word variable as a child node of an internal variable of each agent in the same way as many prior studies of multimodal categorization. On the basis of the H2H-type Inter-MDM, we propose a naming game in the same way as the conventional Inter-MDM. The experimental results show that the H2H-type Inter-MDM yields almost the same performance as the conventional Inter-MDM from the viewpoint of multimodal categorization and sign sharing.
... This study aims to provide a new model for emergent communication, which is based on a probabilistic generative model. We define the Metropolis-Hastings (MH) naming game by generalizing a model proposed by Hagiwara et al. [18]. The MH naming game is a sort of MH algorithm for an integrative probabilistic generative model that combines two agents playing the naming game. ...
... We first define the Metropolis-Hastings (MH) naming game. This type of game was first introduced by Hagiwara et al. [18] for a specific probabilistic model. In this paper, the MH naming game is generalized and formally defined. ...
... The limitation of the models proposed by Hagiwara et al. [18,17] is that they do not involve deep generative models and could not enable agents to conduct symbol emergence on raw images. In this research, we present Inter-GMM+VAE, a deep probabilistic generative model, and an inference procedure based on an MH naming game and a decomposition-and-communication strategy to model emergent communication based on deep probabilistic models [53]. ...
Preprint
Emergent communication, also known as symbol emergence, seeks to investigate computational models that can better explain human language evolution and the creation of symbol systems. This study aims to provide a new model for emergent communication, which is based on a probabilistic generative model. We define the Metropolis-Hastings (MH) naming game by generalizing a model proposed by Hagiwara et al. [18]. The MH naming game is a sort of MH algorithm for an integrative probabilistic generative model that combines two agents playing the naming game. From this viewpoint, symbol emergence is regarded as decentralized Bayesian inference, and semiotic communication is regarded as inter-personal cross-modal inference. We also offer Inter-GMM+VAE, a deep generative model for simulating emergent communication, in which two agents create internal representations and categories and share signs (i.e., names of objects) from raw visual images observed from different viewpoints. The model has been validated on MNIST and Fruits 360 datasets. Experimental findings show that categories are formed from real images observed by agents, and signs are correctly shared across agents by successfully utilizing both of the agents' views via the MH naming game. Furthermore, it has been verified that the visual images were recalled from the signs uttered by the agents. Notably, emergent communication without supervision and reward feedback improved the performance of unsupervised representation learning.
... Among several aspects of language ability, phonological segmentation is the most prominent example of showing such an experiential process based on innate bias. This aspect of language is treated in research with regard to symbol emergence in biological [6] and artificial systems [7]. In human language acquisition, infants initially have inborn substrates that make it possible for them to acquire various languages that separate sounds into different types of segments (see [8] for a recent experimental study). ...
Article
Language acquisition is supported by phonological awareness, which intentionally makes children aware of phonological units. By understanding the internal processes of children during language acquisition, this study aims to elucidate factors that can correct erroneous phonological generation. Therefore, we developed a cognitive model using innate and experiential factors of memory retrieval in the cognitive architecture ACT-R. Furthermore, we performed simulations using Shiritori, a Japanese word game, as an interaction task. The simulation included the observation of effects of the experiential factor of repeating a task and innate factors of different settings. It showed that repeating a single task causes incorrect convergence, and this convergence can be prevented by comprehensive activation of overall phonological knowledge during the interval of Shiritori tasks. Moreover, the simulation in specific innate settings exhibited commonalities with cases of developmental disorder by showing errors like consonant deletion. In the future, we will examine the correlation of the aforementioned findings with actual language development to realize the use of cognitive architecture in the real world.
... Hagiwara et al. proposed a computational model of a symbol emergence system comprising two agents that perform categorization based on a visual modality, i.e. a single modality [5]. We call the model proposed in [5] interpersonal Dirichlet mixture (Inter-DM) in this study because the model is obtained by combining two Dirichlet mixtures (DMs). Inter-DM is an advanced version of the Talking Heads experiment, in which various computational models of language emergence using perceptual categories based on sensory experiences were proposed by Steels et al. [6]. ...
... Regarding symbol emergence systems based on multimodal sensory information, the following three questions arise that have not been verified in previous works [5][6][7]: ...
Article
This paper describes a computational model of multiagent multimodal categorization that realizes emergent communication. We clarify whether the computational model can reproduce the following functions in a symbol emergence system, comprising two agents with different sensory modalities playing a naming game. (1) Function for forming a shared lexical system that comprises perceptual categories and corresponding signs, formed by agents through individual learning and semiotic communication. (2) Function to improve the categorization accuracy in an agent via semiotic communication with another agent, even when some sensory modalities of each agent are missing. (3) Function that an agent infers unobserved sensory information based on a sign sampled from another agent in the same manner as cross-modal inference. We propose an interpersonal multimodal Dirichlet mixture (Inter-MDM), which is derived by dividing an integrative probabilistic generative model, which is obtained by integrating two Dirichlet mixtures (DMs). The Markov chain Monte Carlo algorithm realizes emergent communication. The experimental results demonstrated that Inter-MDM enables agents to form multimodal categories and appropriately share signs between agents. It is shown that emergent communication improves categorization accuracy, even when some sensory modalities are missing. Inter-MDM enables an agent to predict unobserved information based on a shared sign.