Simple three-stream template representation of phones /p/ and /b/ over a three-symbol alphabet

Source publication

Inductive String Template-Based Learning of Spoken Language.

Conference Paper

Jan 2005

This paper deals with the formulation of an alternative structural approach to the speech recognition problem. In this approach, we require both the representation and the learning algorithms defined on it to be linguistically meaningful, which allows the speech recognition system to discover the nature of the linguistic classes of speech patterns cor...

Context 1

... a single token, each stream is a string of symbols from one of the corresponding alphabets. Figure 1 shows a simple representation for the two-class problem consisting of /p/ and /b/ consonants, for each of which two realizations are available. Each template consists of three independent distinctive feature streams (over a three-symbol alphabet) from the SPE features system defined in [9]. ...

Context 2

... an example representation in Fig. 1, and defining the weighted Levenshtein distance to act on the templates, we obtain a simple metric space where the set P consists of four templates and the metric is defined as a linear combination of three independent per-stream weighted Levenshtein edit distances over three different ...
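
The metric described in this excerpt can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes uniform insertion, deletion and substitution costs within each stream (the paper's learned per-operation weights are not given in the excerpt), and the names `weighted_levenshtein`, `template_distance` and `stream_weights` are hypothetical.

```python
from typing import Sequence, Tuple

# A template is a tuple of strings, one string per feature stream (assumption).
Template = Tuple[str, ...]


def weighted_levenshtein(a: str, b: str, ins_cost: float = 1.0,
                         del_cost: float = 1.0, sub_cost: float = 1.0) -> float:
    """Edit distance from a to b with one cost per operation type."""
    # prev[j] is the cost of turning the empty prefix of a into b[:j].
    prev = [j * ins_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * del_cost]  # delete the first i symbols of a
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + del_cost,                             # delete ca
                curr[j - 1] + ins_cost,                         # insert cb
                prev[j - 1] + (0.0 if ca == cb else sub_cost),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def template_distance(x: Template, y: Template,
                      stream_weights: Sequence[float]) -> float:
    """Linear combination of independent per-stream edit distances."""
    if not (len(x) == len(y) == len(stream_weights)):
        raise ValueError("templates and weights must cover the same streams")
    return sum(w * weighted_levenshtein(sx, sy)
               for w, sx, sy in zip(stream_weights, x, y))
```

With equal insertion and deletion costs the per-stream distance is symmetric, so the combined distance is symmetric as well. The stream strings passed in are whatever symbols the chosen three-symbol alphabet uses; the excerpt does not specify them.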

Context 3

... the set of weights $\{\Delta\hat{\omega}\}$ from $\hat{\Omega}$. The set of transformations $\hat{O}$ is necessary since the concept of a distance can properly be defined only in terms of these operations. Figure 2 shows the non-trivial stream-specific transformations discovered during the learning process for the two-class phone problem of Fig. 1. These operations (the corresponding optimal sets of weights $\hat{\Omega}_{/p/}$ and $\hat{\Omega}_{/b/}$ are not shown) together with the trivial one-symbol transformations form the optimal set of transformations for each class. Together with the corresponding sets of reference objects $\hat{C}^{+}_{/p/}$ and $\hat{C}^{+}_{/b/}$ ...

Doit-on supposer un niveau de représentations pré-lexical de nature phonémique ? [Should we assume a prelexical level of representation that is phonemic in nature?]

Article

This study examines the nature of the phonological representations mediating spoken word recognition by means of phonetic priming in which primes and targets are phonetically similar but share no phonemes (GUESS - CAGE). We found an inhibitory priming effect of similar size for word and non-word primes in both a shadowing and a same-different jud...

Lexical Ambiguity Resolution and Spoken Word Recognition: Bridging the Gap

Article

Apr 2001

Phonological variation in speech production can neutralize phonemic distinctions. In some cases, the alternations also create lexical ambiguity, as in the sentence “A quick rum picks you up,” where the underlined sequence could be interpreted as either rum or as a place assimilated form of run. Three cross-modal priming experiments examined the per...

Investigating the Lexical Representation of Mandarin Tone 3 Phonological Alternations

Article

Aug 2021

Phonological alternations pose challenges for models of spoken word recognition in how surface information is mapped onto stored representations in the lexicon. In the current study, an auditory-auditory priming lexical decision experiment was conducted to investigate the alternating representations of Mandarin Tone 3 in both half-third and third t...

Phrasal or Lexical Constructions? Some Comments on Underspecification of Constituent Order, Compositionality, and Control

Conference Paper

Jan 2007

Stefan Müller

This paper is a follow-up to Müller (2006). It contains some comments on suggestions about the interaction of phrasal Constructions with constituent order that Adele Goldberg made on various occasions. In addition, the paper discusses various HPSG analyses of particle verbs that assume lexical representations including phonologically specified parts...

Figure 4 Distribution of responses along a 10-point scale, from...

Figure 5 Posterior distributions of model estimates and associated 50%...

Gradience in prosodic representation: vowel reduction and neoclassical elements in Brazilian Portuguese

Article

Jun 2021

In Brazilian Portuguese, neoclassical elements (NCEs) may combine with both independent lexical words (e.g., 'psico' in 'psicolinguística' ‘psycholinguistics’) and non-lexical words (e.g., 'psico' in 'psicologia' ‘psychology’). This has led to the proposal that they have distinct prosodic representations depending on the type of structure that they...

Some Notes on Computational and Theoretical Issues in Artificial Intelligence and Machine Learning

Chapter

Jun 2016

In the attempt to implement applications of public utility which simplify user access to future, remote and nearby social services, new mathematical models and new psychological and computational approaches from existing cognitive frameworks and algorithmic solutions have been developed. The nature of these instruments is either deterministic, probabilistic, or both. Their use depends upon their contribution to the conception of new ICT functionalities and evaluation methods for modelling concepts of learning, reasoning, and data interpretation. This introductory chapter provides a brief overview of the theoretical and computational issues of such artificial intelligence methods and how they are applied to several research problems.

Towards Formal Structural Representation of Spoken Language: An Evolving Transformation System (ETS) Approach

Thesis

Jan 2006

Alexander Gutkin

Speech recognition has been a very active area of research over the past twenty years. Despite evident progress, it is generally agreed by the practitioners of the field that the performance of current speech recognition systems is rather suboptimal and new approaches are needed. The motivation behind the undertaken research is the observation that the notion of representation of objects and concepts, which was once considered central in the early days of pattern recognition, has been largely marginalised by the advent of statistical approaches. As a consequence of a predominantly statistical approach to the speech recognition problem, and due to the numeric, feature vector-based nature of the representation, the classes inductively discovered from real data using decision-theoretic techniques have little meaning outside the statistical framework. This is because decision surfaces or probability distributions are difficult to analyse linguistically. Because of the latter limitation, it is doubtful that the gap between speech recognition and linguistic research can be bridged by numeric representations. This thesis investigates an alternative, structural approach to spoken language representation and categorisation. The approach pursued in this thesis is based on a consistent program, known as the Evolving Transformation System (ETS), motivated by the development and clarification of the concept of structural representation in pattern recognition and artificial intelligence from both theoretical and applied points of view. This thesis consists of two parts. In the first part, a similarity-based approach to structural representation of speech is presented. First, a linguistically well-motivated structural representation of phones based on distinctive phonological features recovered from speech is proposed. The representation consists of string templates representing phones together with a similarity measure. The set of phonological templates together with a similarity measure defines a symbolic metric space. Representation and ETS-inspired categorisation in the symbolic metric spaces corresponding to the phonological structural representation are then investigated by constructing appropriate symbolic space classifiers and evaluating them on a standard corpus of read speech. In addition, a similarity-based isometric transition from phonological symbolic metric spaces to the corresponding non-Euclidean vector spaces is investigated. The second part of this thesis deals with the formal approach to structural representation of spoken language. Unlike the approach adopted in the first part, the representation developed in the second part is based on the mathematical language of the ETS formalism. This formalism has been specifically developed for structural modelling of dynamic processes. In particular, it allows the representation of both objects and classes in a uniform event-based hierarchical framework. In this thesis, the latter property of the formalism allows the adoption of a more physiologically concrete approach to structural representation. The proposed representation is based on gestural structures and encapsulates speech processes at the articulatory level. Algorithms for deriving the articulatory structures from the data are presented and evaluated.
