
Group Theory and Computational Linguistics

Abstract and Figures

There is currently much interest in bringing together the tradition of categorial grammar, and especially the Lambek calculus, with the recent paradigm of linear logic to which it has strong ties. One active research area is designing non-commutative versions of linear logic (Abrusci, 1995; Retoré, 1993) which can be sensitive to word order while retaining the hypothetical reasoning capabilities of standard (commutative) linear logic (Dalrymple et al., 1995). Some connections between the Lambek calculus and computations in groups have long been known (van Benthem, 1986) but no serious attempt has been made to base a theory of linguistic processing solely on group structure. This paper presents such a model, and demonstrates the connection between linguistic processing and the classical algebraic notions of non-commutative free group, conjugacy, and group presentations. A grammar in this model, or G-grammar, is a collection of lexical expressions which are products of logical forms, phonological forms, and inverses of those. Phrasal descriptions are obtained by forming products of lexical expressions and by cancelling contiguous elements which are inverses of each other. A G-grammar provides a symmetrical specification of the relation between a logical form and a phonological string that is neutral between parsing and generation modes. We show how the G-grammar can be oriented for each of the modes by reformulating the lexical expressions as rewriting rules adapted to parsing or generation, which then have strong decidability properties (inherent reversibility). We give examples showing the value of conjugacy for handling long-distance movement and quantifier scoping both in parsing and generation.
The paper argues that by moving from the free monoid over a vocabulary V (standard in formal language theory) to the free group over V, deep affinities between linguistic phenomena and classical algebra come to the surface, and that the consequences of tapping the mathematical connections thus established can be considerable.
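The cancellation step described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the three ground lexical expressions are a hypothetical toy lexicon (not the paper's actual entries, and without variables or unification), and the helper names `reduce_word`, `inverse`, `product`, `conjugate` and `show` are invented here. A free-group element is represented as a list of (symbol, exponent) pairs with exponent +1 or -1.

```python
from typing import List, Tuple

Letter = Tuple[str, int]   # (symbol, +1 or -1)
Word = List[Letter]

def reduce_word(word: Word) -> Word:
    """Cancel contiguous pairs x x^-1 / x^-1 x (free reduction), using a stack."""
    out: Word = []
    for sym, exp in word:
        if out and out[-1] == (sym, -exp):
            out.pop()                 # the new letter cancels the previous one
        else:
            out.append((sym, exp))
    return out

def inverse(word: Word) -> Word:
    """Inverse of a word: reverse the order and flip every exponent."""
    return [(s, -e) for s, e in reversed(word)]

def product(*words: Word) -> Word:
    """Concatenate words, then freely reduce the result."""
    return reduce_word([letter for w in words for letter in w])

def conjugate(word: Word, by: Word) -> Word:
    """Return by * word * by^-1, freely reduced."""
    return product(by, word, inverse(by))

def show(word: Word) -> str:
    return " ".join(s if e == 1 else s + "^-1" for s, e in word) or "1"

# Hypothetical ground lexical expressions (an assumption, not the paper's lexicon).
r_john = [("j", 1), ("john", -1)]
r_louise = [("l", 1), ("louise", -1)]
r_saw = [("j", -1), ("s(j,l)", 1), ("l", -1), ("saw", -1)]

# Combine conjugates of the lexical expressions and let inverses cancel.
step1 = conjugate(r_saw, [("j", 1)])          # s(j,l) l^-1 saw^-1 j^-1
step2 = product(step1, r_john)                # s(j,l) l^-1 saw^-1 john^-1
tail = [("saw", -1), ("john", -1)]
result = product(step2, conjugate(r_louise, inverse(tail)))

print(show(result))
```

In this toy setting the final word pairs the logical form `s(j,l)` with the inverse of the string "john saw louise", and conjugation is what lets the `louise` expression be inserted in the middle of an already assembled product.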
[Figures: diagrams over the words john, saw, louise, in, paris with logical forms s(j,l) and i(s(j,l),p), and over the words every, man, saw, some, woman with logical forms s(x,y), sm(w,y,s(x,y)) and ev(m,x,sm(w,y,s(x,y))).]
... This paper, following [4], presents an approach to grammar description and processing based on the geometry of cancellation diagrams, a concept which plays a central role in combinatorial group theory [6]. The focus here is on the geometric intuitions and on relating group-theoretical diagrams to the traditional charts associated with context-free grammars and type-0 rewriting systems. ...
... A computation can thus be seen as a "witness", or as a "proof", of the fact that a given element of F(V) is a result of the computation structure. For specific computation tasks, one focusses on results of a certain sort, for instance results which express a relationship of input-output, where input and output are assumed to belong to certain object types. For example, in computational linguistics, one is often interested in results which express a relationship between a fixed semantic input and a possible textual output (generation mode), or conversely in results which express a relationship between a fixed textual input and a possible semantic output (parsing mode). ...
... A result of GCS which belongs to the public interface will be called a public result of GCSA. 4 Relating the geometric and the algebraic views: ...
Article
This paper, following (Dymetman:1998), presents an approach to grammar description and processing based on the geometry of cancellation diagrams, a concept which plays a central role in combinatorial group theory (Lyndon-Schuppe:1977). The focus here is on the geometric intuitions and on relating group-theoretical diagrams to the traditional charts associated with context-free grammars and type-0 rewriting systems. The paper is structured as follows. We begin in Section 1 by analyzing charts in terms of constructs called cells, which are a geometrical counterpart to rules. Then we move in Section 2 to a presentation of cancellation diagrams and show how they can be used computationally. In Section 3 we give a formal algebraic presentation of the concept of group computation structure, which is based on the standard notions of free group and conjugacy. We then relate in Section 4 the geometric and the algebraic views of computation by using the fundamental theorem of combinatorial group theory (Rotman:1994). In Section 5 we study in more detail the relationship between the two views on the basis of a simple grammar stated as a group computation structure. In Section 6 we extend this grammar to handle non-local constructs such as relative pronouns and quantifiers. We conclude in Section 7 with some brief notes on the differences between normal submonoids and normal subgroups, group computation versus rewriting systems, and the use of group morphisms to study the computational complexity of parsing and generation.
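One way to read the algebraic view in the abstract above is that a computation is a checkable witness: a list of relators, each with a conjugating word, whose product freely reduces to the claimed result. The sketch below rests on that assumption and is not the paper's formal definition of a group computation structure. The relators over `a`, `b`, `c` and the names `reduce_word`, `inverse` and `check_witness` are hypothetical.

```python
from typing import List, Tuple

Letter = Tuple[str, int]   # (symbol, +1 or -1)
Word = List[Letter]

def reduce_word(word: Word) -> Word:
    """Free reduction: cancel adjacent letters that are inverses of each other."""
    out: Word = []
    for sym, exp in word:
        if out and out[-1] == (sym, -exp):
            out.pop()
        else:
            out.append((sym, exp))
    return out

def inverse(word: Word) -> Word:
    """Reverse the word and flip every exponent."""
    return [(s, -e) for s, e in reversed(word)]

def check_witness(relators: List[Word],
                  witness: List[Tuple[int, Word]],
                  claimed: Word) -> bool:
    """Check that the product of conjugates u * r * u^-1 listed in `witness`
    (as pairs of relator index and conjugator u) reduces to `claimed`."""
    acc: Word = []
    for index, conjugator in witness:
        relator = relators[index]
        acc = reduce_word(acc + conjugator + relator + inverse(conjugator))
    return acc == reduce_word(claimed)

# Hypothetical relators: r0 = a b^-1 and r1 = b c^-1.
relators = [
    [("a", 1), ("b", -1)],
    [("b", 1), ("c", -1)],
]

# r0 * r1 reduces to a c^-1, so this witness is accepted.
ok = check_witness(relators, [(0, []), (1, [])], [("a", 1), ("c", -1)])

# The same relators in the other order give b c^-1 a b^-1, which is rejected.
bad = check_witness(relators, [(1, []), (0, [])], [("a", 1), ("c", -1)])

# Conjugating both relators by c^-1 yields the conjugate result c^-1 a.
c_inv = [("c", -1)]
conj = check_witness(relators, [(0, c_inv), (1, c_inv)], [("c", -1), ("a", 1)])

print(ok, bad, conj)
```

Checking a witness is a linear pass over its letters; finding one is the hard part, which is where the orientation toward parsing or generation comes in.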
... Group Theory. Group theory and grammar formalisms based on groups and pre-groups play an important role in computational linguistics (Lambek 1958; Dymetman 1998). From the d In our investigations, we will focus on VSM composition operations which preserve the format (i.e., which yield a vector of the same dimensionality), as our notion of compositionality requires models that allow for iterated composition. ...
Article
We give an in-depth account of compositional matrix-space models (CMSMs), a type of generic models for natural language, wherein compositionality is realized via matrix multiplication. We argue for the structural plausibility of this model and show that it is able to cover and combine various common compositional natural language processing approaches. Then, we consider efficient task-specific learning methods for training CMSMs and evaluate their performance in compositionality prediction and sentiment analysis.
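The core idea of composing meanings by matrix multiplication can be sketched in a few lines of plain Python. The 2x2 integer matrices assigned to the words below are made-up values chosen only to keep the arithmetic visible, and the names `matmul`, `compose` and `LEXICON` are hypothetical. Nothing here reflects the trained models or learning methods of the cited work.

```python
from typing import Dict, List

Matrix = List[List[int]]

def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x m) and b (m x p)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def identity(n: int) -> Matrix:
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Hypothetical word matrices (illustrative values only).
LEXICON: Dict[str, Matrix] = {
    "dog":   [[1, 1], [0, 1]],
    "bites": [[2, 0], [0, 1]],
    "man":   [[1, 0], [1, 1]],
}

def compose(words: List[str], lexicon: Dict[str, Matrix] = LEXICON) -> Matrix:
    """Meaning of a word sequence = left-to-right product of its word matrices."""
    result = identity(2)
    for word in words:
        result = matmul(result, lexicon[word])
    return result

dog_bites_man = compose(["dog", "bites", "man"])
man_bites_dog = compose(["man", "bites", "dog"])

# Word order matters because matrix multiplication is not commutative.
print(dog_bites_man, man_bites_dog, dog_bites_man != man_bites_dog)
```

Because the product of two square matrices is again a square matrix of the same size, composition can be iterated over phrases of any length, and associativity means the bracketing of the sequence does not change the result.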
... From a grammatical point of view, a human language is considered as a set of abstract elements where each one is described-limited-by other surrounding elements that constitute its "cotext". Marc Dymetman (1998) argues that by moving from the free monoid over a vocabulary V (standard in formal language theory) to the free group over V, deep affinities between linguistic phenomena and classical algebra come to the surface. Dominic Widdows and Stanley Peters (2003) have described some of the ways vectors have been used to represent the meanings of terms and documents in natural language processing. ...
Article
The present attempt deals with the generation of metric and Hilbert spaces for language aspects. Diaphatic, diatopic and diastratal linguistic variations have been exhibited in the form of regular, normal and compact spaces respectively. Some properties of metric spaces prevalent in their classical arena have been proved for the generated linguistic spaces.
... Group theory and grammar formalisms based on groups and pre-groups play an important role in computational linguistics (Dymetman, 1998; Lambek, 1958). From the perspective of our compositionality framework, those approaches employ a group (or pre-group) (G, ·) as semantical space S where the group operation (often written as multiplication) is used as composition operation . ...
Conference Paper
We propose CMSMs, a novel type of generic compositional models for syntactic and semantic aspects of natural language, based on matrix multiplication. We argue for the structural and cognitive plausibility of this model and show that it is able to cover and combine various common compositional NLP approaches ranging from statistical word space models to symbolic grammar formalisms.
Article
In today's digital era, ever-growing text data has become a valuable source of information. To understand the views and emotions contained in such text, computational-linguistic sentiment analysis methods are essential. This article explains the concept of computational-linguistic sentiment analysis, the techniques used, and its benefits in various applications. Keywords: Computational Linguistic Sentiment Analysis, Emotion in Text, Sentiment Analysis Techniques, NLP Applications, Machine Learning Algorithms 1. Introduction In today's digital era, the emergence of social media platforms and other text-based services has created an explosion of text data. This data includes opinions, reviews, and comments from users that reflect their emotions, attitudes, and views on various topics. Understanding this sentiment is important in many contexts, including business decision-making, public opinion analysis, and brand monitoring. Computational linguistics is an interdisciplinary field that combines linguistics and computer science to develop algorithms and models for natural language processing. The goal of computational-linguistic sentiment analysis is to identify, classify, and interpret the emotions contained in text. By using techniques such as natural language processing (NLP) and machine learning algorithms, sentiment analysis can provide valuable insight into the opinions and views found in various text sources.
Article
Naturally, humans search for truth in various ways. All of these are methodical, but they do not amount to a methodology. The right way produces truth by involving the principles of believing the truth, finding a way, and having a purpose, so it is not mere justification. These three basic principles of methodology are not enough, however. For methodology to become the science of method, it needs to be systematic, so that the truth flows logically, and it needs a structure to form the truth logically. Thus, abstraction is helped by rules or formulations that describe the structure, at least from the point of formation, such as definitions, lemmas, propositions, theorems, and corollaries.
Article
We provide a logical system to express Minimalist Grammars, which aims at being "minimal" in the sense that it contains the fewest rules and the simplest lexical entries possible. By limiting ourselves to the proofs in that system that satisfy one constraint on hypotheses management, we simulate minimalist derivations. This system is an elaboration on previous works by Lecomte and Retoré which were based on a new use of the Lambek calculus, where words were no longer associated with formulae, but with axioms and where a proof of a sentence was a proof of the theorem ⊢ c instead of a proof of a sequent ⊢ c. Such a calculus, which suffered from limitations, is here replaced by a version of partially non-commutative linear logic, due to P. de Groote, and we show that in this system, when we limit ourselves to special proofs, we may mimic move (besides, of course, merge) for A-movement as well as head-movement. Moreover we show that the use of second-order types, allowed by the categorial g...
Article
We want to decide whether a given monoid presentation defines a group or not. There are easy examples of both cases, and we derive generalisations of these. A large number of open problems suggest themselves, and we mention two of these explicitly.
Book
I/Constraints on Denotations.- 1 / Determiners.- 2 / Quantifiers.- 3 / All Categories.- 4 / Conditionals.- 5 / Tense and Modality.- 6 / Natural Logic.- II/Dynamics of Interpretation.- 7 / Categorial Grammar.- 8 / Semantic Automata.- III/Methodology of Semantics.- 9 / Logical Semantics as an Empirical Science.- 10/ The Logic of Semantics.- References.- Index of Names.- Index of Subjects.
Article
The linear logic introduced in [3] by J.-Y. Girard keeps one of the so-called structural rules of the sequent calculus: the exchange rule. In a one-sided sequent calculus this rule can be formulated as ... The exchange rule allows one to disregard the order of the assumptions and the order of the conclusions of a proof, and this means, when the proof corresponds to a logically correct program, to disregard the order in which the inputs and the outputs occur in a program. In the linear logic introduced in [3], the exchange rule allows one to prove the commutativity of the multiplicative connectives, conjunction (⊗) and disjunction (⅋). Due to the presence of the exchange rule in linear logic, in the phase semantics for linear logic one starts with a commutative monoid. So, the usual linear logic may be called commutative linear logic. The aim of the investigations underlying this paper was to see, first, what happens when we remove the exchange rule from the sequent calculus for linear propositional logic altogether, and then, how to recover the strength of the exchange rule by means of exponential connectives (in the same way as by means of the exponential connectives ! and ? we recover the strength of the weakening and contraction rules).