Article
PDF available

The structure of the Merriam-Webster pocket dictionary

Authors:
Robert A. Amsler

Abstract

Thesis (Ph.D.), University of Texas at Austin, 1980. Vita. Includes bibliographical references (leaves 260-269).
... Dictionary definitions represent a large source of our knowledge of the meaning of words (Amsler, 1980). Definitions are composed of a 'genus' phrase and a 'differentia' phrase (Amsler, 1980). The 'genus' phrase identifies the general category of the defined word. ...
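As an illustration of the genus/differentia structure described in this excerpt, here is a minimal Python sketch; the head-word heuristic is our own simplification for illustration, not Amsler's actual procedure.

```python
def split_genus_differentia(definition: str):
    """Naive split of '(a|an|the) GENUS DIFFERENTIA...' (illustrative only)."""
    tokens = definition.lower().split()
    # drop a leading determiner if present
    if tokens and tokens[0] in {"a", "an", "the"}:
        tokens = tokens[1:]
    if not tokens:
        return None, None
    # first remaining token approximates the genus; the rest is the differentia
    return tokens[0], " ".join(tokens[1:])

print(split_genus_differentia("a tool for pounding"))
# ('tool', 'for pounding')
```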
... The syllabification model was trained using Webster's Pocket Dictionary [17], which was automatically pre-processed according to the numbered ONC procedure. ...
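If, as the excerpt suggests, ONC refers to onset/nucleus/coda labelling, a crude orthographic approximation might look like the sketch below; this reading of "ONC" is our assumption, and the code is not the cited paper's numbered procedure.

```python
VOWELS = set("aeiouy")

def onc_labels(word: str) -> str:
    """Label each letter Onset, Nucleus, or Coda (crude approximation)."""
    labels = []
    word = word.lower()
    for i, ch in enumerate(word):
        if ch in VOWELS:
            labels.append("N")                       # vowels form nuclei
        elif any(c in VOWELS for c in word[i + 1:]):
            labels.append("O")                       # consonant before a later vowel: onset
        else:
            labels.append("C")                       # trailing consonants: coda
    return "".join(labels)

print(onc_labels("dictionary"))  # 'ONOONNONON'
```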
... The first step in building the ontology was the specification of an inventory of basic concepts that are assumed to be lexicalized in a wide range of languages. The ontology acquisition methodology is based on the extraction of type_of hierarchies from dictionary definitions (Martín-Mingorance 1984, 1990, 1995; Hirst 2009; Amsler 1980, 1981). Whereas the genus designates the superordinate concept of the defined word, the differentiating features are the properties that make the concept different from other members of the same conceptual category. ...
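Building on the same idea, a type_of hierarchy can be sketched by linking each headword to the genus head of its definition, treating the dictionary as an implicit taxonomy; the helper repeats the naive heuristic above and the toy entries are invented.

```python
from collections import defaultdict

def genus(definition: str) -> str:
    """Approximate the genus as the first token after a determiner."""
    tokens = definition.lower().split()
    if tokens and tokens[0] in {"a", "an", "the"}:
        tokens = tokens[1:]
    return tokens[0] if tokens else ""

def build_type_of(entries: dict) -> dict:
    hierarchy = defaultdict(set)   # genus term -> set of its hyponyms
    for word, definition in entries.items():
        hierarchy[genus(definition)].add(word)
    return hierarchy

toy = {
    "hammer": "a tool for pounding",
    "saw": "a tool for cutting",
    "tool": "an implement used to do work",
}
print(dict(build_type_of(toy)))
# e.g. {'tool': {'hammer', 'saw'}, 'implement': {'tool'}}
```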
Chapter
Full-text available
Role and Reference Grammar (RRG) is a theory of language in which linguistic structures are accounted for in terms of the interplay of discourse, semantics and syntax. With contributions from a team of leading scholars, this Handbook provides a field-defining overview of RRG. Assuming no prior knowledge, it introduces the framework step-by-step, and includes a pedagogical guide for instructors. It features in-depth discussions of syntax, morphology, and lexical semantics, including treatments of lexical and grammatical categories, the syntax of simple clauses and complex sentences, and how the linking of syntax with semantics and discourse works in each of these domains. It illustrates RRG's contribution to the study of language acquisition, language change and processing, computational linguistics, and neurolinguistics, and also contains five grammatical sketches which show how RRG analyses work in practice. Comprehensive yet accessible, it is essential reading for anyone who is interested in how grammar interfaces with meaning.
... Especially in the late 1970s and throughout the 1980s, the period in which machine-readable dictionaries were widely used in natural language processing, new ideas developed about lexical databases as well as the discipline of statistical lexicology (see Amsler, 1980; Michiels, 1982; Kay, 1984; Wilks et al., 1996). The appearance of CDs mirroring printed dictionaries with identical content dates back to the 1980s. ...
Article
Full-text available
The use of computers in practical lexicography in the 1960s introduced a fundamental change in traditional lexicography. The accumulation of electronic language data, especially in the 1990s, and the increasing use of low-cost high-speed broadband internet at the beginning of the 21st century led to rapid growth in e-lexicography studies. Along with these developments, many printed dictionaries became available as e-dictionaries within a short time. The advantages of electronically published dictionaries over printed dictionaries immediately caught the attention of many lexicographers, but the problems caused by hybridization went largely unnoticed. Likewise, issues such as corpus coherence, data reliability, access paths, quality, and utility have attracted little research attention. Nevertheless, modern developments in e-lexicography have led to an unbalanced growth in expectations of the discipline, as well as a radical change in its functional field. This study discusses the definition of the term computer lexicography in addition to the historical advances of e-lexicography, and also examines the phases of e-dictionary making. In addition, an attempt is made to determine whether today's e-dictionaries have problems in terms of hybridization, corpus, data reliability, access path, personalization, and quality. Electronic devices require speech technology software to analyse linguistic and lexicographical data, and language technology applications require lexical data to be machine-readable. Language needs to be standardized, because e-devices are less tolerant of mistakes and errors than humans are. Therefore, this study makes some suggestions for identifying and resolving problems with e-dictionaries.
... Moreover, these definitions have a recurrent structure, which can be exploited to derive a simpler model. Definitions for a word w are often organized as a sentence that contains the super-type of w and a modifier, which specializes the super-type (Amsler, 1980). For example (Fig. 1), cheerlessness is defined in WordNet as a feeling, which is the super-type, and of dreary and pessimistic sadness, which is the modifier. ...
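The cheerlessness example can be reproduced with NLTK's WordNet interface (assuming the nltk package and its wordnet corpus are installed); the head-word split is again a naive illustration rather than the cited paper's method.

```python
# requires: pip install nltk; then nltk.download('wordnet')
from nltk.corpus import wordnet as wn

synset = wn.synsets("cheerlessness")[0]
gloss = synset.definition()          # e.g. 'a feeling of dreary or pessimistic sadness'
tokens = gloss.split()
head = tokens[1] if tokens[0] in ("a", "an", "the") else tokens[0]

print("gloss head:", head)               # 'feeling' (the super-type)
print("hypernyms:", synset.hypernyms())  # WordNet's own super-type synsets
```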
... The first methods proposed in the 1970s and 1980s were aimed at creating semantic language models through the analysis of computer dictionaries, which were considered as sources of taxonomic "hyponym-hyperonym" relations. To extract relations from dictionaries, rule-based algorithms were used [3,4]. The advantage of these methods was that they extracted semantic relations from reliable sources, which can be considered already partially structured, since dictionaries work as "implicit taxonomies." ...
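A rule-based extractor in the spirit of those early methods might be sketched as follows; the patterns and example definitions are invented for illustration and are not taken from [3,4].

```python
import re

# ordered patterns mapping a definition to its hypernym ("hyperonym")
RULES = [
    re.compile(r"^(?:a|an|the)\s+(?:kind|type|sort)\s+of\s+(\w+)"),
    re.compile(r"^(?:a|an|the)\s+(\w+)"),   # fallback: bare genus head
]

def extract_hypernym(definition: str):
    for rule in RULES:
        m = rule.match(definition.lower())
        if m:
            return m.group(1)
    return None

print(extract_hypernym("a type of falcon"))     # 'falcon'
print(extract_hypernym("a tool for pounding"))  # 'tool'
```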
Article
Full-text available
This article considers the problem of finding text documents in a corpus that are similar in meaning. We investigate a problem that arises when developing applied intelligent information systems: the TF-IDF algorithm misses part of the solution set, i.e., one can lose document pairs that are similar according to human assessment but receive a low similarity score from the program. A modification of the algorithm is proposed in which the complete vocabulary is replaced with a vocabulary of specific terms. Adding thesauri when building a corpus vector model based on a ranking function has not been investigated previously; the use of thesauri has so far been studied only for improving topic models. The purpose of this work is to improve the quality of the solution by minimizing the loss of its significant part while not adding "false similar" pairs of documents. The improvement is achieved by using a vocabulary of specific terms, extracted from the text of the analyzed documents, when calculating the TF-IDF values for the corpus vector representation. The experiment was carried out on two corpora of structured normative and technical documents united by subject area: state standards related to information technology and to the field of railways. The glossary of specific terms was compiled by automatic analysis of the text of the documents under consideration, using rule-based NER methods. It was demonstrated that calculating TF-IDF over the terminology vocabulary gives more relevant results for the problem under study, which confirmed the hypothesis put forward. The proposed method is less dependent on shortcomings of the text layer (such as recognition errors) than calculating document proximity using the complete vocabulary of the corpus. We identified the factors that can affect the quality of the decision: the way the terminology vocabulary is compiled, the choice of the n-gram range for the vocabulary, the correctness of the wording of specific terms, and the validity of their inclusion in the glossary of the document. The findings can be used to solve applied problems related to the search for documents that are close in meaning, such as semantic search that takes the subject area into account, corporate search in multi-user mode, detection of hidden plagiarism, identification of contradictions in a collection of documents, and determination of novelty in documents when building a knowledge base.
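A minimal sketch of the core idea, assuming scikit-learn; the documents and the glossary of specific terms below are invented, and the real glossary would come from the rule-based NER step the abstract describes.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "This standard defines information technology security requirements.",
    "Security requirements for information technology are specified here.",
    "Railway track maintenance schedules are described in this document.",
]
glossary = ["information technology", "security requirements", "railway track"]

# restrict TF-IDF features to the terminology vocabulary (here, 1- and 2-grams)
vec = TfidfVectorizer(vocabulary=glossary, ngram_range=(1, 2), lowercase=True)
sim = cosine_similarity(vec.fit_transform(docs))
print(sim.round(2))   # docs 0 and 1 score high; doc 2 stays near zero
```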
... Rather than relying on such curated lexical resources, which are not readily available for the majority of languages, we propose a method capable of improving embeddings by leveraging the more common resource of monolingual dictionaries. Lexical databases such as WordNet (Fellbaum, 1998) are often built from dictionary definitions, as was proposed earlier by Amsler (1980). We propose to bypass the process of explicitly building a lexical database (during which information is structured but also lost) and instead directly use its detailed source: dictionary definitions. ...
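One simple, hypothetical way to use definitions directly (not necessarily the cited paper's method) is to derive a word's vector from the pretrained vectors of its definition words; the toy embeddings below stand in for real pretrained ones.

```python
import numpy as np

# toy pretrained embeddings; real ones would come from e.g. word2vec or fastText
pretrained = {
    "feeling": np.array([0.9, 0.1]),
    "sadness": np.array([0.8, 0.3]),
    "dreary":  np.array([0.7, 0.2]),
}

def definition_embedding(definition: str, vectors: dict) -> np.ndarray:
    """Average the vectors of definition words that have an embedding."""
    words = [w for w in definition.lower().split() if w in vectors]
    if not words:
        raise ValueError("no known words in definition")
    return np.mean([vectors[w] for w in words], axis=0)

print(definition_embedding("a feeling of dreary sadness", pretrained))
# [0.8 0.2], the mean of the vectors for 'feeling', 'dreary', 'sadness'
```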
Article
Full-text available
Polysemy is the type of lexical ambiguity where a word has multiple distinct but related interpretations. In the past decade, it has been the subject of a great many studies across multiple disciplines including linguistics, psychology, neuroscience, and computational linguistics, which have made it increasingly clear that the complexity of polysemy precludes simple, universal answers, especially concerning the representation and processing of polysemous words. But fuelled by the growing availability of large, crowdsourced datasets providing substantial empirical evidence; improved behavioral methodology; and the development of contextualised language models capable of encoding the fine-grained meaning of a word within a given context, the literature on polysemy has recently developed more complex theoretical analyses. In this survey we discuss these recent contributions to the investigation of polysemy against the backdrop of a long legacy of research across multiple decades and disciplines. Our aim is to bring together different perspectives to achieve a more complete picture of the heterogeneity and complexity of the phenomenon of polysemy. Specifically, we highlight evidence supporting a range of hybrid models of the mental processing of polysemes. These hybrid models combine elements from different previous theoretical approaches to explain patterns and idiosyncrasies in the processing of polysemous words that the best-known models so far have failed to account for. Our literature review finds that i) traditional analyses of polysemy can be limited in their generalisability by loose definitions and selective materials; ii) linguistic tests provide useful evidence on individual cases, but fail to capture the full range of factors involved in the processing of polysemous sense extensions; and iii) recent behavioural (psycho)linguistic studies, large-scale annotation efforts, and investigations leveraging contextualised language models provide accumulating evidence suggesting that polysemous sense similarity covers a wide spectrum between identity of sense and homonymy-like unrelatedness of meaning. We hope that the interdisciplinary account of polysemy provided in this survey inspires further fundamental research on the nature of polysemy and better equips applied research to deal with the complexity surrounding the phenomenon, e.g. by enabling the development of benchmarks and testing paradigms for large language models informed by a greater portion of the rich evidence on the phenomenon currently available.
Book
Full-text available
This monograph, written in Tamil, has eleven chapters. The first ten chapters deal with the principles of the generative lexicon based on James Pustejovsky's (1995) book "The Generative Lexicon". The generative lexicon (shortly GL) presents a new and exciting theory of lexical semantics that addresses the problem of the "multiplicity of word meaning", that is, how we are able to give an infinite number of senses to words with finite means. As the first formally elaborated theory of a generative approach to word meaning, it lays the foundation for an implemented computational treatment of word meaning that connects explicitly to a compositional semantics. In contrast to the static view of word meaning (where each word is characterized by a predetermined number of word senses), which imposes a tremendous bottleneck on the performance capability of any natural language processing system, Pustejovsky proposes that the lexicon become an active and central component in the linguistic description. The essence of his theory is that the lexicon functions generatively, first by providing a rich and expressive vocabulary for characterizing lexical information; then, by developing a framework for manipulating fine-grained distinctions in word descriptions; and finally, by formalizing a set of mechanisms for the specialized composition of aspects of such descriptions of words, so that, as the words occur in context, extended and novel senses are generated. The second part of the book is a polysemantic study of Tamil verbs based on the principles of the generative lexicon. Taking into account the principles of the generative lexicon propounded by Pustejovsky, the semantic extension of meanings has been studied using data from Crea's Modern Tamil Dictionary (Ramakrishnan 2008). Some 28 verbs have been taken from Crea and their meanings tabulated by extracting the participants of the actions expressed by the verbs, such as subject, object, indirect object, location, etc., from the examples given for the meanings of each verbal entry. The participants are realized as noun phrases (NPs). The verbs acai1 ‘move’, acai2 ‘cause to move’, anjcu ‘be afraid of’, aTangku ‘subside’, aTakku ‘control’, aTi ‘beat’, aTai1 ‘get’ and aTai2 ‘confine’, aNi ‘wear’, aNai1 ‘be extinguished’ and aNai2 ‘extinguish’, aNai3 ‘embrace’, amai1 ‘be established’, amai2 ‘establish’, aaTu ‘move’, aaTTu ‘shake’, izu ‘pull’, iir ‘attract’, uTai ‘be broken’, uTai ‘break’, uuRu ‘secrete’, uuRRu ‘pour’, eTu ‘take’, eeRu ‘climb’, eeRRu ‘load’, ooTu ‘run’, ooTTu ‘cause to run’, kaTTu ‘construct’, kalai ‘become disorderly’, kalai ‘change the order’, ceer1 ‘join’, ceer ‘assemble’, taTTu ‘pat’, ndaTa ‘walk’ and maRRu ‘change’ have been selected from Crea. From the meanings of each verb, and taking into account the participants of the action (subject, object, indirect object, location, etc.) realized as NPs, the semantic features of the participants are studied, and the polysemy and semantic extension of the verbs are plotted and explained from the point of view of Pustejovsky.
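For readers unfamiliar with GL, a schematic sketch of a lexical entry with its qualia structure (the formal, constitutive, telic, and agentive roles of Pustejovsky 1995) might look as follows; the field values are illustrative and not taken from the monograph.

```python
from dataclasses import dataclass

@dataclass
class QualiaStructure:
    formal: str        # what kind of thing it is (its type_of category)
    constitutive: str  # what it is made of / its parts
    telic: str         # its purpose or function
    agentive: str      # how it comes into being

@dataclass
class LexicalEntry:
    lemma: str
    qualia: QualiaStructure

novel = LexicalEntry(
    lemma="novel",
    qualia=QualiaStructure(
        formal="book",
        constitutive="narrative text",
        telic="read",        # licenses 'begin a novel' = 'begin reading it'
        agentive="write",    # licenses 'finish a novel' = 'finish writing it'
    ),
)
print(novel.qualia.telic)
```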