Figure 1: Parse tree examples

Source publication
Article
Full-text available
This paper describes a maxent-based preposition sense disambiguation system entered in the preposition sense disambiguation task of SemEval 2007. The system uses a wide variety of semantic and syntactic features to perform the disambiguation task and achieves a precision of 69.3% on the test data.
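As a rough illustration of the kind of maxent classifier the abstract describes, the sketch below trains a multinomial logistic regression model (the standard maximum entropy formulation) over dictionary-valued contextual features with scikit-learn. The feature names, sense labels, and training examples are invented for illustration only and do not reproduce the features or data of the SemEval 2007 system.

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each instance is a dict of contextual features for one occurrence of "in".
# (Hypothetical features; the real system used many more.)
train_features = [
    {"prev_word": "live", "next_pos": "NNP", "obj_head": "Melbourne"},
    {"prev_word": "arrive", "next_pos": "NN", "obj_head": "morning"},
    {"prev_word": "painted", "next_pos": "NN", "obj_head": "oils"},
]
train_senses = ["Location", "Time", "Manner"]  # toy sense labels

# Multinomial logistic regression is equivalent to maximum entropy classification.
model = make_pipeline(DictVectorizer(sparse=True),
                      LogisticRegression(max_iter=1000))
model.fit(train_features, train_senses)

test_instance = {"prev_word": "born", "next_pos": "NNP", "obj_head": "Paris"}
print(model.predict([test_instance])[0])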

Contexts in source publication

Context 1
... tree features: Given the position of the target preposition p in the parse tree, the basic form of the corresponding parse tree feature is just the list of nodes of p's siblings in the tree (the POS tags are treated as part of the terminal). For example, suppose the original parse tree for the sentence I live in Melbourne is the left tree in Figure 1; for the target preposition in, the basic form of the parse tree feature would be (1, NP). In order to gain more syntactic information, we further annotated each non-terminal of the parse tree with its parent node, and used the new non-terminals as our features. ...
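A minimal sketch of this sibling-based feature, assuming an nltk.Tree parse and a simple string match on the target preposition (the original system's exact encoding, including how POS tags are folded into the terminals, may differ):

from nltk.tree import Tree

def sibling_feature(tree, prep):
    """Labels of the siblings of the target preposition's pre-terminal node."""
    for pos in tree.treepositions():
        node = tree[pos]
        # A pre-terminal such as (IN in) whose leaf matches the target word.
        if isinstance(node, Tree) and node.height() == 2 and node[0] == prep:
            parent = tree[pos[:-1]]
            return [child.label() if isinstance(child, Tree) else child
                    for child in parent if child is not node]
    return []

parse = Tree.fromstring(
    "(S (NP (PRP I)) (VP (VBP live) (PP (IN in) (NP (NNP Melbourne)))))")
print(sibling_feature(parse, "in"))   # -> ['NP']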
Context 2
... order to gain more syntactic information, we further annotated each non-terminal of the parse tree with its parent node, and used the new non-terminals as our features. The right tree in Figure 1 shows the result of applying this annotation once to the original parse tree. Two levels of additional annotation were performed on the original parse trees in our feature extraction. ...
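The parent-annotation step can be sketched as a small recursive rewrite of the tree. The CHILD^PARENT label format used here is an assumption for illustration, not necessarily the notation used in the paper:

from nltk.tree import Tree

def annotate_parents(tree, parent_label=None):
    """Return a copy of the tree with every non-terminal label suffixed by
    its parent's label (one round of parent annotation)."""
    label = tree.label() if parent_label is None else tree.label() + "^" + parent_label
    children = [annotate_parents(child, tree.label()) if isinstance(child, Tree) else child
                for child in tree]
    return Tree(label, children)

parse = Tree.fromstring(
    "(S (NP (PRP I)) (VP (VBP live) (PP (IN in) (NP (NNP Melbourne)))))")
print(annotate_parents(parse))
# e.g. (S (NP^S (PRP^NP I)) (VP^S (VBP^VP live) (PP^VP (IN^PP in) (NP^PP (NNP^NP Melbourne)))))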

Similar publications

Article
Full-text available
This study is placed within the general framework of automatic verb sense disambiguation. To assign a meaning to a verb, we take into account the construction of the verb, i.e. the other lexical and syntactic units within the utterance (co-text). We now seek to finalize our method by taking into account the semantic features of this co-text. We...
Conference Paper
Full-text available
Search Result Clustering (SRC) groups the results of a user query in such a way that each cluster represents a set of related results. To be useful to the user, the different clusters should contain the results corresponding to different possible meanings of the user query, and the cluster labels should reflect these meanings. However, existing SRC a...
Article
Full-text available
We describe an unusual data set of thousands of annotated images with interesting sense phenomena. Natural language image sense annotation involves increased semantic complexities compared to disambiguating word senses when annotating text. These issues are discussed and illustrated, including the distinction between word senses and iconogr...

Citations

... After that, it established feature vectors and applied them to the generated maximum entropy model to check for preposition errors. Among the contextual features of prepositions, the contribution of contextual word features within the extraction window is greater than that of collocation features, named entities, etc. [18]. In [19], the authors proposed a method based on examples and the introduction of negative rules to perform grammar checking. ...
Article
Full-text available
Natural language processing technology encompasses theories and approaches for achieving effective human-computer communication. With the rapid growth of computer science and technology, statistical learning methods have become an important research area in artificial intelligence and semantic search. Errors in the semantic units (words and sentences) affect subsequent text analysis and semantic understanding, and ultimately the performance of the whole application system. As a result, intelligent word and grammatical error detection and correction in English text is a significant and difficult aspect of natural language processing. Therefore, this paper examines the phenomena of word spelling and grammatical errors in undergraduate English essays and balances the mathematical-statistical models and technology solutions involved in intelligent error correction. The research findings of this study are presented in two aspects. (1) For nonword mistakes, four sorts of errors are studied: insertion, loss, replacement, and exchange of letters. It focuses on nonword mistakes and varied word forms (such as English abbreviations, hyphenated compound terms, and proper nouns) produced by word pronunciation difficulties. For real-word errors, the paper uses the nonword-check information to recommend an optimal combination prediction method based on the suggested candidate list, and a real-word repair model is trained. This approach achieves 83.78% accuracy on real words misspelled in context. (2) It verifies and corrects sentence grammar using context information from the text training set, as well as grammatical rules and statistical models. In addition, it has investigated singular and plural inconsistency, word confusion, subject-predicate inconsistency, and modal (auxiliary) verb errors. It includes sentence boundary disambiguation, word part-of-speech tagging, named entity identification, and context information extraction. The software for checking and fixing sentence grammatical mistakes presented in this article works on English texts at difficulty levels 4 and 6. Furthermore, this work obtains a clause correctness rate of 99.70%, and the system's average corrective accuracy rate for four-level and six-level essays is more than 80%.
... Corpus-based computational work on semantic disambiguation specifically of prepositions and possessives falls into two categories: the lexicographic/word sense disambiguation approach (Litkowski and Hargraves, 2005, 2007; Litkowski, 2014; Ye and Baldwin, 2007; Saint-Dizier, 2006; Dahlmeier et al., 2009; Tratz and Hovy, 2009; Hovy et al., 2010, 2011; Tratz and Hovy, 2013), and the semantic class approach (Moldovan et al., 2004; Badulescu and Moldovan, 2009; O'Hara and Wiebe, 2009; Srikumar and Roth, 2011, 2013; Schneider et al., 2015, 2016; see also Müller et al., 2012 for German). The lexicographic approach can capture finer-grained meaning distinctions, at the risk of relying upon idiosyncratic and potentially incomplete dictionary definitions. ...
Preprint
Semantic relations are often signaled with prepositional or possessive marking--but extreme polysemy bedevils their analysis and automatic interpretation. We introduce a new annotation scheme, corpus, and task for the disambiguation of prepositions and possessives in English. Unlike previous approaches, our annotations are comprehensive with respect to types and tokens of these markers; use broadly applicable supersense classes rather than fine-grained dictionary definitions; unite prepositions and possessives under the same class inventory; and distinguish between a marker's lexical contribution and the role it marks in the context of a predicate or scene. Strong interannotator agreement rates, as well as encouraging disambiguation results with established supervised methods, speak to the viability of the scheme and task.
... The elements of the disambiguation system have been broken up into three categories: (1) collocation features [21], motivated by the one-sense-per-collocation heuristic and obtained by exploring the strong collocation properties of the different senses of the target preposition. Unlike the previous study, in [22] the researchers demonstrated the reusability of an entropy framework function. ...
... Notably, the polysemy of over and other prepositions has been explained in terms of sense networks encompassing core senses and motivated extensions (Brugman, 1981; Lakoff, 1987; Dewell, 1994; Evans, 2001, 2003). The Preposition Project (TPP; Litkowski and Hargraves, 2005) broke ground in stimulating computational work on fine-grained word sense disambiguation of English prepositions (Litkowski and Hargraves, 2005; Ye and Baldwin, 2007; Tratz and Hovy, 2009; Dahlmeier, Ng, and Schultz, 2009). Typologists, meanwhile, have developed semantic maps of functions, where the nearness of two functions reflects their tendency to fall under the same adposition or case marker in many languages (Haspelmath, 2003; Wälchli, 2010). ...
Article
Full-text available
We consider the semantics of prepositions, revisiting a broad-coverage annotation scheme used for annotating all 4,250 preposition tokens in a 55,000 word corpus of English. Attempts to apply the scheme to adpositions and case markers in other languages, as well as some problematic cases in English, have led us to reconsider the assumption that a preposition's lexical contribution is equivalent to the role/relation that it mediates. Our proposal is to embrace the potential for construal in adposition use, expressing such phenomena directly at the token level to manage complexity and avoid sense proliferation. We suggest a framework to represent both the scene role and the adposition's lexical function so they can be annotated at scale---supporting automatic, statistical processing of domain-general language---and sketch how this representation would inform a constructional analysis.
... Notably, the polysemy of over and other prepositions has been explained in terms of sense networks encompassing core senses and motivated extensions (Brugman, 1981; Lakoff, 1987; Dewell, 1994; Evans, 2001, 2003). The Preposition Project (TPP; Litkowski and Hargraves, 2005) broke ground in stimulating computational work on fine-grained word sense disambiguation of English prepositions (Litkowski and Hargraves, 2005; Ye and Baldwin, 2007; Tratz and Hovy, 2009; Dahlmeier et al., 2009). Typologists, meanwhile, have developed semantic maps of functions, where the nearness of two functions reflects their tendency to fall under the same adposition or case marker in many languages (Haspelmath, 2003; Wälchli, 2010). ...
... Another advantage of using these categories is their universality (they are common to several languages), which allows them to be used for tasks such as machine translation or multilingual information retrieval and extraction. SuperSense annotation has also been used as a first step for several tasks, such as sense disambiguation (Ye & Baldwin, 2007) and the reranking of parser hypotheses (Collins & Koo, 2005). ...
Article
Full-text available
This work focuses on the development of linguistic analysis tools for resource-poor languages. In a previous study, we proposed a method based on cross-language projection of linguistic annotations from parallel corpora to automatically induce a morpho-syntactic analyzer. Our approach was based on Recurrent Neural Networks (RNNs). In this paper, we present an improvement of our neural model. We investigate the inclusion of external information (POS tags) in the neural network to train a multilingual SuperSenses tagger. We demonstrate the validity and genericity of our method using parallel corpora (obtained by manual or automatic translation) and also study the impact of parallel corpus quality on our approach. Our experiments are conducted on cross-lingual annotation projection from English to French and Italian.
... Prepositional polysemy has also been recognized as a challenge for AI (Herskovits, 1986) and natural language processing, motivating semantic disambiguation systems (O'Hara and Wiebe, 2003; Ye and Baldwin, 2007; Hovy et al., 2010; Srikumar and Roth, 2013b). Training and evaluating these requires semantically annotated corpus data. ...
Article
Full-text available
We present the first corpus annotated with preposition supersenses, unlexicalized categories for semantic functions that can be marked by English prepositions (Schneider et al., 2015). That scheme improves upon its predecessors to better facilitate comprehensive manual annotation. Moreover, unlike the previous schemes, the preposition supersenses are organized hierarchically. Our data will be publicly released on the web upon publication.
... Similar to NER, supersense tagging approaches have generally used statistical sequence models and have been evaluated in English, Italian, Chinese, Arabic, and Danish. Features based on supersenses have been exploited in downstream semantics tasks such as preposition sense disambiguation, noun compound interpretation, question generation, and metaphor detection (Ye and Baldwin, 2007; Heilman, 2011; Hovy et al., 2013; Tsvetkov et al., 2013). ...
... Theoretical linguists have puzzled over questions such as how individual prepositions can acquire such a broad range of meanings and to what extent those meanings are systematically related (e.g., Brugman, 1981; Lakoff, 1987; Tyler and Evans, 2003; O'Dowd, 1998; Saint-Dizier and Ide, 2006; Lindstromberg, 2010). Prepositional polysemy has also been recognized as a challenge for AI (Herskovits, 1986) and natural language processing, motivating semantic disambiguation systems (O'Hara and Wiebe, 2003; Ye and Baldwin, 2007; Hovy et al., 2010; Srikumar and Roth, 2013b). Training and evaluating these requires semantically annotated corpus data. ...