Semantic ontology structure diagram

Source publication

Semantic classification method for network Tibetan corpus

Article

Full-text available

Mar 2017

The number of Tibetan web pages is growing rapidly, and it is worthwhile to apply information processing technology to extract useful knowledge from Tibetan web content. A Tibetan semantic ontology can enrich Tibetan digital resources and helps improve information processing performance. In this paper, semantic classification of Tibe...

Figure 4: Crack images after processing.

Building crack monitoring based on digital image processing

Article

Full-text available

Mar 2020

Building crack monitoring is of great value in assessing building safety. In this study, digital image processing technology was studied and applied to the monitoring of building cracks. Crack images were collected by a CCD camera, and then operations such as graying, correction, denoising and segmentation were carried out to obtain clear c...

Method of Feature Reduction in Short Text Classification Based on Feature Clustering

Article

Full-text available

Apr 2019

A central problem in short text classification is the severe curse of dimensionality (the "dimensional disaster") that arises when a statistics-based approach is used to construct vector spaces. Here, a feature reduction method based on two-stage feature clustering (TSFC) is proposed and applied to short text classification. Features are semi-loosely clustered by combining spectral clustering with a graph traversal algorithm. Next, intra-cluster feature screening rules are designed to remove outlier feature words, which improves the quality of the similar feature clusters. Short texts are then classified using the corresponding similar feature clusters instead of the original feature words, so the dimension of the vector space is significantly reduced. Several classifiers are used to evaluate the effectiveness of this method. The results show that the method largely resolves the dimensionality problem and can significantly improve the accuracy of short text classification.
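The replacement step described in this abstract, where similar feature clusters stand in for individual feature words, can be illustrated with a minimal Python sketch. This is not the TSFC method itself: the clusters below are hand-written and hypothetical, whereas the paper obtains them from spectral clustering, graph traversal and screening rules. The names `FEATURE_CLUSTERS`, `WORD_TO_CLUSTER` and `cluster_vector` are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical feature clusters, hand-written for illustration.
# In the paper these would come from the two-stage clustering step.
FEATURE_CLUSTERS = {
    "c_sport": {"football", "soccer", "goal", "match"},
    "c_finance": {"stock", "market", "shares", "bank"},
}

# Invert the clusters into a word -> cluster-id lookup table.
WORD_TO_CLUSTER = {
    word: cluster_id
    for cluster_id, words in FEATURE_CLUSTERS.items()
    for word in words
}


def cluster_vector(text):
    """Count cluster occurrences in a short text; unclustered words are dropped."""
    tokens = text.lower().split()
    counts = Counter(WORD_TO_CLUSTER[t] for t in tokens if t in WORD_TO_CLUSTER)
    # A fixed cluster order gives every text a vector of the same small dimension.
    return [counts[c] for c in sorted(FEATURE_CLUSTERS)]


vec = cluster_vector("Late goal decides football match")
```

With eight feature words mapped to two clusters, each text is represented by a two-element vector rather than an eight-element one, which is the kind of dimension reduction the abstract refers to.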

A Lightweight Linked Data Reasoner Using Jena and Axis2

Chapter

Jun 2019

The Semantic Web is rapidly becoming a reality through the development of Linked Data in recent years. Linked Data uses the RDF data model to describe statements that link arbitrary data resources on the Internet. Following RDF links makes it possible to infer new data resources at runtime, and then to provide more complete answers as new data resources appear on the Internet. Linked Data thus provides the means to reach the goal of the Semantic Web. At present, however, Linked Data is used only to promote information sharing or exchange, not for semantic inference, because an easily shared inference engine is lacking. This study addresses the issue by developing a Lightweight Linked Data Reasoner (LLDR), which is based on the Jena reasoner and implemented in Apache Axis2. To illustrate the LLDR application, this study developed the Vehicle Ontology to annotate project documents from heterogeneous and distributed project resources as Linked Data.

Improving text classification with word embedding

Conference Paper

Dec 2017

Morphosyntactic Parser and Textual Corpora: Processing Uncommon Phenomena of Tibetan Language

Conference Paper

Jun 2017

This article analyzes the problems of parsing texts with linguistic phenomena of a controversial nature, which are rarely encountered in NLP projects focusing on Indo-European languages but are quite frequent in other languages, e.g. in the corpus of Tibetan Indigenous Grammatical Treatises. Parsing texts with such phenomena is therefore necessary for complete automatic morphosyntactic annotation of textual corpora. Development of the morphosyntactic analyzer for the Tibetan language started in 2016, and it has already proved quite useful for dealing with specific phenomena of Tibetan and with previously unsolvable issues of tokenization. The ultimate goal of the project is to create a consistent formal grammatical description (formal grammar) of the Tibetan language, including all grammar levels of the language system, from morphosyntax (syntactics of morphemes) to the syntax of composite sentences and supra-phrasal entities. The previously published version of the automatic morphosyntactic annotation was created on the basis of morphologically tagged corpora of Tibetan texts and had high, but not 100 percent, coverage (the ratio of the number of atoms covered by parse trees to the total number of atoms), precision and recall. This article describes the problems that had to be solved afterwards to develop the current version of the morphosyntactic parser, which achieves complete and correct automatic annotation of the corpus. It also describes the chosen solutions, which made it possible to obtain a complete morphosyntactic annotation of units previously treated as tokens (lexical tokens, words or other atomic parse elements), but required a substantial refactoring (restructuring existing code without changing its external behavior) of the formal grammar. Thus, not only the frequent constructions but all constructions turned out to be important in building the formal model.

