A labeled example of SR-CRF.

Source publication
Article
Full-text available
Constructing an ontology manually is a time-consuming, error-prone, and tedious task. We present SSCO, a self-supervised-learning-based Chinese ontology that contains about 255 thousand concepts, 5 million entities, and 40 million facts. We explore the three largest online Chinese encyclopedias for ontology learning and describe how to transfer the...

Similar publications

Article
Full-text available
Nowadays, there is a huge amount of textual data coming from online social communities like Twitter, as well as encyclopedic data provided by Wikipedia and similar platforms. This Big Data era has created novel challenges in making sense of large data stores and in efficiently finding specific information within them. In a more domain-...

Citations

... In some cases, methods or resources have been proposed to collect and generate training datasets automatically, without the need for manual intervention. As an example, Hu et al. [31] proposed online encyclopedias as a good source for automatically collecting and generating training datasets. ...
Article
Full-text available
With people's widespread access to the Internet and the increasing usage of social networks in all nations, social networks have become a new source for studying cultural similarities and differences. We identified major issues in traditional methods of data collection in cross-cultural studies: difficulty in accessing people from many nations, a limited number of samples, negative effects of translation, positive self-enhancement illusion, and a few unreported problems. These issues either make it difficult to perform a cross-cultural study or have negative impacts on the validity of the final results. In this paper, we propose a framework that aims to calculate the cultural distance among several countries using information and cultural features extracted from social networks. To this end, the framework estimates the distribution of news-oriented tweets for each nation and computes the cultural distance from these sets of distributions. Based on a sample of more than 17 million tweets from late 2017, our framework calculated the cultural distance between 22 countries. Our results show a positive correlation between cultural distances computed by our framework and distances computed from Hofstede's cultural scores, and also identify connections between some of the cultural features.
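The snippet does not spell out the distance measure, but the idea of computing a cultural distance from per-country distributions of news-oriented tweets can be sketched as follows; the use of Jensen-Shannon divergence and the toy topic shares are assumptions for illustration only, not the paper's exact formula.

```python
# Hypothetical sketch: cultural distance as a divergence between per-country
# distributions of news-oriented tweet topics. Jensen-Shannon divergence is
# assumed here for illustration; the paper may use a different measure.
import numpy as np

def jensen_shannon(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    def kl(a, b):
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Toy topic-share vectors (politics, sports, religion, economy) per country.
dist = {
    "US": [0.40, 0.30, 0.05, 0.25],
    "IR": [0.35, 0.15, 0.30, 0.20],
}
print(jensen_shannon(dist["US"], dist["IR"]))  # pairwise cultural distance
```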
... In this phase, a number of large-scale Chinese knowledge bases have also emerged, including Zhishi.me [16] and SSCO [17]. ... [20] and SNOMED-CT [21] promote standardization and interoperability for biomedical information systems and services. DrugBank [22] and SIDER [23] contain drug-related information. ...
Article
Full-text available
Background: Diabetes has become one of the hot topics in life science research. To support the analytical procedures, researchers and analysts spend considerable manual effort collecting experimental data, a process that is also error-prone. To reduce the cost and ensure data quality, there is a growing trend of extracting clinical events, in the form of knowledge, from electronic medical records (EMRs). To do so, we first need a high-coverage knowledge base (KB) of a specific disease to support the above extraction tasks, called KB-based extraction. Methods: We propose an approach to build a diabetes-centric knowledge base (a.k.a. DKB) by mining the Web. In particular, we first extract knowledge from semi-structured contents of vertical portals, fuse individual knowledge from each site, and further map it to a unified KB. The target DKB is then extracted from the overall KB using a distance-based Expectation-Maximization (EM) algorithm. Results: During the experiments, we selected eight popular vertical portals in China as data sources to construct DKB. There are 7703 instances and 96,041 edges in the final diabetes KB, covering diseases, symptoms, western medicines, traditional Chinese medicines, examinations, departments, and body structures. The accuracy of DKB is 95.91%. Besides the quality assessment of knowledge extracted from vertical portals, we also carried out detailed experiments evaluating the knowledge-fusion performance as well as the convergence of the distance-based EM algorithm, with positive results. Conclusions: In this paper, we introduced an approach to constructing DKB. A knowledge extraction and fusion pipeline was first used to extract semi-structured data from vertical portals, and individual KBs were further fused into a unified knowledge base. After that, we developed a distance-based Expectation-Maximization algorithm to extract a subset from the overall knowledge base, forming the target DKB. Experiments showed that the data in DKB are rich and of high quality.
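The abstract names a distance-based EM algorithm for carving the diabetes-centric subset out of the overall KB. A minimal sketch of that idea, assuming a two-component 1-D Gaussian mixture over each instance's graph distance to a diabetes seed set (the paper's exact model is not reproduced here):

```python
# Illustrative sketch (not the authors' exact algorithm): a two-component EM
# over graph distances to a diabetes seed set. Instances whose distance is
# better explained by the "in-domain" component are kept for the target DKB.
import numpy as np

def em_1d(distances, iters=50):
    d = np.asarray(distances, dtype=float)
    # Initialize: a near (in-domain) component and a far (out-of-domain) one.
    mu = np.array([d.min(), d.max()])
    var = np.array([d.var() + 1e-6] * 2)
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: posterior responsibility of each component per instance.
        lik = pi * np.exp(-(d[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        resp = lik / lik.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixture weights, means, and variances.
        nk = resp.sum(axis=0)
        mu = (resp * d[:, None]).sum(axis=0) / nk
        var = (resp * (d[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        pi = nk / len(d)
    return resp[:, 0]  # probability of belonging to the near component

dists = [0.1, 0.2, 0.15, 2.5, 3.0, 0.3, 2.8]  # toy distances to seed concepts
keep = em_1d(dists) > 0.5
print(keep)  # True -> instance goes into the diabetes-centric KB
```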
... Here is a list of tools recommended by the W3C, since it promotes web standards. However, there are others (the Chinese Emotion Ontology (Hu et al., 2014), EMOTIME, Senti-TUT). ...
Conference Paper
Full-text available
The immense contribution of the social web has greatly motivated researchers. This has led to the emergence of techniques that have proven their effectiveness in customized opinion and emotion modeling for applications such as NLP and machine learning. On the other hand, when it comes to interoperability and a unique encoding of opinions and emotions, there are some weaknesses. This has prompted a new research direction that combines work on opinion analysis with work on Linked Data. In this article, we present different solutions and projects along with some of their limitations. The reasons why we believe linked data is important, and how we would like to conduct this research, are also detailed in this article.
... A domain ontology is a formal description of a discourse domain. It typically consists of a finite list of concepts and the relationships between these concepts [4]. When using the domain ontology in an extraction task, we only need to parse the reviews with shallow semantic analysis (tokenization and part-of-speech tagging) and then run heuristic matching algorithms to get the features. ...
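A minimal sketch of this matching idea, assuming NLTK for tokenization and POS tagging and a made-up four-term ontology lexicon (the cited work's actual ontology and heuristics are richer):

```python
# Minimal sketch: POS-tag a review, then look up noun tokens in a
# domain-ontology lexicon to extract product features. The lexicon is
# invented for illustration; assumes nltk plus its 'punkt' and
# 'averaged_perceptron_tagger' data packages are installed.
import nltk

ONTOLOGY_CONCEPTS = {"battery", "screen", "camera", "charger"}  # hypothetical

def extract_features(review):
    tokens = nltk.word_tokenize(review.lower())
    tagged = nltk.pos_tag(tokens)
    # Heuristic: a noun that appears in the ontology is a product feature.
    return [w for w, tag in tagged if tag.startswith("NN") and w in ONTOLOGY_CONCEPTS]

print(extract_features("The battery drains fast and the screen cracked."))
# -> ['battery', 'screen']
```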
... Since the infobox and the navbox are edited by users according to specified forms, they are written with reference to professional dictionaries, books, or research articles [6]. In [4], researchers adopt the titles of documents as terms for concepts and instances and use the infobox modules for attribute learning. [7] describes an autonomous system for refining Wikipedia's infoboxes into an ontology. ...
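To make the infobox-based attribute learning concrete, here is a toy sketch in which the page title serves as the concept term and each "| key = value" row becomes an attribute triple. Real encyclopedia markup (nested templates, references) is messier; the regex below only handles the simple case.

```python
# Toy sketch of infobox-based attribute learning: the page title becomes the
# concept term and each "| key = value" row becomes an attribute triple.
import re

def infobox_triples(title, infobox_text):
    triples = []
    for line in infobox_text.splitlines():
        m = re.match(r"\s*\|\s*([^=]+?)\s*=\s*(.+)", line)
        if m:
            triples.append((title, m.group(1), m.group(2).strip()))
    return triples

raw = """{{Infobox city
| country = China
| population = 24,870,895
}}"""
print(infobox_triples("Shanghai", raw))
# -> [('Shanghai', 'country', 'China'), ('Shanghai', 'population', '24,870,895')]
```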
... [6] and [10] derive patterns from PoS-tagging and syntactic-parsing results, such as NN (Noun-Noun) and S-V-O (Subject-Verb-Object). In [4], researchers summarize several synonymous-relationship patterns by calculating the frequency of certain phrases within 1000 sentences. However, linguistic patterns rely heavily on the individual language. ...
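The frequency-based pattern selection in [4] can be illustrated with a small sketch: count how often each candidate synonymy phrase occurs in a sentence sample and rank the patterns. The English patterns below are illustrative stand-ins for the Chinese phrases the paper actually counts.

```python
# Rough sketch of ranking candidate synonymy patterns by corpus frequency,
# in the spirit of counting phrases over a sample of sentences.
from collections import Counter

PATTERNS = ["also known as", "also called", "referred to as", "abbreviated as"]

def rank_patterns(sentences):
    counts = Counter()
    for s in sentences:
        for p in PATTERNS:
            if p in s.lower():
                counts[p] += 1
    return counts.most_common()

corpus = [
    "NLP, also known as natural language processing, is a field of AI.",
    "The disease is also called diabetes mellitus.",
    "SVM, also known as support vector machine, is a classifier.",
]
print(rank_patterns(corpus))
# -> [('also known as', 2), ('also called', 1)]
```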
Conference Paper
With the rapid development of e-commerce in China, the quality of products on online shopping platforms has caused wide concern. Customer reviews, written by people who bought the product, have become one of the most important resources for analyzing a product's quality risk. We can get fine-grained, aspect-oriented risk information about a product by mining its reviews. Unfortunately, people tend to write reviews with casual grammar or simply omit parts of a sentence. Both of these features have negative impacts when parsing raw customer reviews directly. Thus a knowledge base built entirely independently of the reviews can be used to analyze them despite the drawbacks above. In this paper, we generate a domain ontology from raw text in an online encyclopedia. It can be viewed as a graph whose nodes represent domain concepts and whose edges represent the relations between these concepts. In our work, we integrate syntactic tree structure into linear-chain CRFs for recognizing domain concepts and train SVM and MaxEnt models on elaborate features for classifying three types of relationships, namely "Attribute-of", "Part-of", and "Instance-of". Once the ontology has been built, product properties with potential risk are extracted by our matching method. Experiments show that our approach achieves 64.4% precision and 82.4% recall on the risky-property extraction task.
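A hedged sketch of the concept-recognition step described here: a linear-chain CRF over token features, with a placeholder feature standing in for the syntactic-tree information the paper integrates. It uses the sklearn-crfsuite package; the feature set and toy data are invented for illustration.

```python
# Sketch of domain-concept recognition with a linear-chain CRF. The
# "tree_depth" feature is a stand-in for syntactic-tree features (an
# assumption); training data here is a single toy sentence.
import sklearn_crfsuite

def token_features(sent, i):
    word = sent[i][0]
    return {
        "word": word.lower(),
        "is_first": i == 0,
        "prev_word": sent[i - 1][0].lower() if i > 0 else "<BOS>",
        # Placeholder for features derived from a parse tree (assumption):
        "tree_depth": sent[i][1],
    }

# Toy training data: (word, tree_depth) pairs with BIO concept labels.
train_sents = [[("battery", 3), ("life", 3), ("is", 2), ("short", 2)]]
train_labels = [["B-CONCEPT", "I-CONCEPT", "O", "O"]]

X = [[token_features(s, i) for i in range(len(s))] for s in train_sents]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, train_labels)
print(crf.predict(X))
```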
... In the education domain, knowledge points are the basic elements, and the relationships between them are the foundation. Hence, automatic extraction of knowledge is the key to ontology learning [7]. Generally, there are three approaches to automatic knowledge extraction in the field of education: linguistic methods, statistical methods, and hybrid methods [8]. ...
Article
Full-text available
In recent years, Massive Open Online Courses (MOOCs) have become very popular among college students and have a powerful impact on academic institutions. In the MOOC environment, knowledge discovery and knowledge sharing are very important, and they are currently often achieved with ontology techniques. In building ontologies, automatic extraction technology is crucial. Because general text-mining algorithms perform poorly on online course material, we designed an automatic extraction of course knowledge points (AECKP) algorithm for online courses. It includes document classification, Chinese word segmentation, and POS tagging for each document. The Vector Space Model (VSM) is used to calculate similarity and to design weights that adjust the TF-IDF output values; the terms with the highest scores are selected as knowledge points. Course documents for "C programming language" were selected for the experiment in this study. The results show that the proposed approach achieves satisfactory accuracy and recall rates.
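A simplified sketch of the scoring idea: compute TF-IDF over course documents, weight each document's contribution by its VSM cosine similarity to the course centroid, and keep the top-scoring terms as candidate knowledge points. The centroid-based weighting is a guess at how the similarity enters; the paper's exact weight design is not reproduced.

```python
# Sketch of TF-IDF knowledge-point scoring with a VSM similarity weight.
# The weighting scheme is an assumption, not the paper's exact formula.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

docs = [
    "pointers and arrays in the C programming language",
    "for loops and while loops control flow",
    "functions parameters and return values",
]
vec = TfidfVectorizer()
tfidf = vec.fit_transform(docs)                 # document-term TF-IDF matrix
centroid = np.asarray(tfidf.mean(axis=0))       # course centroid in the VSM
sim = cosine_similarity(centroid, tfidf).ravel()  # doc-to-centroid similarity

# Weight each document's TF-IDF row by its similarity, then sum per term.
scores = np.asarray(tfidf.multiply(sim[:, None]).sum(axis=0)).ravel()
top = np.argsort(scores)[::-1][:5]
print([vec.get_feature_names_out()[i] for i in top])  # candidate knowledge points
```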
Conference Paper
The prevailing way to construct an ontology is to rely on ontology experts for manual construction. Because manual construction requires extensive human participation, it has great limitations. Since text is one of the main forms of source data, how to construct a domain ontology automatically from texts, and how to use the ontology to quickly provide semantic retrieval over text, are current hotspots of ontology research. Aiming at these problems, an automatic construction method for domain ontologies based on knowledge graphs and association rule mining is presented. It can extract the concepts, hierarchical relations, and non-hierarchical relations of a domain ontology from text, and finally forms the ontology with Jena. It also provides semantic retrieval of text by associating texts with concepts during ontology construction. Finally, the effectiveness of automatic ontology construction is verified through the quality of text retrieval.
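The association-rule-mining step can be illustrated with a short sketch: concept pairs that co-occur across documents with sufficient support and confidence are proposed as non-hierarchical relations. The thresholds and toy corpus are illustrative; the paper's full pipeline (and the final Jena serialization) is not reproduced.

```python
# Sketch of mining non-hierarchical concept relations via association rules:
# a rule X -> Y is kept if the pair's document co-occurrence count (support)
# and the conditional frequency (confidence) clear illustrative thresholds.
from itertools import combinations
from collections import Counter

def mine_rules(doc_concepts, min_support=2, min_conf=0.6):
    single = Counter()
    pair = Counter()
    for concepts in doc_concepts:
        cs = set(concepts)
        single.update(cs)
        pair.update(frozenset(p) for p in combinations(sorted(cs), 2))
    rules = []
    for p, sup in pair.items():
        if sup < min_support:
            continue
        a, b = tuple(p)
        for x, y in ((a, b), (b, a)):
            conf = sup / single[x]
            if conf >= min_conf:
                rules.append((x, y, sup, round(conf, 2)))
    return rules

docs = [
    ["ontology", "concept", "hierarchy"],
    ["ontology", "concept"],
    ["ontology", "retrieval"],
]
print(mine_rules(docs))
# -> rules linking 'ontology' and 'concept' with support 2
```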
Conference Paper
This paper proposes a machine-learning-based electric vehicle (EV) reserved charging service system that takes into consideration impacts from both the power system and the transportation system. The proposed charging-network operation service platform links the power system with the transportation system through the charging navigation of massive numbers of EVs. The "reserved charging + consumption" integrated service model would be of great significance for dealing with large-scale integration of electric vehicles. The system applies the concept of a charging time window to optimize EV charging prediction for the reserved charging service, and designs a dynamic dispatching model based on a sliding time axis that frees users' charging processes from the constraints of queuing time and the charging-service-fee period.
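A hypothetical sketch of the charging-time-window idea: each reservation specifies a feasible window on the time axis, and a greedy dispatcher fills the earliest free slot within that window, serving tighter windows first. Slot granularity, station capacity, and the greedy rule are all assumptions; the paper's optimization model is not reproduced.

```python
# Toy dispatch on a sliding time axis: assign each EV reservation the
# earliest free slot inside its requested window [earliest, latest].
def dispatch(requests, n_stations, horizon):
    occupancy = [0] * horizon  # chargers busy per time slot
    plan = {}
    # Serve tighter windows first so they are not crowded out.
    for ev, (earliest, latest) in sorted(requests.items(), key=lambda r: r[1][1] - r[1][0]):
        for t in range(earliest, latest + 1):
            if occupancy[t] < n_stations:
                occupancy[t] += 1
                plan[ev] = t
                break
        else:
            plan[ev] = None  # no feasible slot; deferred to the next horizon
    return plan

reqs = {"EV1": (0, 2), "EV2": (0, 0), "EV3": (1, 3)}
print(dispatch(reqs, n_stations=1, horizon=4))
# -> {'EV2': 0, 'EV1': 1, 'EV3': 2}
```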
Chapter
In order to improve the integration and access efficiency of agricultural information, this paper proposes an agricultural information integration framework based on a knowledge graph. A knowledge graph of agricultural product production and management was constructed, covering the basic process of “planting - farming - processing - quality inspection - warehousing - transportation - sales” and realizing the storage, mapping, and querying of the knowledge graph. The framework improves the method of mapping data linkage based on database mapping relations, realizing the conversion of database elements into knowledge-graph elements, while the iterative discovery of relations and patterns in text information is achieved by weakly supervised machine learning. The method is integrated into the Green-Cloud-Grid platform and improves the efficiency of information-source integration, correlation analysis, and mining utilization on the platform.
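The database-to-knowledge-graph conversion described here can be sketched as a mapping table that names the entity-key column and maps the remaining columns to predicates, turning each row into triples. The table, column, and predicate names below are invented for the example.

```python
# Illustrative sketch of a database-to-knowledge-graph mapping: a mapping
# declaration converts each database row into (subject, predicate, object)
# triples. All names here are hypothetical.
MAPPING = {
    "table": "product_batch",
    "key_column": "batch_id",
    "columns": {"origin": "producedAt", "crop": "hasCrop", "inspector": "inspectedBy"},
}

def rows_to_triples(rows, mapping):
    triples = []
    for row in rows:
        subject = f'{mapping["table"]}/{row[mapping["key_column"]]}'
        for col, predicate in mapping["columns"].items():
            if row.get(col) is not None:
                triples.append((subject, predicate, row[col]))
    return triples

rows = [{"batch_id": "B001", "origin": "Shandong", "crop": "wheat", "inspector": "QA-3"}]
for t in rows_to_triples(rows, MAPPING):
    print(t)
# -> ('product_batch/B001', 'producedAt', 'Shandong'), ...
```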