FIGURE 5 - available via license: Creative Commons Attribution 4.0 International
Source publication
Extracting software features from public product descriptions written in natural language is beneficial for developing new products. Because software features are often expressed as phrases, many current approaches define phrase patterns and extract matching phrases from product descriptions as features. However, there are often lot...
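The pattern-based idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the tokens have already been part-of-speech tagged by some external tool; the tag names, the `PATTERNS` list, and the `extract_phrases` helper are all hypothetical and are not taken from the source publication.

```python
from typing import List, Sequence, Tuple

Token = Tuple[str, str]  # (word, coarse part-of-speech tag)

# Hypothetical phrase patterns, each a sequence of coarse POS tags.
PATTERNS: List[Tuple[str, ...]] = [
    ("VERB", "NOUN"),
    ("VERB", "ADJ", "NOUN"),
    ("ADJ", "NOUN"),
]


def extract_phrases(tokens: Sequence[Token],
                    patterns: Sequence[Tuple[str, ...]] = PATTERNS) -> List[str]:
    """Return every token window whose tag sequence equals one of the patterns."""
    phrases = []
    for start in range(len(tokens)):
        for pattern in patterns:
            window = tokens[start:start + len(pattern)]
            # A window shorter than the pattern (end of input) never matches.
            if tuple(tag for _, tag in window) == tuple(pattern):
                phrases.append(" ".join(word for word, _ in window))
    return phrases


# Hand-tagged example sentence (an assumption; a real pipeline would use a tagger).
tagged = [
    ("share", "VERB"), ("photos", "NOUN"), ("with", "ADP"),
    ("close", "ADJ"), ("friends", "NOUN"),
]
print(extract_phrases(tagged))  # ['share photos', 'close friends']
```

A sketch like this returns every matching phrase, so it has no way of separating genuine features from incidental matches.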
Similar publications
We contribute to the debate on societal impact of SSH by developing a methodology that allows a fine-grained observation of social groups that make use, directly or indirectly, of the results of research. We develop a lexicon of users with 76,857 entries, which saturates the semantic field of social groups of users and allows normalization. We use...
The purpose of this study is to examine the psychological structure of “images of KENDO” in university Kendo club students. This study included 235 subjects (89 male, 146 female), who were asked to write the 10 sentences that came to mind after the phrase “Kendo is” on a questionnaire. Of those sentences, they were asked to choose the one sentence...
The aim of the article is to identify unique image attributes of Poznań, Wrocław and Bratislava. The study analysed opinions posted on the English-language TripAdvisor website ('Things to do…' category for the three cities). The 76 most-emerging word attributes were extracted from 29,383 reviews with a text mining procedure. These words d...
The purpose of this study is to extract features from consultation records accumulated in a cancer consultation and support center database from 2009 to 2018, structure them using text mining, and analyze how they change over time. The text-mining approach worked effectively under conditions of expanding data, and a co-occurrence network revealed patter...
The task of recognizing arguments and their components in text is known as argument extraction. Most arguments can be broken down into a petition and at least one premise that supports it. This work proposes a method for extracting arguments. The major words which are of high importance in argument extraction were included in the suggested met...
Citations
... [7] Textual data inherently overlap, and some researchers have focused on this issue (e.g. [31]-[34]). In this research, we introduce a topological overlapping clustering algorithm that is suitable for textual data. ...
Text clustering is used to extract specific information from textual data and to categorize text by topic and sentiment. Because textual documents inherently overlap, overlapping clustering algorithms have become a suitable approach for text analysis. However, state-of-the-art algorithms are not fast enough to analyse a large volume of textual data within tolerable time limits. In this research, we propose a text clustering algorithm, FOCT, which is a fast overlapping extension of SOM, one of the best algorithms for clustering textual data. We apply heuristics that exploit special characteristics of textual data to obtain a very fast overlapping clustering algorithm. We use fast methods to represent document vectors, to compute the similarity between documents and neurons, and to update neuron weights. In our algorithm, each document can belong to one or more neurons, which reflects the fact that many documents naturally cover more than one topic. We compare the efficiency of the proposed algorithm with the k-means, OKM, SOM and OSOM clustering approaches and experimentally demonstrate that it runs 12 to 690 times faster and that the overlap size of FOCT clusters is closer to the overlap size of the original data. Cluster quality is also measured with four internal and external evaluation criteria, on which FOCT clusters show up to 64% better quality.
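To illustrate what it means for a document to belong to more than one neuron, here is a minimal sketch of an overlapping assignment step. It is not the FOCT rule, which the abstract does not specify: the cosine-similarity comparison, the `ratio` threshold, and the function names are assumptions chosen only for illustration.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def overlapping_assign(doc: Sequence[float],
                       neurons: Sequence[Sequence[float]],
                       ratio: float = 0.9) -> List[int]:
    """Assign a document to every neuron whose similarity is close to the best one.

    `ratio` is a hypothetical threshold: a neuron is kept when its similarity
    is at least `ratio` times the best similarity. With ratio=1.0 only the
    best-matching neurons are kept, as in a non-overlapping SOM.
    """
    sims = [cosine(doc, weights) for weights in neurons]
    best = max(sims)
    if best <= 0.0:
        return []  # the document is not similar to any neuron
    return [i for i, s in enumerate(sims) if s >= ratio * best]


# Toy neuron weight vectors over a three-term vocabulary (assumed values).
neurons = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(overlapping_assign([1.0, 1.0, 0.0], neurons))  # [0, 1]
print(overlapping_assign([1.0, 0.0, 0.0], neurons))  # [0]
```

In this sketch the first document is equally similar to two neurons and is assigned to both, while the second matches a single neuron.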