Figure 3. WordCloud of Real News. Figure 4. WordCloud of Fake News.

Source publication
Article (full-text available)
The spread of fake news on the Internet misleads people into false understandings of certain events and leads them to make ill-advised decisions. This widespread misinformation can pose a significant threat to modern society politically, economically, and ethnically. In this paper, multiple methods and approaches are mainly discussed for text-data preproces...
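The truncated abstract points to text-data preprocessing as a central step. As a rough illustration of the kind of cleaning typically applied before word-frequency analysis, a minimal sketch is given below; the function name and the use of scikit-learn's built-in English stop-word list are assumptions for illustration, not details taken from the paper.

```python
import re
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS

def preprocess(text: str) -> list[str]:
    """Lowercase, keep alphabetic tokens only, and drop common English stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in ENGLISH_STOP_WORDS]

# Example usage on a single headline
print(preprocess("BREAKING: Scientists reveal the TRUTH about the economy!"))
# -> ['breaking', 'scientists', 'reveal', 'truth', 'economy']
```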

Context in source publication

Context 1
... a WordCloud is created to visualize the content of the news/job postings and to give a clear view of what appears most frequently in real/fake news. Figure 3 and Figure 4 show the WordClouds of real news and fake news in Dataset 1. ...
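As an illustration of how such a figure can be produced, here is a minimal sketch using the Python wordcloud library; the input variable `real_news_text` (all real-news articles joined into one string) and the figure dimensions are assumptions for illustration, not code from the paper.

```python
import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS

# Assumed input: all real-news articles concatenated into one string
real_news_text = " ".join([
    "the president signed the bill into law",
    "markets closed higher today on strong earnings",
])

# Build the word cloud, ignoring common English stop words
wc = WordCloud(width=800, height=400, background_color="white",
               stopwords=STOPWORDS).generate(real_news_text)

plt.imshow(wc, interpolation="bilinear")
plt.axis("off")
plt.title("WordCloud of Real News")
plt.show()
```

The same call on the concatenated fake-news text would produce the companion figure (Figure 4).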

Similar publications

Presentation (full-text available): Authenticity, Purity Criteria, and Detection of Adulteration of Fats and Oils Using Modern Analytical Methods

Citations

Preprint (full-text available)
GPT, or Generative Pre-trained Transformer, revolutionizes the efficiency of reviewing research and development (R&D) publications by leveraging its natural language processing capabilities. Traditional methods of literature review are time-consuming and labor-intensive, often requiring researchers to sift through vast amounts of text to extract relevant information. GPT streamlines this process through its ability to comprehend, analyze, and summarize text swiftly. Firstly, GPT can quickly scan through a plethora of publications, identifying key concepts, methodologies, and findings. Its advanced language understanding enables it to grasp complex scientific language and extract essential information accurately. This rapid processing significantly reduces the time researchers spend on manual literature review. Secondly, GPT generates concise summaries of research articles, condensing lengthy texts into digestible snippets without sacrificing critical details. These summaries offer researchers a quick overview of the content, allowing them to prioritize articles based on relevance to their own work. Moreover, GPT can assist in identifying connections between different studies and trends within a particular field. By analyzing large volumes of literature, it can detect patterns, emerging topics, and gaps in knowledge, guiding researchers towards fruitful areas for further investigation. Additionally, GPT can aid in writing literature reviews by suggesting structured outlines and integrating relevant information seamlessly. This not only saves time but also enhances the quality and coherence of the review. Overall, GPT's ability to expedite the process of reviewing R&D publications is invaluable in accelerating the pace of scientific discovery and innovation. By automating tedious tasks and providing insightful analyses, GPT empowers researchers to focus their efforts on advancing knowledge and solving pressing challenges.
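Purely as an illustration of the summarization step this abstract describes, a minimal sketch using the OpenAI Python client is shown below; the model name, prompt wording, and helper function are assumptions for illustration, not details from the preprint.

```python
from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

def summarize_abstract(abstract: str) -> str:
    """Ask a GPT chat model for a short, structured summary of one research abstract."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice; any chat-capable model works
        messages=[
            {"role": "system",
             "content": "Summarize the research abstract in three bullet points: "
                        "problem, method, and main finding."},
            {"role": "user", "content": abstract},
        ],
    )
    return response.choices[0].message.content

# Example usage
print(summarize_abstract("We propose a text-mining pipeline that extracts ..."))
```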
Preprint (full-text available)
The use of text-mining for knowledge synthesis, including literature reviews, has been gaining traction as modern Natural Language Processing tools have progressed. However, there is a trade-off between accuracy and precision on the one hand and workload time on the other: the data-extraction methods with the longest workload times typically have the best validation metrics, manual data extraction being an example. Currently, only a limited number of text-mining tools have demonstrated data extraction from research texts within minutes while maintaining acceptable validation metrics. By applying a previously demonstrated generalized approach to determining research production with modern text-mining tools for extracting research locations and subtopics, a methodology with higher-than-average validation metrics and significantly improved workload times is shown. Coupling two reliable Named Entity Recognition toolkits for research-location extraction, a Latent Dirichlet Allocation model for unsupervised subtopic detection, and a Large Language Model for subtopic clustering can extract research locations from more than 1,000 public health research articles and cluster their subtopics in less than 5 minutes. Validation returns F1 measures of 92% for research-location extraction and 71% for subtopic clustering. Additionally, applying these results to a previously constructed spatiotemporal generalization process reproduces the generalized results with a correlation coefficient of 97%, which differs significantly from random chance (p < 0.001). The mapped fitted values are reliably close to the validation model's fitted values.
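As a rough sketch of the kind of pipeline this preprint describes, the snippet below combines an off-the-shelf NER model for location extraction with an LDA topic model, using spaCy and scikit-learn; the specific toolkits, model names, and parameters are assumptions for illustration, not the preprint's actual configuration, and the third step (LLM-based subtopic clustering) is omitted for brevity.

```python
import spacy
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "Malaria prevalence was surveyed in rural Kenya and Uganda.",
    "Air-quality sensors were deployed across Los Angeles neighborhoods.",
]

# Step 1: named-entity recognition for research locations (GPE = geopolitical entity)
nlp = spacy.load("en_core_web_sm")
locations = [[ent.text for ent in nlp(d).ents if ent.label_ == "GPE"] for d in docs]

# Step 2: unsupervised subtopic detection with Latent Dirichlet Allocation
vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(dtm)
subtopics = lda.transform(dtm).argmax(axis=1)  # dominant topic index per article

print(locations)   # e.g. [['Kenya', 'Uganda'], ['Los Angeles']]
print(subtopics)
```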