Workflow of profane words extraction.

Source publication
Article
Full-text available
The popularity of social networking sites (SNS) has facilitated communication between users. The usage of SNS helps users in their daily life in various ways such as sharing of opinions, keeping in touch with old friends, making new friends, and getting information. However, some users misuse SNS to belittle or hurt others using profanities, which...

Context in source publication

Context 1
... set up an experiment to test the aforementioned hypothesis and answer the research questions. Thus, this section proposes a workflow for analyzing the distribution of profane words across the different cyberbullying roles in the corpus, as illustrated in Fig. 1. The workflow involves four steps: 1) dataset, 2) data preparation, 3) data preprocessing, and 4) profane word ...
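The four-step workflow can be sketched as a minimal pipeline. The mini-corpus, role labels, field names, and profanity lexicon below are invented for illustration and stand in for the paper's actual dataset and word list:

```python
import re
from collections import Counter

# Hypothetical mini-corpus: each post is tagged with a cyberbullying role
# (steps 1-2: dataset and data preparation).
CORPUS = [
    {"role": "bully",     "text": "You are a stupid idiot, nobody likes you!!!"},
    {"role": "victim",    "text": "Please stop, I did nothing wrong."},
    {"role": "bystander", "text": "This idiot thread is getting out of hand."},
]

# Illustrative profanity lexicon (a stand-in for a real word list).
PROFANE = {"stupid", "idiot"}

def preprocess(text):
    """Step 3: lowercase and tokenize, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())

def profane_counts_by_role(corpus):
    """Step 4: count profane-word occurrences per cyberbullying role."""
    counts = {}
    for post in corpus:
        tokens = preprocess(post["text"])
        role_counter = counts.setdefault(post["role"], Counter())
        role_counter.update(t for t in tokens if t in PROFANE)
    return counts

counts = profane_counts_by_role(CORPUS)
print(counts["bully"])  # → Counter({'stupid': 1, 'idiot': 1})
```

The per-role counters can then be normalized into the weights and percentages that the study compares across roles.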

Citations

... Another line of research focuses on the detection and identification of cybercrimes and harassment incidents. Machine learning and data mining techniques have been utilized to analyze large datasets and identify patterns indicative of different forms of cybercrimes, including online harassment, cyberbullying, and fraud (Andleeb et al. 2019; Al-Garadi et al. 2019; Ali, Mohd, and Fauzi 2021). These studies often employ supervised learning algorithms to train classifiers that can automatically detect and categorize instances of cybercrimes based on features, such as text content, user behavior, or network traffic (Talpur and O'Sullivan 2020; Mahor et al. 2021). ...
Article
In the digital age, cybercrimes, particularly cyber harassment, have become pressing issues, targeting vulnerable individuals like children, teenagers, and women. Understanding the experiences and needs of the victims is crucial for effective support and intervention. Online conversations between victims and virtual harassment counselors (chatbots) offer valuable insights into cyber harassment manifestations (CHMs) and determinants (CHDs). However, the distinction between CHMs and CHDs remains unclear. This research is the first to introduce concrete definitions for CHMs and CHDs, investigating their distinction through automated methods to enable efficient cyber-harassment dialogue comprehension. We present a novel dataset, Cyber-MaD that contains Cyber harassment dialogues manually annotated with Manifestations and Determinants. Additionally, we design an Emotion-informed Contextual Dual attention Convolution Transformer (E-ConDuCT) framework to extract CHMs and CHDs from cyber harassment dialogues. The framework primarily: a) utilizes inherent emotion features through adjective-noun pairs modeled by an autoencoder, b) employs a unique Contextual Dual attention Convolution Transformer to learn contextual insights; and c) incorporates a demarcation module leveraging task-specific emotional knowledge and a discriminator loss function to differentiate manifestations and determinants. E-ConDuCT outperforms the state-of-the-art systems on the Cyber-MaD corpus, showcasing its potential in the extraction of CHMs and CHDs. Furthermore, its robustness is demonstrated on the emotion cause extraction task using the CARES_CEASE-v2.0 dataset of suicide notes, confirming its efficacy across diverse cause extraction objectives. Access the code and data at 1. https://www.iitp.ac.in/~ai-nlp-ml/resources.html#E-ConDuCT-on-Cyber-MaD, 2. https://github.com/Soumitra816/Manifestations-Determinants.
... According to different research (Kowalski et al., 2014; W. N. H. W. Ali et al., 2021), several distinct categories of online threats have been identified among users, teenagers, and individuals, as is evident in the emergence of cyberbullying. Cyberbullying, often referred to as online harassment, entails the use of the Internet for the purpose of intimidating and harassing others. ...
Preprint
Cyberbullying has escalated with social media's rapid growth, endangering internet security, and a comprehensive system is needed to recognize and address these harmful behaviors quickly. This work uses machine learning (ML) to research cyberbullying on Twitter. The model is enhanced with an Adaptive External Dictionary (AED), built from 47K Kaggle tweets, which improves detection. In the first portion, negative and positive terms relevant to the topic are produced by manual refinement; AED sentiment analysis then creates dynamic lists of Positive Words (PW) and Negative Words (NW). The dataset has positive and negative tweet columns, and combining the online data with positive and negative word counts identifies cyberbullying.
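The count-based idea in this abstract can be sketched as follows; the PW/NW lists and the simple majority rule below are invented stand-ins for the AED's dynamically produced lists and the paper's ML classifier:

```python
# Illustrative positive/negative word lists (stand-ins for the AED's
# dynamically produced PW/NW lists).
POSITIVE_WORDS = {"great", "kind", "love", "thanks"}
NEGATIVE_WORDS = {"loser", "ugly", "hate", "pathetic"}

def word_counts(tweet):
    """Count positive and negative words in a tweet (case-insensitive)."""
    tokens = tweet.lower().split()
    pw = sum(t.strip(".,!?") in POSITIVE_WORDS for t in tokens)
    nw = sum(t.strip(".,!?") in NEGATIVE_WORDS for t in tokens)
    return pw, nw

def looks_like_bullying(tweet):
    """Flag a tweet when negative words outnumber positive ones.
    This threshold rule is a simplification of the paper's ML model."""
    pw, nw = word_counts(tweet)
    return nw > pw

print(looks_like_bullying("You are such a loser, I hate you"))  # → True
print(looks_like_bullying("Thanks, that was great!"))           # → False
```

In the actual system, the PW/NW counts would be features fed to a trained classifier rather than a hard-coded rule.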
... Also in Malaysia, language use was found to correlate with people's intentions: insults did more than degrade and belittle in self-deprecating body-shaming posts; insults also helped posters save face and reduced backlash from other netizens (Tan, 2019). Mohd et al. (2021) revealed that the profane words used in different cyberbullying roles were somewhat similar but featured distinct weights and percentages, which could guide cyberbullying detection. In the Chinese language specifically, a linguistic analysis of a Chinese cyberbullying incident revealed that bullies tended to use negative words, derogatory nouns, and more second-person pronouns (e.g., "you") or the victim's real name to accuse the victim. ...
Article
As a worldwide epidemic in the digital age, cyberbullying is a pertinent but understudied concern—especially from the perspective of language. Elucidating the linguistic features of cyberbullying is critical both to preventing it and to cultivating ethical and responsible digital citizens. In this study, a mixed-method approach integrating lexical feature analysis, sentiment polarity analysis, and semantic network analysis was adopted to develop a deeper understanding of cyberbullying language. Five cyberbullying cases on Chinese social media were analyzed to uncover explicit and implicit linguistic features. Results indicated that cyberbullying comments had significantly different linguistic profiles than non-bullying comments and that explicit and implicit bullying were distinct. The content of cases further suggested that cyberbullying language varied in the use of words, types of cyberbullying, and sentiment polarity. These findings offer useful insight for designing automatic cyberbullying detection tools for Chinese social networking platforms. Implications also offer guidance for regulating cyberbullying and fostering ethical and responsible digital citizens.
... There are various types of potentially harmful content in social media such as misinformation and fake news [1], aggression [2], cyber-bullying [3,4], pejorative language [5], offensive language [6], online extremism [7], to name a few. The automatic identification of problematic content has been receiving significant attention from the AI and NLP communities. ...
Preprint
The widespread presence of offensive content online, such as hate speech, poses a growing societal problem. AI tools are necessary to support the moderation process at online platforms. Evaluating these identification tools requires continuous experimentation with data sets in different languages. The HASOC track (Hate Speech and Offensive Content Identification) is dedicated to developing benchmark data for this purpose. This paper presents the HASOC subtrack for English, Hindi, and Marathi. The data set was assembled from Twitter. This subtrack has two sub-tasks: Task A is a binary classification problem (Hate and Not Offensive) offered for all three languages; Task B is a fine-grained classification problem with three classes, HATE (hate speech), OFFENSIVE, and PROFANITY, offered for English and Hindi. Overall, 652 runs were submitted by 65 teams. The best classification algorithms for Task A achieve F1 measures of 0.91, 0.78, and 0.83 for Marathi, Hindi, and English, respectively. This overview presents the tasks and the data development as well as the detailed results. The systems submitted to the competition applied a variety of technologies. The best performing algorithms were mainly variants of transformer architectures.
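The F1 measures reported for these tasks combine precision and recall. A minimal sketch, with example confusion counts invented for illustration:

```python
def f1_score(tp, fp, fn):
    """F1: harmonic mean of precision and recall for binary classification."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical confusion counts for a binary hate/offensive classifier:
# 90 true positives, 10 false positives, 12 false negatives.
print(round(f1_score(tp=90, fp=10, fn=12), 2))  # → 0.89
```

Because it ignores true negatives, F1 is a common choice for imbalanced tasks like offensive-content detection, where the positive class is rare.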
Article
Cyber-aggression is a global epidemic affecting citizens of cyberspace, without regard to physical, geographical, and time constraints. Recent research has identified the significant role of cyber-bystanders in exacerbating and de-escalating the incidents of cyber-aggression they come across. Additionally, frequent exposure to cyber-aggression has been associated with negative effects on participants of cyber-aggression, ranging from self-esteem problems to mental health disorders such as depression and anxiety, and in the worst cases even suicidal behaviors and ideation. Moreover, past research has also shown that negative bystanders could potentially become aggressors themselves. Therefore, the current review aims to uncover the common themes and factors that drive individuals to resort to negative bystander behavior. Hence, a systematic literature review using the PRISMA framework was carried out, involving articles published between January 2012 and March 2022 in online databases such as SCOPUS, Science Direct, SAGE Journals, Web of Science, and Springer Link. Results obtained through the synthesis of 27 selected articles were grouped into three categories, namely situational factors, personal factors, and social influence. Upon further synthesis of the results, it was noted that many of the factors interacted with each other. Thus, practical suggestions for prevention and future research include addressing these interactions in preventative methodologies and research interests.