Table 2 - uploaded by Tanik Saikh
Different sets of features and the corresponding models.

Source publication
Conference Paper
In this paper we propose a novel approach to determining the Textual Entailment (TE) relation between a pair of text expressions. Different machine translation and summary evaluation metrics, along with a polarity feature, are used as features for different machine learning classifiers to make the entailment decision in this study. We consider three ma...

Context in source publication

Context 1
... use the implementation available in the Weka toolkit. The classifiers are trained with the features discussed earlier and summarized in Table 2. The classifier assigns a prediction class to each T-H pair in the test dataset, whose classes are unknown. ...
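The pipeline in this snippet (evaluation-metric scores as features, a trained classifier assigning a class to each T-H pair) can be sketched as follows. The feature values, the label set, and the use of scikit-learn in place of Weka are all illustrative assumptions, not the paper's actual setup:

```python
# Hypothetical sketch of the setup described above, with scikit-learn
# standing in for Weka. All values below are made up for illustration.
from sklearn.svm import SVC

# Each T-H pair is represented by a vector of metric scores,
# e.g. [BLEU, METEOR, TER, ROUGE, polarity] (assumed feature order).
train_features = [
    [0.42, 0.55, 0.30, 0.61, 1.0],  # entailing pair
    [0.10, 0.18, 0.75, 0.22, 0.0],  # non-entailing pair
    [0.38, 0.49, 0.35, 0.58, 1.0],
    [0.05, 0.12, 0.80, 0.15, 0.0],
]
train_labels = ["YES", "NO", "YES", "NO"]

clf = SVC(kernel="linear")
clf.fit(train_features, train_labels)

# The trained classifier assigns a prediction class to an unseen T-H pair.
test_features = [[0.40, 0.50, 0.33, 0.60, 1.0]]
print(clf.predict(test_features)[0])
```

Any off-the-shelf classifier (naive Bayes, decision trees, SVM) fits this pattern; only the feature extraction is task-specific.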

Citations

... Methods such as BLEU, ROUGE, METEOR, and CIDEr are mainly used to analyze the performance of generative models; these methods evaluate a model by comparing the similarity between texts [67]. They are widely used in machine translation, question generation, and other fields [68,69]. The model proposed in this study is a discriminative model, so precision, recall, and the F1 score are often used as indicators to evaluate model performance [46,70]: ...
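The precision, recall, and F1 indicators mentioned here reduce to simple ratios over a classifier's confusion counts; a minimal sketch with made-up counts:

```python
# Precision / recall / F1 from raw confusion counts of a binary classifier.
def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Example: 80 true positives, 20 false positives, 10 false negatives.
p, r, f = precision_recall_f1(80, 20, 10)
print(round(p, 3), round(r, 3), round(f, 3))  # 0.8 0.889 0.842
```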
Article
Safety hazards are a key consideration in construction management. The efficient recognition of safety hazard information can help managers formulate safety hazard management measures and improve the efficiency of construction safety management. However, construction site safety hazard data are stored in semistructured and unstructured text formats, which cannot be directly converted into understandable and usable information. Moreover, safety hazard text contains many fuzzy expressions, thereby increasing the difficulty of text semantic analysis; thus, how to accurately mine safety hazard information from complex and diverse text data is an urgent problem that must be solved. In consideration of this problem, we propose a bidirectional long short-term memory (BiLSTM) method with a fuzzy word vector and self-attention mechanism (FSABiLSTM) to automatically recognize safety hazard information. This method adopts TextRank and Word2vec to calculate the fuzzy word vector and process fuzzy expressions in safety hazard text. The safety hazard text semantic features are deeply extracted based on BiLSTM and a fuzzy word vector, and the extracted semantic features are analyzed via a self-attention mechanism. Actual construction safety hazard text is used to verify the reliability and applicability of the method, and the results indicate that the accuracy of this method, which outperforms existing machine learning methods, is 91.70%. In addition, the FSABiLSTM method can be used to automatically evaluate the risk degree of safety hazards; this use is beneficial to managing and controlling safety hazards. Concerning safety hazard text data, this study provides a new deep mining approach that can enhance safety management efficiency.
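The self-attention step described in this abstract (scoring the BiLSTM's hidden states and pooling them into a single text representation) can be illustrated with a small NumPy sketch. The shapes, random weights, and additive scoring form are illustrative assumptions, not the actual FSABiLSTM parameters:

```python
import numpy as np

# Illustrative stand-in for BiLSTM hidden states: 6 time steps, size 8.
rng = np.random.default_rng(0)
T, d = 6, 8
H = rng.standard_normal((T, d))

# Assumed additive attention parameters (random for the sketch).
W = rng.standard_normal((d, d))
v = rng.standard_normal(d)

scores = np.tanh(H @ W) @ v                     # one score per time step
alpha = np.exp(scores) / np.exp(scores).sum()   # softmax attention weights
context = alpha @ H                             # weighted sum of states

print(alpha.shape, context.shape)  # (6,) (8,)
```

The resulting `context` vector would then feed a classification layer; the fuzzy word vectors of the paper would enter upstream, at the embedding stage.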
... This can be done through rating individual responses (Radziwill and Benton, 2017; Christensen et al., 2018; Sordoni et al., 2015), scoring aggregated qualities across the multi-turn exchange, or through ranking overall conversational experiences (Deriu et al., 2020; Li et al., 2019; Shieber, 1994). Alternatively, text-overlap-based metrics such as BLEU, METEOR and ROUGE (Sordoni et al., 2015; Saikh et al., 2018) and F-score metrics (Cuayáhuitl et al., 2019) have also been proposed to assess chatbots. Other assessment methods (Yang et al., 2022) include sentence perplexity (Dhyani and Kumar, 2021; John et al., 2017; Higashinaka et al., 2014), entities per exchange (Finch and Choi, 2020), number of questions raised, specificity (Li et al., 2016), turns per conversation (Shum et al., 2018), inconsistency detection, and relevance to history. ...
Preprint
The emergence of pretrained large language models has led to the deployment of a range of social chatbots for chitchat. Although these chatbots demonstrate language ability and fluency, they are not guaranteed to be engaging and can struggle to retain users. This work investigates the development of social chatbots that prioritize user engagement to enhance retention, specifically examining the use of human feedback to efficiently develop highly engaging chatbots. The proposed approach uses automatic pseudo-labels collected from user interactions to train a reward model that can be used to reject low-scoring sample responses generated by the chatbot model at inference time. Intuitive evaluation metrics, such as mean conversation length (MCL), are introduced as proxies to measure the level of engagement of deployed chatbots. A/B testing on groups of 10,000 new daily chatbot users on the Chai Research platform shows that this approach increases the MCL by up to 70%, which translates to a more than 30% increase in user retention for a GPT-J 6B model. Future work aims to use the reward model to realise a data fly-wheel, where the latest user conversations can be used to alternately fine-tune the language model and the reward model.
... The works in [33,34] posed machine learning based approaches using conventional similarity metrics (cosine similarity, Jaccard, Dice, etc.), along with machine translation (MT) evaluation metrics (BLEU [35] and METEOR [36,37]) as features, on the RTE-1 to RTE-3 datasets and on Indian languages (Tamil, Telugu, Hindi, and Punjabi), respectively. The work defined in [38] made use of three MT evaluation metrics (BLEU, METEOR, and TER [39]) and one summary evaluation metric (ROUGE [40]), along with a polarity (negation) feature, on RTE-1, RTE-2, RTE-3, RTE-4, and RTE-5, and obtained reasonable output compared to the best-performing results in those tracks. The takeaway was that there is a correlation between MT evaluation and TE. ...
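The conventional similarity metrics named here (Jaccard and Dice over the word sets of a T-H pair) are simple set-overlap ratios; a minimal sketch with an illustrative pair:

```python
# Jaccard and Dice coefficients over the word sets of a text/hypothesis pair.
def jaccard(a, b):
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def dice(a, b):
    a, b = set(a.lower().split()), set(b.lower().split())
    return 2 * len(a & b) / (len(a) + len(b)) if a or b else 0.0

# Illustrative T-H pair (not from the cited datasets).
text = "a man is playing a guitar"
hyp = "a man plays the guitar"
print(round(jaccard(text, hyp), 3), round(dice(text, hyp), 3))  # 0.429 0.6
```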
Chapter
In this paper, we describe a hybrid approach to Recognizing Textual Entailment (RTE) that makes use of dependency parsing and semantic similarity measures. Dependency triplet matching is performed between the dependency-parsed Text (T) and Hypothesis (H). In the case of a dependency relation match, we also consider partial matching, and the semantic similarity between the associated words is calculated with the help of various semantic similarity measures. The importance of the various dependency relations with respect to the TE task is computed in terms of their information gain, and the dependency relations are weighted accordingly. This paper reports our experiments carried out on the RTE-1, RTE-2 and RTE-3 benchmark datasets using three approaches, namely a greedy approach, an exhaustive search, and a greedy approach with weighted dependency relations. Experimental results show that weighted dependency relations significantly improve TE performance over the baseline.
... However, given that these metrics are easy to use, they are still widely implemented to evaluate chatbots. The evaluation metrics used to measure accuracy are standard metrics used for Machine Translation and other Natural Language Processing tasks, such as BLEU, METEOR and TER, as they have been used by [33,65]. Although these evaluation metrics are considered more suitable for Machine Translation problems, they can still provide valuable information regarding the Textual Entailment of the chatbot output [65]. ...
... Simply put, the BLEU metric counts the number of words that overlap between a translation and a reference translation, giving a higher score to sequential words (KantanMT, a cloud-based machine translation platform). The authors of [33,38,65,70] are among those who used BLEU scores to evaluate chatbots and other NLP tasks. However, BLEU does present some issues. ...
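The overlap counting behind BLEU can be illustrated with its clipped unigram precision: each candidate word is credited at most as many times as it appears in the reference. A minimal sketch (full BLEU also combines higher-order n-grams and a brevity penalty):

```python
from collections import Counter

# Clipped unigram precision, the word-overlap count at the core of BLEU.
def clipped_unigram_precision(candidate, reference):
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Each candidate word is clipped to its count in the reference.
    overlap = sum(min(n, ref[w]) for w, n in cand.items())
    return overlap / sum(cand.values())

# Degenerate candidate: only two of the seven "the" tokens are credited.
print(clipped_unigram_precision("the the the the the the the",
                                "the cat is on the mat"))  # 2/7 ≈ 0.286
```

The clipping is exactly what penalizes the degenerate repeated-word candidate, one of the failure modes naive overlap counting would miss.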
Article
Chatbots are intelligent conversational computer systems designed to mimic human conversation to enable automated online guidance and support. The increased benefits of chatbots led to their wide adoption by many industries in order to provide virtual assistance to customers. Chatbots utilize methods and algorithms from two Artificial Intelligence domains: Natural Language Processing and Machine Learning. However, there are many challenges and limitations in their application. In this survey we review recent advances on chatbots, where Artificial Intelligence and Natural Language processing are used. We highlight the main challenges and limitations of current work and make recommendations for future research investigation.
... TE was extensively studied in the pre-deep learning era [4,5,12,24,23]. With the recent advancement of deep learning techniques, researchers have also started to explore these techniques for TE [1,17,2]. However, as already mentioned, CLTE [3] has only recently drawn the community's interest. ...
Conference Paper
Recognizing Textual Entailment (RTE) between two pieces of text is a crucial problem in Natural Language Processing (NLP), and it poses further challenges when two different languages are involved, i.e., in the cross-lingual scenario. The paucity of large datasets for this problem has become the key bottleneck for research in this direction. In this paper, we provide a deep neural framework for cross-lingual textual entailment involving English and Hindi. As no large dataset is available for this task, we first create one by translating the premise and hypothesis pairs of the Stanford Natural Language Inference (SNLI) dataset (https://nlp.stanford.edu/projects/snli/) into Hindi. We develop a Bidirectional Encoder Representations from Transformers (BERT) based baseline on this newly created dataset. We perform experiments in both mono-lingual and cross-lingual settings. For the mono-lingual setting, we obtain accuracy scores of 83% and 72% for English and Hindi, respectively. In the cross-lingual setting, we obtain accuracy scores of 69% and 72% for the English-Hindi and Hindi-English language pairs, respectively. We hope this dataset can serve as a valuable resource for the research and evaluation of Cross-Lingual Textual Entailment (CLTE) models.