Figure 1 - available via license: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International
Test example of a dialogue created from Reddit with sentences judged as relevant and non-relevant by human annotators.
Source publication
We address the task of sentence retrieval for open-ended dialogues. The goal is to retrieve sentences from a document corpus that contain information useful for generating the next turn in a given dialogue. Prior work on dialogue-based retrieval focused on specific types of dialogues: either conversational QA or conversational search. To address a...
Contexts in source publication
Context 1
... each dialogue, 50 sentences were retrieved from Wikipedia using an unsupervised initial retrieval method. These sentences were judged by crowd workers for relevance, that is, whether they contained information useful for generating the next turn in the dialogue. Figure 1 depicts one such dialogue, with two sentences annotated by the raters: one as relevant and one as non-relevant. The dataset is available at https://github.com/SIGIR-2022/A-Datasetfor-Sentence-Retrieval-for-Open-Ended-Dialogues.git. ...
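The initial retrieval step described above could be sketched as follows. This is a minimal illustration, not the paper's actual setup: it assumes a plain BM25 scorer over tokenized candidate sentences, with the full dialogue text used as the query; the toy corpus, the tokenizer, and the parameter values (k1=1.5, b=0.75) are all illustrative assumptions.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase, punctuation-free tokens (illustrative tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized candidate sentence against the query with BM25."""
    n = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / n
    df = Counter()                       # document frequency per term
    for d in docs_tokens:
        df.update(set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        s = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[t] * (k1 + 1) / norm
        scores.append(s)
    return scores

def top_k_sentences(dialogue, sentences, k=50):
    """Return the k highest-scoring candidate sentences for the dialogue."""
    q = tokenize(dialogue)
    docs = [tokenize(s) for s in sentences]
    scores = bm25_scores(q, docs)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in ranked[:k]]

# Toy example: the two on-topic sentences outrank the off-topic one.
sentences = [
    "The Eiffel Tower is in Paris.",
    "Paris is the capital of France.",
    "Bananas are rich in potassium.",
]
print(top_k_sentences("tell me about Paris", sentences, k=2))
```

In the dataset construction described above, the top 50 sentences per dialogue (rather than 2) would then be passed to crowd workers for relevance judgment.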
Context 2
... only conversational passage retrieval dataset we are familiar with is from TREC's CAsT tracks [7,8]. However, CAsT's queries reflect explicit intents, while we are also interested in more open dialogues where the information needs can be in the form of implicit intents, as shown for example in Figure 1. In these datasets, the user conducts a query session on a specific single topic. ...
Similar publications
Task-oriented dialogue systems aim to answer questions from users and provide immediate help. Therefore, how humans perceive their helpfulness is important. However, neither the human-perceived helpfulness of task-oriented dialogue systems nor its fairness implication has been studied yet. In this paper, we define a dialogue response as helpful if...
Pre-trained models have demonstrated superior power on many important tasks. However, designing effective pre-training strategies that improve models' usability for dense retrieval remains an open problem. In this paper, we propose a novel pre-training framework for dense retrieval based on the Masked Auto-Encoder, known as RetroMAE....
Citations
We address the task of retrieving sentences for an open domain dialogue that contain information useful for generating the next turn. We propose several novel neural retrieval architectures based on dual contextual modeling: the dialogue context and the context of the sentence in its ambient document. The architectures utilize contextualized language models (BERT), fine-tuned on a large-scale dataset constructed from Reddit. We evaluate the models using a recently published dataset. The performance of our most effective model is substantially superior to that of strong baselines. Keywords: Open domain dialogue, Dialogue retrieval, Sentence retrieval