Helping people find what they don't know

Abstract

The article explores personalization in the form of suggestions and recommendations that help a person perform a task on the Internet or with Web services more effectively. When people engage in information searching, it is usually because their current state of knowledge is inadequate. Consequently, they may not be able to identify the salient characteristics of potentially useful information objects. This situation suggests it might be appropriate for some part of the information system to recommend courses of action to information seekers. A specific difficulty people face in interacting with information systems is selecting the correct words to represent their problem. This creates a paradox, as it is not simple to specify what one does not know. John Rocchio, a researcher on information systems, suggested using the characteristics of the information objects a user judges relevant or nonrelevant to reformulate the query. An alternative to this approach is to show the user new terms that might be useful for query reformulation. The work described concerns support for users of information systems who engage in specified searching, but it can be extended to other information-seeking activities as well.
Helping People Find What They Don't Know
Nicholas J. Belkin
Communications of the ACM Vol. 43, No. 8 (August 2000), Pages 58-61
Recommendation systems help users find the correct words for a successful search.
Imagine you are performing a task while interacting with a service hosted on the Internet or with
an automated speech recognition mobile phone service. What if during your interaction with this
service, a machine makes a recommendation suggesting how you could better perform your
current task? An important problem relating to personalization concerns understanding how a
machine can help an individual user by making recommendations.
When people engage in information-seeking behavior, it's usually because they are hoping to
resolve some problem, or achieve some goal, for which their current state of knowledge is
inadequate. This suggests they don't really know what might be useful for them, and therefore
may not be able to specify the salient characteristics of potentially useful information objects.
Unfortunately, typical information systems require users to specify what they want the system to
retrieve. Furthermore, people engaging in large-scale information systems typically are
unfamiliar with the underlying operations of the systems, the vocabularies the systems use to
describe the information objects in their databases, and even the nature of the databases
themselves. This situation suggests it might be appropriate for some part of the information
system to recommend courses of action to information seekers, which could help them to better
understand their problems, and to use the system's resources more effectively. This is the general
challenge our research group at Rutgers has been addressing over the last several years [2, 4].
One specific aspect of the difficulties people face in interacting with information systems is
choosing the correct words to represent their information problems. In the typical information
system, which assumes a model of information seeking called "specified searching," the user in
the system is asked to generate a query, which is understood to be a specification of what she or
he wants to have retrieved. In order for the system to search and find appropriate responses, the
query must be couched in terms matching the way the information objects are represented in the
system. Whether such representation is based on the actual words used in the information objects
themselves (so-called "keyword representation"), or on a controlled vocabulary representing the
domain or the database (so-called "conceptual representation"), the problem for the user is the
same: How to guess what words to use for the query that will adequately represent the person's
problem and be the same as those used by the system in its representation. In information
retrieval research and practice, it is generally understood that accomplishing these two goals is a
multistage, interactive process of initial query formulation, which allows users to enter into
interaction with the system, and subsequent iterations of query reformulation, based upon the
results of the interaction [5, 8]. This is an extremely difficult problem: it is hard for people
to specify what they don't know; there are many words that can be used to express the same
ideas; predicting how another person will talk about a topic is uncertain at best; and what
another person finds important, and worthy of representation, cannot be readily ascertained. For
instance, consider the person who wishes to find obituary information about some group of well-
known Americans. In a system relying on the words in the text for representation, using the term
"obituary" in the query will not be useful, since that word is never used in the text of an obituary.
However, words or phrases such as "died," "yesterday" (or any of the days of the week),
"mourned by," "survived by," are commonly used in obituaries. It will be the rare user who will
understand these characteristics of newspaper obituaries and be able to make use of them in an
initial query, or even in query reformulation. Similar arguments hold for the representation of
"well-known" and "American." How can a system help its user to overcome such problems?
In the mid-1960s, John Rocchio suggested a technique for addressing this problem called
"relevance feedback" [7]. For reasons already mentioned, a user is unlikely to begin an
interaction with the ideal query (that is, the query that best specifies what is to be searched for
and retrieved). Furthermore, because the user is unlikely to understand the complexities of
representation and matching within an information retrieval system, that person will be unlikely
to engage in effective query reformulation. However, we can assume the user will be able to
recognize, and indicate, whether a retrieved information object is relevant to the problem or not.
Rocchio suggested the system could use the characteristics (that is, word frequencies and
distributions) of the information objects judged relevant or not in order to modify (reformulate)
the original query, until the query eventually became ideal, separating relevant from nonrelevant
objects in the best possible way. The user's role in this interaction is merely to indicate relevance
or nonrelevance of a retrieved object; the query reformulation takes place internal to the system,
and the user's only knowledge of that process is through the list of objects retrieved as a result of
the reformulated query. We can characterize this type of interaction as system-controlled with
respect to term recommendation. However, indicating relevance or nonrelevance gives the user
some measure of influence on query reformulation through her or his interaction with the system
results.
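The reformulation step can be sketched in a few lines. This is a minimal illustration, not the original formulation from [7]: it uses the commonly cited textbook form of the update (original query weights, plus the average term weights of the relevant objects, minus the average term weights of the nonrelevant ones). The function names, the whitespace tokenizer, and the default weights alpha, beta, and gamma are all assumptions made for the example.

```python
from collections import Counter


def term_vector(text):
    """Bag-of-words term frequencies for lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def centroid(vectors):
    """Average term weight across a list of term vectors (empty list -> {})."""
    total = Counter()
    for vector in vectors:
        total.update(vector)
    n = len(vectors)
    return {term: weight / n for term, weight in total.items()} if n else {}


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Reweight the query from the user's relevance judgments.

    alpha, beta, gamma are illustrative defaults (assumed, not from the article).
    Terms whose combined weight is not positive are dropped.
    """
    q = term_vector(query)
    rel = centroid([term_vector(doc) for doc in relevant])
    non = centroid([term_vector(doc) for doc in nonrelevant])
    new_query = {}
    for term in set(q) | set(rel) | set(non):
        weight = (alpha * q.get(term, 0)
                  + beta * rel.get(term, 0.0)
                  - gamma * non.get(term, 0.0))
        if weight > 0:
            new_query[term] = weight
    return new_query


# The user typed "obituary american" and then judged three retrieved texts.
reformulated = rocchio(
    "obituary american",
    relevant=["senator died yesterday survived by his wife",
              "actor died monday mourned by fans"],
    nonrelevant=["obituary page advertising rates"],
)
```

In this toy run the reformulated query gains "died" and "by" from the texts marked relevant and lowers the weight of "obituary," although the user never typed or even saw those terms. That is the sense in which the interaction is system-controlled.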
An alternative approach to system support for query reformulation is for the system to show the
user—given the terms used in the original query, and/or the documents retrieved by the original
query—new terms that might be useful for query reformulation. These terms can be identified
through their empirical relationships to the query terms as determined by co-occurrence, for
instance, with the query terms in a document, or co-occurrence in similar contexts in the database.
It is the user's task in such systems to examine the suggested terms, and to manually reformulate
the query given the information provided by the system. Such techniques are typically known as
"term suggestion" devices, and can be thought of as user-controlled, at least to the extent the user
controls how the query is reformulated. In this case, the actual terms suggested do not depend
upon the user's response to the system's results.
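As a rough illustration of term suggestion by co-occurrence, the sketch below counts how often other terms appear in the same documents as the query terms and offers the most frequent ones. It is a simplified stand-in under stated assumptions, not any particular system's method: the function name suggest_terms, the whitespace tokenizer, and ranking by raw document count are all choices made for the example.

```python
from collections import Counter


def suggest_terms(query, documents, k=5, stopwords=frozenset()):
    """Suggest up to k terms that co-occur with the query terms.

    A term scores one point for every document that contains it together
    with at least one query term. No relevance judgments are involved.
    """
    query_terms = set(query.lower().split())
    counts = Counter()
    for doc in documents:
        doc_terms = set(doc.lower().split())
        if doc_terms & query_terms:
            counts.update(doc_terms - query_terms - stopwords)
    # Highest count first; alphabetical order breaks ties deterministically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


collection = [
    "senator died yesterday",
    "actor died monday",
    "stock prices rose yesterday",
]
suggestions = suggest_terms("senator actor", collection, k=2)
```

The function only returns a list. Deciding which suggestions, if any, go into the next query is left to the user, which is what makes this style of support user-controlled.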
At Rutgers, we have been investigating support for query reformulation (that is, recommendation
by the system of how a query might be better put) both with respect to relevance feedback versus
term recommendation, and with respect to user knowledge and control of such support. One of
our early results [6] showed that relevance feedback worked well in an interactive information
retrieval environment, and that it worked better still when users had both more knowledge of how
it worked and more control over its suggestions. That is, a version of relevance feedback
in which the user was informed of the basic algorithms used in query reformulation, and in
which the terms the system would use to reformulate the query based on the user's relevance
judgments were presented to the user for selection (a term suggestion device), performed
consistently better than one where the user knew only that marking documents relevant would
help the system to find similar documents. Perhaps more important, the subjects in the
experiment preferred the former to the latter by a wide margin, because they felt they had control
and knowledge of the query reformulation process. This led us to the conclusion that explicit
term suggestion is a better way to provide system support for query reformulation than
automatic, behind-the-scenes query reformulation.
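The difference between the two versions can be made concrete with a small sketch of relevance feedback used as a term suggestion device: the system derives candidate terms from the documents the user marked relevant, but only the terms the user accepts enter the new query. The names candidate_terms and reformulate, and the ranking by number of relevant documents containing a term, are assumptions for illustration and do not describe the system used in [6].

```python
from collections import Counter


def candidate_terms(query, relevant_docs, k=5):
    """Rank terms by how many user-marked relevant documents contain them."""
    query_terms = set(query.lower().split())
    doc_freq = Counter()
    for doc in relevant_docs:
        doc_freq.update(set(doc.lower().split()) - query_terms)
    # Highest document count first; alphabetical order breaks ties.
    ranked = sorted(doc_freq.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def reformulate(query, offered, accepted):
    """Append only those offered terms that the user explicitly accepted."""
    accepted_set = set(accepted)
    kept = [term for term in offered if term in accepted_set]
    return " ".join(query.split() + kept)


marked_relevant = ["senator died yesterday", "actor died monday"]
offered = candidate_terms("obituary", marked_relevant, k=3)
new_query = reformulate("obituary", offered, accepted=["died"])
```

An automatic, behind-the-scenes version would add every offered term without asking; here the user sees the candidates and keeps control of the final query.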
We recently compared our version of relevance feedback as a term suggestion device (in which
the user controls the suggested terms through marking documents relevant) with a version of
term suggestion in which the user has no control over which terms are suggested [3]. In both
systems, users had some knowledge of how the suggested terms were chosen. The primary
difference between the two was that users of the relevance feedback-based system had to make
decisions about whether a document was relevant before they were offered any suggested terms.
In the uncontrolled term suggestion system such terms were displayed at the same time as the
query results. Our results indicate that users were willing to give up the control they gained over
suggested terms through explicit relevance feedback, in favor of the reduced effort (that is, not
having to make both relevance and term selection decisions) on their part in the uncontrolled
term suggestion system.
What can we make of these results? It seems that user control over system recommendation for
query reformulation is important to users with respect to their main task—a good query
reformulation. But control (and, therefore, better understanding) of what terms are actually
suggested—a subsidiary task—is not very important. Rather, having to engage in the subsidiary
task distracts them from what they actually need to do. These conclusions must be understood
with several caveats, however. First, it does seem to be necessary that users have some
understanding of how the suggested terms are determined in order to be comfortable and
effective in using them. Also, the terms suggested need to be perceived as related to the context
of the search. Strange or unexpected terms made the subjects uncomfortable, and distracted them
from query reformulation, and from the search task. These conditions mean that in order to
accept and use the system recommendations effectively, the users need to have some trust in the
system with respect to the suggested terms. They also need to exert control over the system with
respect to the terms they think would be useful. Trust with respect to the task not perceived as
salient allowed the users to accept the recommendation without question. But with respect to the
task that is clearly salient, the users were not willing to give up their autonomy to the system.
These results have clear implications for how recommender systems should operate in general.
The work described here concerns offering support to users of information systems who engage
in one particular kind of information-seeking activity—specified searching. Of course, people
engage in many other kinds of interactions with information, for instance, browsing, evaluating,
using, and learning, both within a single information-seeking episode, and across episodes. At
Rutgers University, and in collaboration with colleagues elsewhere, we are engaged in a long-
term program researching how best to offer support to people in a variety of different
information-seeking behaviors [1, 4]. Query formulation and reformulation is just one problem
people face in one or more of such activities. Understanding the contents of databases, learning
about effective vocabularies, and being able to evaluate the relevance of an information object
quickly and accurately are other kinds of important problems that people face in their
information seeking for which system recommendations could offer useful support. As we have
addressed several such challenges, we have seen results similar to those we found in our query
reformulation studies: With sufficient reason to trust the system recommendations, users are
willing to give up some measure of control, accepting suggestions while maintaining control
over how they are applied. We are attempting to apply these results in the design of cooperative,
collaborative, dialogue-based information systems where users and the rest of the system each
have their own roles and responsibilities, offering and accepting suggestions from one another,
as appropriate.
References
1. Belkin, N.J. Intelligent information retrieval: Whose intelligence? Herausforderungen an die
Informationswissenschaft. Proceedings des 5. Internationalen Symposiums für
Informationswissenschaft (ISI '96). J. Krause, M. Herfurth, and J. Marx, Eds. (1996).
Universitätsverlag Konstanz, 25–31.
2. Belkin, N.J. An overview of results from Rutgers' investigations of interactive information
retrieval. In Proceedings of the Clinic on Library Applications of Data Processing. P.A.
Cochrane and E.H. Johnson, Eds. (1998). Graduate School of Library and Information Science,
University of Illinois at Urbana-Champaign, 45–62.
3. Belkin, N.J., Cool, C., Head, J., Jeng, J., Kelly, D., Lin, S.J., Lobash, L., Park, S.Y.,
Savage-Knepshield, P., and Sikora, C. Relevance feedback versus Local Context Analysis as term
suggestion devices. In Proceedings of the Eighth Text Retrieval Conference TREC8.
(Washington, D.C., 2000). In press; trec.nist.gov/pubs/trec8/t8_proceedings.
4. Belkin, N.J., Cool, C., Stein, A., and Thiel, U. Cases, scripts and information seeking
strategies: On the design of interactive information retrieval systems. Expert Syst. Apps. 9
(1995), 379–395.
5. Efthimiadis, E. Query expansion. Annual Rev. Info. Sci. Tech. 31 (1996), 121–187.
6. Koenemann, J. Relevance feedback: usage, usability, utility. Ph.D. Dissertation (1996).
Rutgers University, Dept. of Psychology. New Brunswick, NJ.
7. Rocchio, J. Relevance feedback in information retrieval. The SMART Retrieval System:
Experiments in Automatic Document Processing. G. Salton, Ed. (1971). Prentice-Hall,
Englewood Cliffs, NJ, 313–323.
8. Spink, A. and Losee, R.M. Feedback in information retrieval. Annual Rev. Info. Sci.
Tech. 31 (1996), 33–78.