Article

Retrieval effectiveness of image search engines

Abstract

Purpose: The purpose of this study is to explore the retrieval effectiveness of three image search engines (ISEs) – Google Images, Yahoo Image Search and Picsearch – in terms of their image retrieval capability. It is an effort to carry out a Cranfield experiment to find out how efficient the commercial giants in image search are and how efficient an image-specific search engine is.

Design/methodology/approach: The keyword search feature of the three ISEs – Google Images, Yahoo Image Search and Picsearch – was used to search with the keyword captions of photos as query terms. The selected top ten images served as the testbed for the study, and images were searched in accordance with the features of the testbed. Features to be looked for included size (1200 × 800), image format (JPEG/JPG) and the rank of the original image retrieved by the ISEs under study. To gauge the overall retrieval effectiveness in terms of set standards, only the first 50 result hits were checked. The retrieval efficiency of the selected ISEs was examined with respect to their precision and relative recall.

Findings: Yahoo Image Search outscores Google Images and Picsearch in terms of both precision and relative recall. On the other criteria – image size, image format and image rank in search results – Google Images is ahead of the others.

Research limitations/implications: The study only takes into consideration the basic image search feature, i.e. text-based search.

Practical implications: The study implies that image search engines should focus on relevant descriptions. The study evaluated text-based image retrieval facilities and thereby helps users select the best among the available ISEs for their use.

Originality/value: The study provides an insight into the effectiveness of the three ISEs. It is one of the few studies to gauge the retrieval effectiveness of ISEs. The study also produced key findings that are important for all ISE users and researchers and for the Web image search industry. The findings will also prove useful for search engine companies in improving their services.
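The abstract names precision and relative recall but does not give their formulas. The following is a minimal sketch, assuming the definitions commonly used in search engine comparison studies: precision is the share of checked hits that are relevant, and relative recall is an engine's relevant hits divided by the relevant hits pooled over all engines. The engine names and counts are hypothetical and are not results from the study.

```python
from typing import Dict


def precision(relevant_retrieved: int, total_checked: int) -> float:
    """Share of the checked hits that were judged relevant."""
    if total_checked == 0:
        return 0.0
    return relevant_retrieved / total_checked


def relative_recall(relevant_by_engine: Dict[str, int]) -> Dict[str, float]:
    """Relevant hits of each engine divided by the relevant hits of all engines.

    Assumption: the pool is the plain sum over engines, so overlapping hits
    are counted once per engine.
    """
    pooled = sum(relevant_by_engine.values())
    if pooled == 0:
        return {engine: 0.0 for engine in relevant_by_engine}
    return {engine: n / pooled for engine, n in relevant_by_engine.items()}


# Hypothetical counts of relevant images among the first 50 hits per engine.
CUTOFF = 50
relevant_hits = {"engine_a": 30, "engine_b": 15, "engine_c": 5}

precisions = {e: precision(n, CUTOFF) for e, n in relevant_hits.items()}
recalls = relative_recall(relevant_hits)
```

Because the true number of relevant images on the Web is unknown, relative recall only ranks the engines against each other; it says nothing about how much relevant material all of them missed.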

... They reveal that Yahoo Image Search outshines Google Images and Picsearch concerning precision and relative recall. However, in terms of image size, image format and image rank, Google Images precedes others [38]. Researching the effectiveness of search engines, Zaidi et al. [60] recommend a novel approach by which the relevant images can be filtered out from the retrieved data. ...
... The keyword annotation method was used, wherein the captions/ annotations of each image were run across the select search engines, and the retrieved images were tested against the set parameters. The first 50 (in the case of images) retrieved results were analysed for the study since the users look up for the first page (carrying 10 or 20 displayed results) [35,36,38,40,[42][43][44]48]. However, to have a more comprehensive result platform, 50 images were taken into consideration. ...
... The findings of the study are also in tune with that of Uyar and Karapinar [6], which reveal that Google and Bing retrieve images for popular entities with high precision values. However, Hussain et al. [38] reveal that Yahoo Image Search outshines Google Image Search concerning precision. Table 2 highlights the Relative Recall measure of all three ISEs, wherein Google Images scores the highest Relative Recall rate of 0.68, followed by Bing Images (0.06). ...
Article
The year 2020 brought a big concern for the global community because of COVID-19, which affected every sector of society, and tourism is no exception. Researchers across the globe are publishing their studies related to different dimensions of tourism in the context of COVID-19, and images have formed an essential component of their research. In tourism, images related to COVID-19 can open new dimensions for scholars. The main aim of the research is to measure the retrieval effectiveness of three image search engines (ISEs), that is, Bing Images, Google Images and Yahoo Image Search, concerning images related to COVID-19 and tourism. The study attempts to identify the capability of the ISEs to retrieve the desired and actual images related to COVID-19 and tourism. The PubMed Central (PMC) Database was consulted to retrieve the desired images and develop a testbed. The advanced search feature of PMC Database was explored by typing the search terms 'COVID-19' and 'Tourism' using 'AND' operator to make the search more comprehensive. Both the terms were searched against the 'Figure/Table' caption to retrieve papers carrying images related to COVID-19 and tourism. ... on more than one occasion. In contrast, Bing Images retrieves the original image at the first rank in two instances. Yahoo Images performs poorly over this metric as it does not retrieve any original image at the first rank on any other instance. The study cannot be generalised as the scope is only limited to the images indexed by PMC. Furthermore, the retrieval effectiveness of only three ISEs is measured. The study is the first to measure the retrieval effectiveness of ISEs in retrieving images related to the COVID-19 pandemic and tourism. The study can be extended across other image-indexing databases pertinent to tourism studies, and the retrieval effectiveness of other ISEs can also be considered.
... Namely, the official symbols, the murals, the framed portraits, the signature, the gaze, the monuments and the urban infrastructure. A second triangulation among these three image recognizers was used because they have shown the best performance (Cheshmeh Sohrabi and Adnani-Sadati, 2022; Hussain et al., 2019), because of their relevance in digital visual culture and because each uses a different system of algorithmic visual interpretation (Contreras and Marín, 2022b). Visual search engines rely on intelligent algorithms to track images catalogued with similar metadata, recognize images with formal similarities (CBIR system) and display images by prioritizing and ranking the results (SEO system), according to the traffic of the pages that match the search terms (Silva, 2019). ...
Article
This article presents the results of an exploratory study on the design and control of patriotic images during the terms of government of presidents Hugo Rafael Chávez Frías (1999-2013) and Nicolás Maduro Moros (2013-present). The aim is to recognize, identify and classify the visual objects that constitute the scopic regime of Venezuela, unpacking the iconographic plan developed by the State for its foundation. After tracking and retrieving 240 photographs from the virtual space, we observe that the Chavista governments achieve recognition of their ideology through a renewed patriotic iconography in the urban visual space.
... According to studies by Tokgoz et al. [34], Hussain et al. [35] and CheshmehSohrabi et al. [36], Yahoo outperforms Google in image retrieval (users use one or more keywords to retrieve the relevant image or images they are looking for) while studies by Uluc et al. [37], Cakir et al. [38] and Adrakatti et al. [39] reveal contradictory results on the effectiveness of image search engines and indicate the outperformance of Google in image retrieval as compared with Yahoo. ...
Article
Search on the web, specifically fetching of the relevant content, has been paid attention to since the advent of the web and particularly in recent years due to the tremendous growth in the volume of data and web pages. This paper categorizes the search services from the early days of the web to the present into keyword search engines, semantic search engines, question answering systems, dialogue systems and chatbots. As the first generation of search engines, keyword search engines have adopted keyword-based techniques to find the web pages containing the query keywords and ranking search results. In contrast, semantic search engines try to find meaningful and accurate results on the meaning and relations of things. Question-answering systems aim to find precise answers to natural language questions rather than returning a ranked list of relevant sources. As a subset of question answering systems, dialogue systems target to interact with human users through a dialog expressed in natural language. As a subset of dialogue systems, chatbots try to simulate human-like conversations. The paper provides an overview of the typical aspects of the studied search services, including process models, data preparation and presentation, common methodologies and categories.
... This takes advantage of all the advanced features of Solr Server. Solr can add, delete, modify and query the data index by providing a standard HTTP interface [4,5]. The Solr index server is shown in Fig. 1. ...
Article
In order to improve the search performance of rich text content, a cloud search engine system based on rich text content is designed. On the basis of traditional search engine hardware system, several hardware devices such as Solr index server, collector, Chinese word segmentation device and searcher are installed, and the data interface is adjusted. On the basis of hardware equipment and database support, this paper uses the open source Apache Tika framework to obtain the metadata of rich text documents, implements word segmentation according to the rich text content and semantics, and calculates the weight of each keyword. Input search keywords, establish a text index, use BM25 algorithm to calculate the similarity between keywords and text, and output the search results of rich text according to the similarity calculation results. The experimental results show that the design system has high recall rate, high throughput, and the construction time of each data item index in different files is short, which improves the search efficiency and search accuracy.
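The abstract above says BM25 is used to score the similarity between query keywords and text, without giving the variant or parameters. The following is a minimal sketch of the widely used Okapi BM25 form, assuming pre-tokenized documents and the conventional defaults `k1=1.5` and `b=0.75`; the function name `bm25_scores` and the toy documents are hypothetical and are not taken from the described system.

```python
import math
from collections import Counter
from typing import List, Sequence


def bm25_scores(query_terms: Sequence[str],
                documents: Sequence[Sequence[str]],
                k1: float = 1.5,
                b: float = 0.75) -> List[float]:
    """Return one Okapi BM25 score per tokenized document."""
    n_docs = len(documents)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in documents) / n_docs

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))

    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            # The "+ 1" inside the log keeps the IDF non-negative.
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Longer-than-average documents are penalized through b.
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


# Toy tokenized documents (hypothetical).
docs = [
    ["rich", "text", "search"],
    ["cloud", "index"],
    ["text", "text", "engine", "search"],
]
scores = bm25_scores(["text", "search"], docs)
```

Sorting the documents by descending score gives the ranked result list; a real system would add word segmentation and an inverted index rather than scanning every document.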
... Evaluating an information retrieval system, especially a search engine, is one of the fundamental topics in the field of library and information science. In this regard, various researchers, such as Bar-Ilan (1998), Bilal (2012), Deka and Lahkar (2010), Demirci et al. (2007), Fattahi et al. (2016), Gordon and Pathak (1999), Hussain et al. (2019), Lewandowski (2015), Wani Zahid and Ahmad Sofi (2016) and Zeynali Tazehkandi and Nowkarizi (2020), evaluated search engines. ...
Article
Purpose - The purpose of this paper is to present a review on the use of the recall metric for evaluating information retrieval systems, especially search engines. Methodology - This review article investigates different researchers’ views about recall metrics. Findings - Five different definitions for recall were identified. For the first group, recall refers to completeness, but it does not specify where all the relevant documents are located. For the second group, recall refers to retrieving all the relevant documents from the collection. However, it seems that the term “collection” is ambiguous. For the third group (first approach), collection means the index of search engines and, for the fourth group (second approach), collection refers to the web. For the fifth group (third approach), ranking of the retrieved documents should also be accounted for in calculating recall. Practical implications - It can be said that in the first, second, and third approaches, the components of the retrieval algorithm, the retrieval algorithm and crawler, and the retrieval algorithm and crawler and ranker, respectively, are evaluated. To determine the effectiveness of search engines for the use of users, it is better to use the third approach in recall measurement. Originality/value - The value of this paper is to collect, identify, and analyse literature that is used in recall. In addition, different views of researchers about recall are identified.
Article
Search engine queries are the starting point for studies in different fields, such as health or political science. These studies usually aim to make statements about social phenomena. However, the queries used in the studies are often created rather unsystematically and do not correspond to actual user behavior. Therefore, the evidential value of the studies must be questioned. We address this problem by developing an approach (query sampler) to sample queries from commercial search engines, using keyword research tools designed to support search engine marketing. This allows us to generate large numbers of queries related to a given topic and derive information on how often each keyword is searched for, that is, the query volume. We empirically test our approach with queries from two published studies, and the results show that the number of queries and total search volume could be considerably expanded. Our approach has a wide range of applications for studies that seek to draw conclusions about social phenomena using search engine queries. The approach can be applied flexibly to different topics and is relatively straightforward to implement, as we provide the code for querying Google Ads API. Limitations are that the approach needs to be tested with a broader range of topics and thoroughly checked for problems with topic drift and the role of close variants provided by keyword research tools.
Article
This study has developed a combined indicator to evaluate the performance of different search engines. Documentary analysis, survey, and evaluative methods are employed in the present study. The research was conducted in two stages. First, a combined indicator was designed to measure search engines. To this end, 72 criteria for measuring the performance of search engines were identified, out of which 22 criteria were selected. Accordingly, 10 criteria were selected in six general classes through a survey of subject matter experts. Validation of our proposed combined indicator was obtained by Delphi method and using the opinions of experts in the fields of information science and information system. Second, web search engines were evaluated based on the proposed combined indicator. The statistical population of this part of the research consisted of two categories: (1) general web search engines, and (2) general subjects. The sample size of the first category contained four search engines Yahoo, Google, DuckDuckGo, and Bing, and the second category involved 40 search terms under 10 general categories. The results showed that the combined indicator had six general criteria: (1) relevance, (2) ranking, (3) novelty ratio, (4) coverage ratio, (5) ratio of unrelated documents, and (6) proportion of duplication hits. According to this indicator, Google is at the top, followed by Bing. This study proposes a new indicator for evaluating search engine performance, which can measure the efficiency of search engines. Therefore, its use to measure the performance of search engines is recommended to researchers and search engine developers.
Article
This experimental study used a checklist to evaluate the performance of seven search engines consisting of four Image General Search Engines (IGSEs) (namely, Google, Yahoo, DuckDuckGo and Bing), and three Image Specialized Search Engines (ISSEs) (namely, Flickr, PicSearch, and GettyImages) in image retrieval. The findings indicated that the average recall of the Image General Search Engines and Image Specialized Search Engines was 76.32% and 24.51%, with an average precision of 82.08% and 32.21%, respectively. As the results showed, Yahoo, Google and DuckDuckGo ranked at the top in image retrieval with no significant difference. However, a remarkable superiority with almost 50% difference was observed between the general and specialized image search engines. It was also found that an intense competition existed between Google, Yahoo and DuckDuckGo in image retrieval. The overall results can provide valuable insights for new search engine designers and users in choosing the appropriate search engines for image retrieval. Moreover, the results obtained through the applied equations could be used in assessing and evaluating other search tools, including search engines.
Chapter
Recent works in deep learning using Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) models have yielded state-of-the-art results on a variety of image processing tasks. Multimodal representation, especially image captioning, is gaining popularity because of its key role in narrowing the heterogeneity gap among different modalities, which is very helpful in cross-modality analysis tasks. The vast number of medical images, as well as medical documents, needs to be processed to discover hidden knowledge. The purpose of this research is to present biomedical information retrieval systems in order to learn more about their strengths and weaknesses. We then propose an approach that tries to close some gaps and offers some solutions for the existing retrieval systems and engines by giving an insight into the benefit of image captioning in cross-modality retrieval.
Article
Purpose The purpose of this paper is to investigate image search and retrieval problems in selected search engines in relation to Persian writing style challenges. Design/methodology/approach This study is an applied one, and to answer the questions the authors used an evaluative research method. The aim of the research is to explore the morphological and semantic problems of Persian language in connection with image search and retrieval among the three major and widespread search engines: Google, Yahoo and Bing. In order to collect the data, a checklist designed by the researcher was used and then the data were analyzed by descriptive and inferential statistics. Findings The results indicate that Google, Yahoo and Bing search engines do not pay enough attention to morphological and semantic features of Persian language in image search and retrieval. This research reveals that six groups of Persian language features include derived words, derived/compound words, Persian and Arabic Plural words, use of dotted T and the use of spoken language and polysemy, which are the major problems in this area. In addition, the results suggest that Google is the best search engine of all in terms of compatibility with Persian language features. Originality/value This study investigated some new aspects of the above-mentioned subject through combining morphological and semantic aspects of Persian language with image search and retrieval. Therefore, this study is an interdisciplinary research, the results of which would help both to offer some solutions and to carry out similar research on this subject area. This study will also fill a gap in research studies conducted so far in this area in Farsi language, especially in image search and retrieval. Moreover, findings of this study can help to bridge the gap between the user’s questions and search engines (systems) retrievals. 
In addition, the methodology of this paper provides a framework for further research on image search and retrieval in databases and search engines.
Article
Sketch-based image retrieval (SBIR) has been studied since the early 1990s and has drawn more and more interest recently. Yet, a comprehensive review of the SBIR field is still absent. This survey tries to fill in this gap by reviewing the representative papers studying the SBIR problem. More importantly, this survey tries to answer two important questions which are generally not well discussed: what are the objectives of SBIR, and what is the general methodology of SBIR? The reviewed papers are organized in a chronological way and analyzed by answering these two important questions. As a novel trend, fine-grained SBIR has become the main topic for the recent research. The discussion on it is also integrated. From this survey, we hope that different perspectives can be observed, common values can be discovered and new ideas can be inspired.
Article
The Internet of Things (IoT) and Big Data are among the most popular emerging fields of computer science today. IoT devices are creating an enormous amount of data daily on a different scale; hence, search engines must meet the requirements of rapid ingestion and processing followed by accurate and fast extraction. Researchers and students from the field of computer science query the search engines on these topics to reveal a wealth of IoT-related information. In this study, we evaluate the relative performance of two search engines: Bing and Yandex. This work proposes an automatic scheme that populates a sustainable optimal rank list of search results with higher precision for IoT-related topics. The proposed scheme rewrites the seed query with the help of attribute terms extracted from the page corpus. Additionally, we use newness and geo-sensitivity-based boosting and dampening of web pages for the re-ranking process. To evaluate the proposed scheme, we use an evaluation matrix based on discounted cumulative gain (DCG), normalized DCG (nDCG), and mean average precision (MAPn). The experimental results show that the proposed scheme achieves scores of MAP@5 = 0.60, DCG5 = 4.43, and nDCG5 = 0.95 for general queries; DCG5 = 4.14 and nDCG5 = 0.93 for time-stamp queries; and DCG5 = 4.15 and nDCG5 = 0.96 for geographical location-based queries. These outcomes validate the usefulness of the suggested system in helping a user to access IoT-related information.
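The abstract above reports DCG and nDCG at a cutoff of 5 without stating the exact formula. The following is a minimal sketch, assuming the common formulation in which the graded relevance at rank i is divided by log2(i + 1); the helper names and the relevance grades are hypothetical and do not reproduce the reported scores.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k ranked results."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG divided by the DCG of the same grades in ideal (descending) order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0
    return dcg_at_k(relevances, k) / ideal


# Hypothetical graded judgments (0 = irrelevant, 3 = highly relevant),
# listed in the order the engine ranked the results.
ranked_grades = [3, 2, 3, 0, 1]
dcg5 = dcg_at_k(ranked_grades, 5)
ndcg5 = ndcg_at_k(ranked_grades, 5)
```

nDCG is bounded by 1, which makes it comparable across queries with different numbers of relevant results, whereas raw DCG is not.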
Article
The rapid growth of digital image collections has prompted the need for development of software tools that facilitate efficient searching and retrieval of images from large image databases. Towards this goal, we propose a content-based image retrieval scheme for retrieval of images via their color, texture, and shape features. Using three specialized histograms (i.e. color, wavelet, and edge histograms), we show that a more accurate representation of the underlying distribution of the image features improves the retrieval quality. Furthermore, in an attempt to better represent the user’s information needs, our system provides an interactive search mechanism through the user interface. Users searching through the database can select the visual features and adjust the associated weights according to the aspects they wish to emphasize. The proposed histogram-based scheme has been thoroughly evaluated using two general-purpose image datasets consisting of 1000 and 3000 images, respectively. Experimental results show that this scheme not only improves the effectiveness of the CBIR system, but also improves the efficiency of the overall process.
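The abstract above describes matching images on color, wavelet and edge histograms with user-adjustable weights, but does not state the similarity measure. The following is a minimal sketch, assuming histogram intersection on normalized histograms and a weighted average across features; the function names, feature keys and bin values are hypothetical.

```python
from typing import Dict, List, Sequence


def normalize(hist: Sequence[float]) -> List[float]:
    """Scale a histogram so that its bins sum to 1."""
    total = sum(hist)
    if total == 0:
        return [0.0] * len(hist)
    return [value / total for value in hist]


def intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Histogram intersection: 1.0 for identical shapes, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(normalize(h1), normalize(h2)))


def combined_similarity(query: Dict[str, Sequence[float]],
                        candidate: Dict[str, Sequence[float]],
                        weights: Dict[str, float]) -> float:
    """Weighted average of per-feature histogram intersections."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    weighted = sum(
        weight * intersection(query[feature], candidate[feature])
        for feature, weight in weights.items()
    )
    return weighted / total_weight


# Hypothetical 4-bin histograms for two images.
query_image = {"color": [4, 0, 0, 0], "texture": [1, 1, 1, 1], "edge": [2, 2, 0, 0]}
other_image = {"color": [0, 4, 0, 0], "texture": [1, 1, 1, 1], "edge": [2, 2, 0, 0]}

# The user emphasizes color over texture and edges.
user_weights = {"color": 2.0, "texture": 1.0, "edge": 1.0}
similarity = combined_similarity(query_image, other_image, user_weights)
```

Raising the weight of one feature makes disagreement on that feature count for more, which is the interactive adjustment the abstract refers to.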
Article
Image retrieval is a technique for searching, browsing and retrieving images from an image database. Image search engines rely largely on surrounding text features. It is difficult for search engines to interpret users' search intention from keywords alone, and this leads to ambiguous and noisy search results that are far from satisfactory. Content-based search is needed to resolve the ambiguity in text-based image retrieval. The present paper discusses the concept and functioning of the most popular nonprofit reverse image search engines. We chose an image of Dr. S. R. Ranganathan to assess the performance and analyse the results of the most popular reverse image search engines.
Article
Textual data such as tags, sentence descriptions are combined with visual cues to reduce the semantic gap for image retrieval applications in today's Multimodal Image Retrieval (MIR) systems. However, all tags are treated as equally important in these systems, which may result in misalignment between visual and textual modalities during MIR training. This will further lead to degenerated retrieval performance at query time. To address this issue, we investigate the problem of tag importance prediction, where the goal is to automatically predict the tag importance and use it in image retrieval. To achieve this, we first propose a method to measure the relative importance of object and scene tags from image sentence descriptions. Using this as the ground truth, we present a tag importance prediction model by exploiting joint visual, semantic and context cues. The Structural Support Vector Machine (SSVM) formulation is adopted to ensure efficient training of the prediction model. Then, the Canonical Correlation Analysis (CCA) is employed to learn the relation between the image visual feature and tag importance to obtain robust retrieval performance. Experimental results on three real-world datasets show a significant performance improvement of the proposed MIR with Tag Importance Prediction (MIR/TIP) system over other MIR systems.
Article
This paper focuses on methodologies for organizing and structuring image databases. Conventional relational database techniques are optimized for textual and numeric data; however, they are not effective at handling image data. Some progress has been made in developing new approaches to establish and use image databases, but applying these approaches is very labor-intensive, error-prone, and impractical for large-scale databases. In this paper, we propose a new approach to develop the structure of a large-scale image database automatically. It is an integrated approach that draws on existing technologies for a new application focused on the management of image data. In addition, we present a solution to data indexing for image databases with different image types.
Article
Search engine retrieval effectiveness studies are usually small-scale, using only limited query samples. Furthermore, queries are selected by the researchers. We address these issues by taking a random representative sample of 1,000 informational and 1,000 navigational queries from a major German search engine and comparing Google's and Bing's results based on this sample. Jurors were found through crowdsourcing, data was collected using specialised software, the Relevance Assessment Tool (RAT). We found that while Google outperforms Bing in both query types, the difference in the performance for informational queries was rather low. However, for navigational queries, Google found the correct answer in 95.3 per cent of cases whereas Bing only found the correct answer 76.6 per cent of the time. We conclude that search engine performance on navigational queries is of great importance, as users in this case can clearly identify queries that have returned correct results. So, performance on this query type may contribute to explaining user satisfaction with search engines.
Article
We surveyed the current research on the retrieval and utilization of images. This paper provides an overview of the difficulties and possibilities of technological or technology-related research topics resulting from the huge amount of images currently available and highlights some of the important research topics. We looked at the ongoing research activities and analyzed them from four aspects: information access and organization technology, the computing infrastructure that enables access to large-scale image resources, issues in human-system interaction and human factors related to using images, and the social aspect of image media. On the technical side, we noticed that as the number of digital images increases, so does the importance of the accuracy and scalability in relation to the image retrieval. The accuracy and scalability are in fact needed to cope with the current explosion in digital images. On the social side, not so long ago, image retrieval technologies were only experimental tools or used by experts within a limited domain. However, now the general public has access to a wide range of digital images, which means that image retrieval technologies are being used by various users in a large diversity of social contexts. Thus, as we show in this paper, the accuracy and scalability are not the only important factors in the era of information explosion, but that researchers must also be concerned with the social aspects of these technologies.
Article
The literature of the evaluation of Internet search engines is reviewed. Although there have been many studies, there has been little consistency in the way such studies have been carried out. This problem is exacerbated by the fact that recall is virtually impossible to calculate in the fast changing Internet environment, and therefore the traditional Cranfield type of evaluation is not usually possible. A variety of alternative evaluation methods has been suggested to overcome this difficulty. The authors recommend that a standardised set of tools is developed for the evaluation of web search engines so that, in future, comparisons can be made between search engines more effectively, and that variations in performance of any given search engine over time can be tracked. The paper itself does not provide such a standard set of tools, but it investigates the issues and makes preliminary recommendations of the types of tools needed.
Article
Purpose – To describe a small-scale quantitative evaluation of the scholarly information search engine, Google Scholar. Design/methodology/approach – Google Scholar's ability to retrieve scholarly information was compared to that of three popular search engines: Ask.com, Google and Yahoo! Test queries were presented to all four search engines and the following measures were used to compare them: precision; Vaughan's Quality of Result Ranking; relative recall; and Vaughan's Ability to Retrieve Top Ranked Pages. Findings – Significant differences were found in the ability to retrieve top ranked pages between Ask.com and Google and between Ask.com and Google Scholar for scientific queries. No other significant differences were found between the search engines. This may be due to the relatively small sample size of eight queries. Results suggest that, for scientific queries, Google Scholar has the highest precision, relative recall and Ability to Retrieve Top Ranked Pages. However, it achieved the lowest score for these three measures for non-scientific queries. The best overall score for all four measures was achieved by Google. Vaughan's Quality of Result Ranking found a significant correlation between Google and scientific queries. Research limitations/implications – As with any search engine evaluation, the results pertain only to performance at the time of the study and must be considered in light of any subsequent changes in the search engine's configuration or functioning. Also, the relatively small sample size limits the scope of the study's findings. Practical implications – These results suggest that, although Google Scholar may prove useful to those in scientific disciplines, further development is necessary if it is to be useful to the scholarly community in general. Originality/value – This is a preliminary study in applying the accepted performance measures of precision and recall to Google Scholar. 
It provides information specialists and users with an objective evaluation of Google Scholar's abilities across both scientific and non-scientific disciplines and paves the way for a larger study.
Article
We analyzed transaction logs containing 51,473 queries posed by 18,113 users of Excite, a major Internet search service. We provide data on: (i) sessions — changes in queries during a session, number of pages viewed, and use of relevance feedback; (ii) queries — the number of search terms, and the use of logic and modifiers; and (iii) terms — their rank/frequency distribution and the most highly used search terms. We then shift the focus of analysis from the query to the user to gain insight to the characteristics of the Web user. With these characteristics as a basis, we then conducted a failure analysis, identifying trends among user mistakes. We conclude with a summary of findings and a discussion of the implications of these findings.
Article
This paper investigates the composition of search engine results pages. We define what elements the most popular web search engines use on their results pages (e.g., organic results, advertisements, shortcuts) and to which degree they are used for popular vs. rare queries. Therefore, we send 500 queries of both types to the major search engines Google, Yahoo, Live.com and Ask. We count how often the different elements are used by the individual engines. In total, our study is based on 42,758 elements. Findings include that search engines use quite different approaches to results pages composition and therefore, the user gets to see quite different results sets depending on the search engine and search query used. Organic results still play the major role in the results pages, but different shortcuts are of some importance, too. Regarding the frequency of certain hosts within the results sets, we find that all search engines show Wikipedia results quite often, while other hosts shown depend on the search engine used. Both Google and Yahoo prefer results from their own offerings (such as YouTube or Yahoo Answers). Since we used the .com interfaces of the search engines, results may not be valid for other country-specific interfaces.
Article
Use of test collections and evaluation measures to assess the effectiveness of information retrieval systems has its origins in work dating back to the early 1950s. Across the nearly 60 years since that work started, use of test collections is a de facto standard of evaluation. This monograph surveys the research conducted and explains the methods and measures devised for evaluation of retrieval systems, including a detailed look at the use of statistical significance testing in retrieval experimentation. This monograph reviews more recent examinations of the validity of the test collection approach and evaluation measures as well as outlining trends in current research exploiting query logs and live labs. At its core, the modern-day test collection is little different from the structures that the pioneering researchers in the 1950s and 1960s conceived of. This tutorial and review shows that despite its age, this long-standing evaluation method is still a highly valued tool for retrieval research.
Article
Purpose The purpose of this paper is to compare five major web search engines (Google, Yahoo, MSN, Ask.com, and Seekport) for their retrieval effectiveness, taking into account not only the results, but also the results descriptions. Design/methodology/approach The study uses real‐life queries. Results are made anonymous and are randomized. Results are judged by the persons posing the original queries. Findings The two major search engines, Google and Yahoo, perform best, and there are no significant differences between them. Google delivers significantly more relevant result descriptions than any other search engine. This could be one reason for users perceiving this engine as superior. Research limitations/implications The study is based on a user model where the user takes into account a certain amount of results rather systematically. This may not be the case in real life. Practical implications The paper implies that search engines should focus on relevant descriptions. Searchers are advised to use other search engines in addition to Google. Originality/value This is the first major study comparing results and descriptions systematically and proposes new retrieval measures to take into account results descriptions.
Conference Paper
This paper investigates the information retrieval effectiveness of major image search engines based on various query topics. Initially, major image search engines, namely, Google, Yahoo, Ask and MSN are selected. Then, seven appropriate topics are determined from the categories of the top search terms used on the web and five queries per topic are chosen. Each query is run on the selected image search engines separately and first forty images retrieved in each retrieval output are classified as being "relevant" or "non-relevant" to calculate precision ratios at various cut-off points for each pair of query and search engine. The results indicated that Google has the best overall retrieval effectiveness in topics "Automotive Manufacturers", "Broadcast Media", "Pharmaceutical and Medical Product" and "Movies" and was followed by MSN in topics "Food and Beverage Brands", "IT and Internet" and Ask in topic "Travel Destinations and Accommodations". All image search engines seem to have the lowest effectiveness for the topic "Food and Beverage Brands". The precision ratio of any one of the image search engines was not the same and changed for every topic.
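The precision ratios at cut-off points described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `precision_at_k`, the judgment list and the chosen cut-offs are hypothetical, and the sketch assumes binary relevance judgments listed in rank order.

```python
from typing import Dict, Sequence


def precision_at_k(judgments: Sequence[bool], k: int) -> float:
    """Share of relevant items among the first k ranked results.

    Assumption: the denominator is always k, even when fewer than k
    results were retrieved (some studies divide by the number retrieved).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(judgments[:k]) / k


# Hypothetical judgments for the top ten images of one query
# (True = relevant, False = non-relevant), in rank order.
judged = [True, True, False, True, False, False, True, False, True, True]

# Precision at two illustrative cut-off points.
precisions: Dict[int, float] = {k: precision_at_k(judged, k) for k in (5, 10)}
print(precisions)
```

Repeating the calculation for each pair of query and search engine, then averaging per engine, would give the kind of comparison the abstract reports.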
Chapter
Content-based image retrieval (CBIR) is a challenging domain that is used in various fields today, such as scientific research, medicine, the Internet, and other communication media. CBIR is an approach that allows a user to obtain an image based on a query from large datasets holding a huge number of images. Images play a big role in any of the media today, where communication and data transmission are carried out using specific data formats. Thus, for communication and information sharing via images, it is necessary to perform feature extraction and then further processing of the information content. A survey has been done on various content-based image retrieval techniques that have been proposed by various authors for the feature extraction of images and that are further used for classification.
Article
Current image retrieval techniques are mainly based on text or visual contents. However, both text-based and contents-based methods lack the capability of utilizing human intuition and KANSEI (impression). In this paper, we proposed an impression-based image retrieval method in order to realize the image retrieval according to our impression presented by impression keywords. We first propose a generic and specific impressions estimation method based on machine learning and then apply it to impression-based clothing fabric image retrieval. We use a semantic differential (SD) method to measure the user’s impressions such as brightness and warmth while they view a cloth fabric image. We also extract both global and local features of cloth fabric images such as color and texture using computer vision techniques. Then we use support vector regression to model the mapping functions between the generic impression (or specific impression) and image features. The learnt mapping functions are used to estimate the generic and specific impressions of cloth fabric images. The retrieval is done by comparing the query impression with the estimated impression of images in the database.
Article
Purpose – The purpose of this paper is to highlight the retrieval effectiveness of search engines taking into consideration both precision and relative recall. Design/methodology/approach – The study is based on search engines that are selected on the basis of Alexa (Actionable Analytics for the web) Rank. Alexa listed the top 500 sites, namely, search engines, portals, directories, social networking sites, networking tools, etc. But the scope of the study is confined to general search engines in English. Therefore only two general search engines are selected for the study. Alexa reports Google.com as the most visited website worldwide and Yahoo.com as the fourth most visited website globally. A total of 15 queries were selected randomly from PG students of the Department of Library and Information Science during a period of eight days (from May 8 to May 15, 2014) and were classified manually into navigational, informational and transactional queries. However, queries are largely distributed on the two selected search engines to check their retrieval effectiveness as a training data set in order to define some characteristics of each type. Each query was submitted to the selected search engines, which retrieved a large number of results, but only the first 30 results were evaluated to limit the study, in view of the fact that most users look only at the first hits of a query. Findings – The study estimated the precision and relative recall of Google and Yahoo. Queries using concepts in the field of Library and Information Science were tested and were divided into navigational queries, informational queries and transactional queries. Results of the study showed that the mean precision of Google was higher (1.10) than that of Yahoo (0.88). Likewise, the mean relative recall of Google was higher (0.68) than that of Yahoo (0.31). 
Research limitations/implications – The study highlights the retrieval effectiveness of only two search engines. Originality/value – The research work is authentic and does not contain any plagiarized work.
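The abstract above does not state how relative recall was computed. The sketch below shows one common pooled formulation, in which the relevant results found by all engines together stand in for the unknown full set of relevant documents. The function name, engine names and document identifiers are hypothetical, and the study itself may have used a different variant.

```python
from typing import Dict, Set


def relative_recall(relevant_by_engine: Dict[str, Set[str]]) -> Dict[str, float]:
    """Relative recall per engine under a pooled-relevance assumption.

    Each engine's relevant results are divided by the size of the pool,
    i.e. the union of relevant results found by all engines.
    """
    pool: Set[str] = set().union(*relevant_by_engine.values())
    if not pool:
        # No engine found anything relevant: report zero for every engine.
        return {name: 0.0 for name in relevant_by_engine}
    return {name: len(found) / len(pool) for name, found in relevant_by_engine.items()}


# Hypothetical relevant results for one query on two engines.
found = {
    "engine_a": {"d1", "d2", "d3"},
    "engine_b": {"d2", "d4"},
}
scores = relative_recall(found)
print(scores)
```

Because the pool only contains what the compared engines retrieved, the scores are meaningful relative to each other rather than as absolute recall.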
Article
Image search is the second most frequently used search service on the Web. However, there are very few studies investigating any aspect of it. In this study, we investigate the precision of Web image search engines of Google and Bing for popular and less popular entities using text-based queries. Furthermore, we investigate four additional aspects of Web image search engines that have not been studied before. We used 60 different queries in total from three different domains for popular and less popular categories. We examined the relevancy of the top 100 images for each query. Our results indicate that image search is a solved problem for popular entities. They deliver 97% precision on the average for popular entities. However, precision values are much lower for less popular entities. For the top 100 results, average precision is 48% for Google and 33% for Bing. The most important problem seems to be the worst cases in which the precision can be less than 10%. The results show that significant improvement is needed to better identify relevant images for less popular entities. One of the main issues is the association problem. When a Web page has query words and multiple images, both Google and Bing are having difficulty determining the relevant images.
Article
System performance assessment and comparison are fundamental for large-scale image search engine development. This article documents a set of comprehensive empirical studies to explore the effects of multiple query evidences on large-scale social image search. The search performance based on the social tags, different kinds of visual features and their combinations are systematically studied and analyzed. To quantify the visual query complexity, a novel quantitative metric is proposed and applied to assess the influences of different visual queries based on their complexity levels. Besides, we also study the effects of automatic text query expansion with social tags using a pseudo relevance feedback method on the retrieval performance. Our analysis of experimental results shows a few key research findings: (1) social tag-based retrieval methods can achieve much better results than content-based retrieval methods; (2) a combination of textual and visual features can significantly and consistently improve the search performance; (3) the complexity of image queries has a strong correlation with retrieval results’ quality—more complex queries lead to poorer search effectiveness; and (4) query expansion based on social tags frequently causes search topic drift and consequently leads to performance degradation.
Article
This paper suggests a theoretical basis for identifying and classifying the kinds of subjects a picture may have, using previously developed principles of cataloging and classification, and concepts taken from the philosophy of art, from meaning in language, and from visual perception. The purpose of developing this theoretical basis is to provide the reader with a means for evaluating, adapting, and applying presently existing indexing languages, or for devising new languages for pictorial materials; this paper does not attempt to invent or prescribe a particular indexing language.
Article
Five search engines, Alta Vista, Excite, Hotbot, Infoseek, and Lycos, are compared for precision on the first twenty results returned for fifteen queries. All searching was done from January 31 to March 12, 1997. In the study, steps are taken to ensure that bias has not unduly influenced the evaluation. Friedman's randomized block design is used to perform multiple comparisons for significance. Analysis shows that Alta Vista, Excite and Infoseek are the top three services, with their relative rank changing depending on how one interpreted the concept of "relevant." Correspondence analysis shows that Lycos performed better on short, unstructured queries, while Hotbot performed better on structured queries.
Article
Purpose The purpose of this paper is to discuss the importance of usability and overall user satisfaction when comparing performance of different search engines. Design/methodology/approach The study described in this paper starts from an investigation of existing methodologies for evaluating search engines in order to find out what are the most important factors for users to decide which system to use when searching the World Wide Web. Findings This study confirmed that usability and popularity are closely linked. This study has shown that no single search engine holds the key to ultimate search results. Just as there are cultural, political and geographical differences in the world's population, there are a number of search engines to fit the individual needs of every net citizen. Results, precision, recall and reliability are the factors that participants prize highly, regardless of all other aspects. It was also found that the speed of search engine results has become a high priority to participants. Research limitations/implications The number of participants was limited, and although some questions were confusing to some individuals, a majority of questionnaires were completed in a satisfactory fashion. Originality/value This paper describes a usability study involving different search engines, looking at links between popularity and usability.
Article
Focuses on access to digital image collections by means of manual and automatic indexing. Contains six sections: (1) Studies of Image Systems and their Use; (2) Approaches to Indexing Images; (3) Image Attributes; (4) Concept-Based Indexing; (5) Content-Based Indexing; and (6) Browsing in Image Retrieval. Contains 105 references. (AEF)
Article
Search engines are essential for finding information on the World Wide Web. We conducted a study to see how effective eight search engines are. Expert searchers sought information on the Web for users who had legitimate needs for information, and these users assessed the relevance of the information retrieved. We calculated traditional information retrieval measures of recall and precision at varying numbers of retrieved documents and used these as the bases for statistical comparisons of retrieval effectiveness among the eight search engines. We also calculated the likelihood that a document retrieved by one search engine was retrieved by other search engines as well.
Article
Measuring the information retrieval effectiveness of World Wide Web search engines is costly because of human relevance judgments involved. However, both for business enterprises and people it is important to know the most effective Web search engines, since such search engines help their users find higher number of relevant Web pages with less effort. Furthermore, this information can be used for several practical purposes. In this study we introduce automatic Web search engine evaluation method as an efficient and effective assessment tool of such systems. The experiments based on eight Web search engines, 25 queries, and binary user relevance judgments show that our method provides results consistent with human-based evaluations. It is shown that the observed consistencies are statistically significant. This indicates that the new method can be successfully used in the evaluation of Web search engines.
Article
The decisions that must be made by an investigator in carrying out an information retrieval experiment are described. Guidance is provided on a number of issues, specifically determining the need for testing, choosing the type of test (laboratory or operational), defining the variables, developing or using databases, finding queries, processing queries, assigning treatments to experimental units, collecting the data, analyzing the data, and presenting the results.
Conference Paper
An explosion of digital photography technologies that permit quick and easy uploading of any image to the web, coupled with the proliferation of personal, recreational users of the internet over the past several years, has resulted in millions of images being uploaded on the World Wide Web every day. Most of the uploaded images are not readily accessible as they are not organized so as to allow efficient searching, retrieval, and ultimately browsing. Currently major commercial search engines utilize a process known as Annotation Based Image Retrieval (ABIR) to execute search requests focused on retrieving an image. Despite the fact that the information sought is an image, the ABIR technique primarily relies on textual information associated with an image to complete the search and retrieval process. Using the game of cricket as the domain, this article compares the performance of three commonly used search engines for image retrieval: Google, Yahoo and MSN Live. Factors used for the evaluation of these search engines include query types, number of images retrieved, and the type of search engine. Results of the empirical evaluation show that while the Google search engine performed better than Yahoo and MSN Live in situations where there is no refiner, the performance of all three search engines dropped drastically when a refiner was added. Further research is needed to overcome the problems of manual annotation embodied in the annotation-based image retrieval problem.
Article
This paper presents an application of the model described in Part I to the evaluation of Web search engines by undergraduates. The study observed how 36 undergraduates used four major search engines to find information for their own individual problems and how they evaluated these engines based on actual interaction with the search engines. User evaluation was based on 16 performance measures representing five evaluation criteria: relevance, efficiency, utility, user satisfaction, and connectivity. Non-performance (user-related) measures were also applied. Each participant searched his/her own topic on all four engines and provided satisfaction ratings for system features and interaction and reasons for satisfaction. Each also made relevance judgements of retrieved items in relation to his/her own information need and participated in post-search interviews to provide reactions to the search results and overall performance. The study found significant differences in precision PR1, relative recall, user satisfaction with output display, time saving, value of search results, and overall performance among the four engines and also significant engine by discipline interactions on all these measures. In addition, the study found significant differences in user satisfaction with response time among the four engines, and significant engine by discipline interaction in user satisfaction with search interface. None of the four search engines dominated in every aspect of the multidimensional evaluation. Content analysis of verbal data identified a number of user criteria and users' evaluative comments based on these criteria. Results from both quantitative analysis and content analysis provide insight for system design and development, and useful feedback on strengths and weaknesses of search engines for system improvement.
Article
In this paper, we present a general approach for statistically evaluating precision of search engines on the Web. Search engines are evaluated in two steps based on a large number of sample queries: (a) computing relevance scores of hits from each search engine, and (b) ranking the search engines based on statistical comparison of the relevance scores. In computing relevance scores of hits, we study four relevance scoring algorithms. Three of them are variations of algorithms widely used in the traditional information retrieval field. They are cover density ranking, Okapi similarity measurement, and vector space model algorithms. In addition, we develop a new three-level scoring algorithm to mimic commonly used manual approaches. In ranking the search engines in terms of precision, we apply a statistical metric called probability of win. In our experiments, six popular search engines, AltaVista, Fast, Google, Go, iWon, and NorthernLight, were evaluated based on queries from two domains of interest: parallel and distributed processing, and knowledge and data engineering. The first query set contains 1726 queries collected from the index terms of papers published in the IEEE Transactions on Knowledge and Data Engineering. The second set contains 1383 queries collected from the index terms of papers published in the IEEE Transactions on Parallel and Distributed Systems. Search engines were queried and compared in two different search modes: the default search mode and the exact phrase search mode. Our experimental results show that these six search engines performed differently under different search modes and scoring methods. Overall, Google was the best. NorthernLight was mostly second in the default search mode, whereas iWon was mostly second in the exact phrase search mode.
Article
Purpose – The purpose of the paper is to evaluate the performance and efficiency of the five most used search engines, i.e. Google, Yahoo!, Live, Ask, and AOL, in retrieving internet resources at specific points of time using a large number of complex queries. Design/methodology/approach – In order to examine the performance of the five search engines, five sets of experiments were conducted using 50 complex queries within two different time frames. The data were evaluated using Excel and SPSS software. Findings – The paper results highlight the fact that different web search engines, which use different technology to find and present web information, yield different first page search results. The overall analysis of the findings of different measures reveals that Google has a significantly higher rate of performance in retrieving web resources as compared with the other four search engines. Yahoo! is the second best in terms of retrieval performance. The other three search engines did not perform satisfactorily compared with Google and Yahoo! Originality/value – The paper will provide important insight into the effectiveness of major search engines and their ability to retrieve relevant internet resources. This paper has produced key findings that are important for all web search engine users and researchers, and the web industry. The findings will also assist search companies to improve their services.
Article
Search engines have become the most important medium for Internet users to find pages on the web. They help customers to decrease their information overload, and enhance the sales of commercial web sites in different ways. For these reasons, the exploration of and changes in (human) online searching behaviour has become a subject of particular importance. This paper will help search engine and web site administrators and developers to monitor online searching behaviour properly and to derive strategies from the information gained. We define standard parameters against which search engines can be measured and compared. These parameters also reflect the online searching behaviour of search engine users. Therefore, an overview of studies conducted in the last few years is given. Statistics used in different papers are compared to extract standard parameters for online searching behaviour. In the next step, search queries from four different search engines covering periods of between 10 and 13 months are compared using these parameters. Using an automatic process, we retrieved around 99% of all search queries from three different search engine live tickers. For the first time, different data sets over a long period are compared. We observe that some patterns stay stable in a number of different search engines, and shed light on patterns that shorter analyses could not adequately examine. Our observations do not support the assumption that web search has become more business driven. For this reason, we introduce the concept of evergreens in search queries. One implication is that search engines should simplify web search interfaces for users since Boolean operators and special search features are rarely used. We also present the evergreen topics in search queries.
Article
This study investigates the accuracy of search engine hit counts for search queries. We investigate the accuracy of hit counts for Google, Yahoo and Microsoft Live Search, and the accuracy of single and multiple term queries. In addition, we investigate the consistency of hit count estimates for 15 days. The results show that all three provide estimates for the number of matching documents and the estimation patterns of their counting algorithms differ greatly. The accuracy of hit counts for multiple word queries has not been studied before. The results of our study show that the number of words in queries affects the accuracy of estimations significantly. The percentages of accurate hit count estimations are reduced almost by half when going from single word to two word query tests in all three search engines. With the increase in the number of query words, the error in estimation increases and the number of accurate estimations decreases.
Book
Class-tested and coherent, this textbook teaches classical and web information retrieval, including web search and the related areas of text classification and text clustering from basic concepts. It gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Slides and additional exercises (with solutions for lecturers) are also available through the book's supporting website to help course instructors prepare their lectures.
Article
This paper presents the results of a study of five search engines (AltaVista, Google, HotBot, Scirus and Bioweb) for retrieving scholarly information using Biotechnology related search terms. The search engines are evaluated taking the first ten results pertaining to 'scholarly information' for estimation of precision and recall. It shows that Scirus is most comprehensive in retrieving 'scholarly information' followed by Google and HotBot. It also reveals that the search engines (except Bioweb) perform well on structured queries while Bioweb performs better on unstructured queries.
Article
The goal of this study was to investigate the retrieval effectiveness of three popular German Web search services. For this purpose the engines Altavista.de, Google.de and Lycos.de were compared with each other in terms of the precision of their top 20 results. The test panelists were based on a collection of 50 randomly selected queries, and relevance assessments were made by independent jurors. Relevance assessments were acquired separately a) for the search results themselves and b) for the result descriptions on the search engine results pages. The basic findings were: 1.) Google reached the best result values. Statistical validation showed that Google performed significantly better than Altavista, but there was no significant difference between Google and Lycos. Lycos also attained better values than Altavista, but again the differences reached no significant value. In terms of top 20 precision, the experiment showed similar outcomes to the preceding retrieval test in 2002. Google, followed by Lycos and then Altavista, still performs best, but the gaps between the engines are closer now. 2.) There are big deviations between the relevance assignments based on the judgement of the results themselves and those based on the judgements of the result descriptions on the search engine results pages.
Article
In this demonstration, we present SIMPLIcity, an image retrieval system for picture libraries and biomedical image databases. The system uses a wavelet-based approach for feature extraction, real-time region segmentation, the Integrated Region Matching (IRM) metric, and image classification methods. Tested on large-scale picture libraries and a database of pathology images, the system has demonstrated accurate and fast retrieval. It is also exceptionally robust to image alterations.
Article
The effectiveness of twenty public search engines is evaluated using TREC-inspired methods and a set of 54 queries taken from real Web search logs. The World Wide Web is taken as the test collection and a combination of crawler and text retrieval system is evaluated. The engines are compared on a range of measures derivable from binary relevance judgments of the first seven live results returned. Statistical testing reveals a significant difference between engines and high inter-correlations between measures. Surprisingly, given the dynamic nature of the Web and the time elapsed, there is also a high correlation between results of this study and a previous study by Gordon and Pathak. For nearly all engines, there is a gradual decline in precision at increasing cutoff after some initial fluctuation. Performance of the engines as a group is found to be inferior to the group of participants in the TREC-8 Large Web task, although the best engines approach the median of those systems. Shortcomings of current Web search evaluation methodology are identified and recommendations are made for future improvements. In particular, the present study and its predecessors deal with queries which are assumed to derive from a need to find a selection of documents relevant to a topic. By contrast, real Web search reflects a range of other information need types which require different judging and different measures. The authors wish to acknowledge that this work was carried out partly within the Cooperative Research Centre for Advanced Computational Systems established under the Australian Government's Cooperative Research Centres Program.
Chu, H. and Rosenthal, M. (1996), "Search engines for the world wide web: a comparative study and evaluation methodology", ASIS Annual Conference Proceedings, Baltimore, MD, pp. 127-135, available at: www.asis.org/annual-96/ElectronicProceedings/chu.html
Ghosh, N., Agrawal, S. and Motwani, M. (2018), "A survey of feature extraction for content-based image retrieval system", in Tiwari, B., Tiwari, V., Das, K., Mishra, D. and Bansal, J. (Eds), Proceedings of International Conference on Recent Advancement on Computer and Communication. Lecture Notes in Networks and Systems, Springer, Singapore, Vol. 34.
Kumar, S. and Kumar, E. (2012), "An analysis in comparison related to the problem of developing webbased information systems", International Journal of Information Technology and Knowledge Management, Vol. 5 No. 1, p. 124.