Table 2 - uploaded by Steffen Bickel
Examples of the taxonomy mapping after applying the semi-automatic assignment.


Source publication
Article
Full-text available
The performance of search engines crucially depends on their ability to capture the meaning of a query most likely intended by the user. We study the problem of mapping a search engine query to those nodes of a given subject taxonomy that characterize its most likely meanings. We describe the architecture of a classification system that uses a web...

Contexts in source publication

Context 1
... directory node is assigned up to three categories of the subject taxonomy, resulting in an n : m mapping between the taxonomies. Some examples of the resulting mapping table are shown in Table 2. The web directory organizes country specific pages in the same structure as the main taxonomy. ...
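The mapping table can be pictured as a plain lookup from directory nodes to taxonomy categories. A minimal sketch in Python, with hypothetical node and category names (the paper's actual table contains 763 such rules):

```python
# Hypothetical excerpt of the n:m mapping table: each web-directory node
# is assigned up to three subject-taxonomy categories.
MAPPING = {
    "Computers/Internet/Searching": ["Computers\\Internet & Intranet",
                                     "Information\\Companies & Industries"],
    "Shopping/Vehicles": ["Living\\Car & Garage",
                          "Shopping\\Buying Guides & Researching"],
}

def taxonomy_categories(directory_node):
    """Look up the subject-taxonomy categories mapped to a directory node."""
    return MAPPING.get(directory_node, [])
```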
Context 2
... mapping recommendation system posts these queries to the web directory search engine; the categories of retrieved web pages are added to a candidate list of recommended mappings. After manually inspecting these recommendations for the KDD Cup task and accepting some 200 of them, we obtain a mapping that entails 763 rules following the schema visualized in Table 2. ...
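The recommendation step described above can be sketched as follows; `directory_search` is a stand-in for the web directory's search API (query string in, list of category paths out), not an actual interface from the paper:

```python
from collections import Counter

def recommend_mappings(labeled_queries, directory_search):
    """Collect candidate (directory category -> target category) mappings
    by posting each labeled query to the directory search engine and
    recording the categories of the retrieved pages."""
    candidates = Counter()
    for query, target_category in labeled_queries:
        for directory_category in directory_search(query):
            candidates[(directory_category, target_category)] += 1
    # Frequently co-occurring pairs are the most promising candidates
    # for manual review and acceptance.
    return [pair for pair, _ in candidates.most_common()]
```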

Citations

... These signals are sought so that Google can keep improving the quality of search results. Illyes also mentioned in the podcast episode that more of Google's signals may become machine-learning-based [17]. ...
Article
Artificial intelligence (AI) and machine learning are currently considered to be uniquely widespread innovations. Artificial intelligence used to be an incredible concept from science fiction, but nowadays it is becoming a day-to-day reality. A neural network emulates the behaviour of actual neurons in the brain, paving the way toward innovations in machine learning known as deep learning. Machine learning can help us live happier, better, and more productive lives if the power of deep learning is properly utilized as an industrial revolution that harnesses mental and cognitive ability. Many current research papers deal with the application of deep learning in various real-time settings, including intelligent gaming, smart driving, and environmental protection. Across all of these applications, intelligent decisions must be made in a timely manner to improve accuracy on one end and, on the other, to manage energy consumption and system efficiency. This paper presents various applications that use deep learning for better decision making and shows how to visualize the problems in order to reach better solutions. The analysis of such real-time problems is carried out using artificial neurons with supervised and unsupervised data.
... In its first stage, we use ODP to map query to an intermediate category, and in the second stage, it adopts exact matching and SVM classifier to obtain the target query intents. A variation of the method has won the query intent classification task of KDDCUP [37]. ...
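The two-stage scheme can be sketched roughly as below; the ODP lookup and the category-to-intent table are invented for illustration, and the SVM back-off of the actual method is omitted:

```python
# Hypothetical stage-2 table: intermediate ODP-style category -> target intent.
ODP_TO_INTENT = {
    "Computers/Software": "Computers\\Software",
    "Sports/Soccer": "Sports\\Other",
}

def classify_intent(query, odp_lookup):
    """Stage 1: map the query to intermediate ODP categories (odp_lookup).
    Stage 2: resolve them to target intents by exact matching."""
    intents = [ODP_TO_INTENT[c] for c in odp_lookup(query) if c in ODP_TO_INTENT]
    return intents or ["Other"]  # an SVM classifier would back off here
```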
... The performance of chatbots and search engines strongly depends on their ability to capture user intent. Vogel et al. (2005) addressed the issue of mapping a search engine query to those nodes of a subject taxonomy that express its possible meanings. Their user-intent classification architecture uses a web directory to determine the query context from query term frequencies. ...
Article
To support a natural flow of conversation between humans and automated agents, the rhetorical structure of each message has to be analyzed. We classify a pair of paragraphs of text as appropriate or inappropriate for one to follow another, based on both topic and communicative discourse considerations. To represent a multi-sentence message with respect to how it should follow a previous message in a conversation or dialogue, we build an extension of a discourse tree for it. An extended discourse tree is based on a discourse tree for RST relations with labels for communicative actions, plus additional arcs for anaphora and ontology-based relations between entities. We refer to such trees as Communicative Discourse Trees (CDTs). We explore syntactic and discourse features that are indicative of correct vs. incorrect request-response or question-answer pairs. Two learning frameworks are used to recognize such correct pairs: deterministic nearest-neighbor learning of CDTs as graphs, and tree kernel learning of CDTs, where a feature space of all CDT sub-trees is subject to SVM learning. We form the positive training set from correct pairs obtained from Yahoo! Answers, social networks, corporate conversations including Enron emails, customer complaints, and interviews by journalists. The corresponding negative training set is created artificially by attaching responses to different, inappropriate requests that include relevant keywords. The evaluation showed that it is possible to recognize valid pairs in 70% of cases in domains of weak request-response agreement and in 80% of cases in domains of strong agreement, which is essential to support automated conversations. These accuracies are comparable with the benchmark task of classifying discourse trees themselves as valid or invalid, and with classifying multi-sentence answers in factoid question-answering systems.
The applicability of the proposed machinery to chatbots, social chats, and programming via natural language is demonstrated. We conclude that learning rhetorical structures in the form of CDTs is a key source of data for supporting complex question answering, chatbots, and dialogue management.
... Finally, their solution retrieves the top five categories returned by the neural network or the mapping of categories obtained with the two search engines. Other researchers who participated in the KDD contest published their solution [137]. They propose a similar approach that sends queries to a web directory in order to obtain a list of categories. ...
Thesis
During the last few years, technological progress in collecting, storing, and processing large quantities of data at a reasonable cost has raised serious privacy issues. Privacy concerns many areas, but is especially important in frequently used services like search engines (e.g., Google, Bing, Yahoo!). These services allow users to retrieve relevant content on the Internet by exploiting their personal data. In this context, developing solutions that enable users to use these services in a privacy-preserving way is becoming increasingly important. In this thesis, we introduce SimAttack, an attack against existing mechanisms for querying search engines in a privacy-preserving way. The attack aims at retrieving the original user query. With this attack, we show that three representative state-of-the-art solutions do not protect user privacy in a satisfactory manner. We therefore develop PEAS, a new protection mechanism that better protects user privacy. This solution leverages two types of protection: hiding the user's identity (with a succession of two nodes) and masking the user's queries (by combining them with several fake queries). To generate realistic fake queries, PEAS exploits previous queries sent by users of the system. Finally, we present mechanisms to identify sensitive queries. Our goal is to adapt existing protection mechanisms to protect sensitive queries only, and thus save user resources (e.g., CPU, RAM). We design two modules to identify sensitive queries. By deploying these modules on real protection mechanisms, we establish empirically that they dramatically improve the mechanisms' performance.
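The query-masking idea can be illustrated with a short sketch; this shows only the masking half of PEAS, with an invented interface, and omits the two-node relay that hides the user's identity:

```python
import random

def mask_query(real_query, past_queries, k=3):
    """Send the real query together with k fake queries sampled from
    previously observed queries, shuffled so the server cannot tell
    which of the k+1 queries is genuine."""
    batch = random.sample(past_queries, k) + [real_query]
    random.shuffle(batch)
    return batch
```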
... David Vogel et al. [2005] demonstrate that most web search queries contain only two or three terms and therefore provide very limited information about the user's information need to the search engine. Similar work has been carried out by Milos and Mirjana in CatS [2006]. ...
... Amrish Singh et al. (2005) proposed an approach to presenting web search results that supports personalization, taking users' perspectives into consideration. David Vogel et al. (2005) demonstrate that most web search queries contain only two or three terms and therefore provide very limited information about the user's information need to the search engine. Similar work has been carried out by Milos and Mirjana in CatS (2006). ...
... The task in KDDCUP 2005 was to automatically classify 800,000 queries into 67 predetermined categories with only 111 manually labelled queries. Most solutions submitted by the 32 teams (Kardkovács, Tikk, & Bánsághi, 2005; Vogel et al., 2005) gathered extra information to augment query terms. Although the participants were able to achieve encouraging results, with the median F1 score at 0.23 (max. ...
... It is basically a post-retrieval algorithm that uses document classification techniques to organize search results into a meaningful hierarchy of topics, based on the perspective of the user performing the search, represented as a taxonomic ontology. David Vogel et al. [8] demonstrate that most web search queries contain only two or three terms and therefore provide very limited information about the user's information need to the search engine. Utilizing this information is a key factor in constructing effective web search engines. ...
Article
Full-text available
The World Wide Web has immense resources for all kinds of people and their specific needs. Searching the Web using search engines such as Google, Bing, and Ask has become an extremely common way of locating information. Searches are formulated using individual terms or keywords, sequences of them, or short sentences. The challenge for the user is to come up with a set of search terms/keywords/sentences that is neither too large (making the search too specific and resulting in many false negatives) nor too small (making the search too general and resulting in many false positives) to get the desired result. No matter how the user specifies the search query, the results retrieved, organized, and presented by the search engines come as millions of linked pages, many of which might not be useful to the user. In fact, the end user never knows which pages exactly match the query and which do not until one checks the pages individually. This task is quite tedious, a kind of drudgery, owing to the lack of refinement and meaningful classification of the search results. Providing accurate and precise results to end users has become the Holy Grail for search engines like Google, Bing, and Ask. A number of implementations have appeared on the web to provide better results to users, such as DuckDuckGo, Yippy, and Dogpile. This research proposes the development of a meta-search engine, called WebSEReleC (Web-based SEReleC), that provides an interface for refining and classifying search engines' results so as to narrow them down in a sequentially linked manner, resulting in a drastic reduction in the number of pages, using the power of Google.
... It is also an efficient research tool because it allows users to record search results and create reports of their research easily and automatically. Other related works [7] [8] [9] [10] [11] [12] [13] have also been carried out, but none of them addresses all the problems discussed in section 2. ...
Article
Full-text available
The World Wide Web has immense resources for all kinds of people and their specific needs. Searching the Web using search engines such as Google, Bing, and Ask has become an extremely common way of locating information. Searches are formulated using individual terms or keywords, sequences of them, or short sentences. The challenge for the user is to come up with a set of search terms/keywords/sentences that is neither too large (making the search too specific and resulting in many false negatives) nor too small (making the search too general and resulting in many false positives) to get the desired result. No matter how the user specifies the search query, the results retrieved, organized, and presented by the search engines come as millions of linked pages, many of which might not be useful to the user. In fact, the end user never knows which pages exactly match the query and which do not until one checks each page individually. Providing accurate and precise results to end users has become the Holy Grail for search engines like Google, Bing, and Ask. This research proposes a meta-search engine called EGG that is intended to use the power of Google for more accurate and combinatorial search. This is achieved through simple manipulation and automation of Google functions that are accessible from EGG through Google.
... However some methods have been successful for query topic classification, e.g. utilising additional unlabeled data (Taksa et al., 2007; Beitzel, Jensen, Lewis, Chowdhury and Frieder, 2007) and bridging topic hierarchies to enable training on larger datasets (Li et al., 2005; Vogel et al., 2005; Shen et al., 2006a). As a result query topic classification can be useful in many tasks, including: ...
Article
Purpose: This work aims to investigate the sensitivity of ranking performance with respect to the topic distribution of queries selected for ranking evaluation.
Design/methodology/approach: The authors reweight queries used in two TREC tasks to make them match three real background topic distributions, and show that the performance rankings of retrieval systems are quite different.
Findings: It is found that search engines tend to perform similarly on queries about the same topic, and that search engine performance is sensitive to the topic distribution of queries used in evaluation.
Originality/value: Using experiments with multiple real-world query logs, the paper demonstrates weaknesses in the current evaluation model of retrieval systems.
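The reweighting idea can be sketched as a topic-weighted mean of per-query scores; the names and the simple averaging scheme below are illustrative, not the paper's exact protocol:

```python
from collections import defaultdict

def reweighted_score(per_query_scores, query_topics, topic_distribution):
    """Average per-query effectiveness scores within each topic, then
    combine the topic means according to a target background
    distribution of topics."""
    by_topic = defaultdict(list)
    for score, topic in zip(per_query_scores, query_topics):
        by_topic[topic].append(score)
    return sum(w * sum(by_topic[t]) / len(by_topic[t])
               for t, w in topic_distribution.items() if by_topic[t])
```

With this kind of reweighting, two systems that tie on a uniform query sample can swap ranks once queries are weighted to match a real query log's topic mix.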