Figure 4: Comparison of Search Techniques
Source publication
Article
Full-text available
This paper presents Ingrid, an architecture for a fully distributed, fully self-configuring information navigation infrastructure that is designed to scale to global proportions. Unlike current designs, Ingrid is not a hierarchy of large index servers. Rather, links are automatically placed between individual resources based on their topic similari...
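The abstract above describes Ingrid's core idea of linking resources by topic similarity. Below is a minimal, hypothetical sketch of that idea (not Ingrid's actual algorithm): it assumes each resource is reduced to a set of extracted terms, and a link is placed whenever the Jaccard similarity of two term sets crosses an arbitrary threshold.

from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def place_links(resources, threshold=0.3):
    """Yield pairs of resource URLs that should be linked by topic similarity."""
    for (u1, t1), (u2, t2) in combinations(resources.items(), 2):
        if jaccard(t1, t2) >= threshold:
            yield u1, u2

# Three invented resources, each described by a set of extracted terms.
resources = {
    "http://example.org/a": {"routing", "internet", "scaling"},
    "http://example.org/b": {"routing", "internet", "navigation"},
    "http://example.org/c": {"face", "recognition", "vision"},
}
print(list(place_links(resources)))  # [('http://example.org/a', 'http://example.org/b')]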

Contexts in source publication

Context 1
... and Harvest are just two of many examples of the latter category. As illustrated in Figure 4, the goal of Ingrid is to allow searching of the whole web, but necessarily with less depth than can be achieved with a single-database search engine. Thus, the functionality of Ingrid is complementary to that of limited-coverage single-search ...
Context 2
... examples of search engines that attempt to index all web resources are Lycos and WWWW. As shown in Figure 4, these whole-web search engines and Ingrid are attempting to do roughly the same job, and are therefore essentially competing technologies. Thus, we wish to briefly justify the work of Ingrid in light of whole-web search engines. The primary justification for work on Ingrid is scaling. It is not clear that the single-database approach will be able to keep pace with the growth of the web. So far, Lycos has apparently been able to keep pace, as it seems to consistently be indexing approximately 75% of the estimated 4 million (as of July 1995) total URLs. On one hand, 4 million documents barely scratches the surface of the total number of documents that can be expected to be available over the web in the future. On the other hand, Lycos has probably barely scratched the surface of what a "single" search-engine can do, given massive parallelism, huge memory farms, and the like. In short, the ability of a single-database search engine to index the entire web, and the associated costs, are unknown. Likewise, the ability of Ingrid to search the entire web is also unknown. Thus, it seems prudent to experiment with both ...
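As a quick back-of-the-envelope reading of the coverage figures quoted in this context:

\[
0.75 \times 4{,}000{,}000 \ \text{URLs} \approx 3{,}000{,}000 \ \text{URLs indexed by Lycos (July 1995 estimate)}
\]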

Similar publications

Article
Full-text available
Current web-based genome browsers require repetitious user input to scroll over long distances, alter the drawing density of elements or zoom through multiple orders of magnitude. Generally, either the server or the client is responsible for the majority of data processing, resulting in either servers having to receive and handle data...
Article
Full-text available
According to the characteristics of the management workflow in the department of experimental management of teaching administration, a solution for a campus experimental project management system based on Intranet technology has been designed. The solution, by adopting Internet technology and the Browser/Server system structure to analyse and d...
Article
Full-text available
Face recognition is a research field in computer vision that studies learning faces and determining the identity of a face from a picture sent to the system. By utilizing this face recognition technology, the process of learning people's identities among students in a university will become simpler. With this technology, students won't need to bro...

Citations

... The structure thus produced could be arranged hierarchically using a document clustering technique similar to Gloor's hypertext system [11], or could be arranged in a mesh of linked lists similar to the Ingrid system [9]. Each Referral server would be identified by a machine-readable abstract summarizing the scope of queries it could answer effectively. Part of this abstract might even be a piece of code that acts as a filter on incoming queries, choosing which should be directed to that Referral server. ...
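The excerpt above imagines each Referral server publishing a machine-readable abstract that may include filter code for incoming queries. Here is a hedged sketch of what such an abstract could look like; the server name, scope terms, and keyword-test filter are entirely hypothetical, since the excerpt does not say what the filter code would contain.

from dataclasses import dataclass
from typing import Callable

@dataclass
class ReferralAbstract:
    server: str          # address of the Referral server (hypothetical)
    scope: set           # topics it claims to answer effectively
    accepts: Callable    # filter code applied to incoming queries

def make_abstract(server, scope):
    # The filter below is a trivial keyword test standing in for the
    # "piece of code" mentioned in the excerpt.
    return ReferralAbstract(
        server=server,
        scope=scope,
        accepts=lambda query: any(term in query.lower() for term in scope),
    )

genomics = make_abstract("referral.genomics.example", {"genome", "sequence"})
print(genomics.accepts("genome browser scrolling"))   # True
print(genomics.accepts("campus intranet workflow"))   # False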
Article
Full-text available
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996. Includes bibliographical references (leaves 166-169).
... A new and evolving topic is how to find elements in support of peer-to-peer communication. Ingrid [15] was an early attempt to address the problem. In Ingrid, each entity is identified only by an unordered set of attribute-value pairs. ...
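The identification scheme mentioned in this excerpt, an entity named only by an unordered set of attribute-value pairs, can be illustrated with a small sketch; the attributes and values below are invented for illustration.

def entity_id(**attrs):
    """Build an order-independent identifier from attribute-value pairs."""
    return frozenset(attrs.items())

a = entity_id(topic="routing", author="francis", lang="en")
b = entity_id(lang="en", author="francis", topic="routing")
print(a == b)                        # True: the pairs form an unordered set
print(("topic", "routing") in a)     # True: membership test on a single pair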
Article
Naïve pictures of the Internet frequently portray a small collection of hosts or LANs connected by a "cloud" of connectivity. The truth is more complex. The IP-level structure of the Internet is composed of a large number of constituent networks, each of which differs in some or all of transmission technologies, routing protocols, administrative models, security policies, QoS capabilities, pricing mechanisms, and similar attributes. On top of this, a whole new structure of application-layer overlays and content distribution networks, equally diverse in the sorts of ways mentioned above, is rapidly evolving. Virtually any horizontal slice through the current Internet structure reveals a loosely coupled federation of separately defined, operated, and managed entities, interconnected to varying degrees, and often differing drastically in internal requirements and implementation. Intuitively, it is natural to think of each of these entities as existing in a region of the network, with each region having coherent internal technology and policies, and each region managing its interactions with other regions of the net according to some defined set of rules and policies. In this paper, we propose that a key design element in an architecture for extremely large scale, wide distribution and heterogeneous networks is a grouping and partitioning mechanism we call the region. Furthermore, we postulate that such a mechanism can provide increased functionality and management of existing unresolved problems in current networks. The paper both describes a proposed definition of the region concept and explores the utility of such a mechanism through a series of examples. We claim that there is significant added benefit to generalizing the idea of the region.
... The architecture of the search engine has primarily been centralized in nature. [BDH94], Ingrid [FKS95], the Context-Sensitive Infrastructure (CSI) [FEF01], [FES02], and a Content Discovery System (CDS) based on Rendezvous Points [GS02]. ...
... There have been many attempts to provide support for the information retrieval task. Techniques and approaches include, among others, query languages [11], routing protocols [12], hypertext navigation systems [13], semantic networks [25], network management protocols [8], software agents [4,9], brokers [4,7,9], metadata management [16,17,22], file search systems [10,20], ontologies [5,18,21], federated and heterogeneous databases [5,6,23], directory systems [26,28] and mediator technology [1,2,12,19,27]. The main network retrieval methodologies nowadays are the World Wide Web (HTTP) [3] with its various search engines, along with other subsystems such as WHOIS [26], Archie [10] and Ingrid [13]. Regarding network intelligence used for retrieval, the most prominent system is Harvest [7], which is a distributed information retrieval system. ...
Article
This paper addresses the issue of locating relevant information in a network of heterogeneous, unfederated information bases of various types, including structured databases, text, audio, picture and video files. The problem is to determine where the required information resides in a network, in locations unknown to the user. The objective is to construct a user-friendly, intelligent search and routing mechanism in order to find the most relevant information bases in the network. We introduce a mechanism for presenting queries, routing queries, updating knowledge, and learning in a metaknowledge base (MKB). This has been named the metaknowledge-based intelligent routing system (MIRS). MIRS finds the location of the desired information by its ability to “understand” the user’s query and to access information by content, rather than by address. MIRS behaves like a distributed search engine, working with a distributed metaknowledge index-file. There is no need for periodic web-crawling, web-robots, or agents of any sort. The network itself encapsulates the knowledge and routing algorithms that provide the user access-by-content to the relevant information. In contrast to web servers, the MIRS servers are not linked by hypertext links, but rather by knowledge links, randomly acquired or expertly built. The system also differs from the usual search engines in that it is capable of handling different types of media (e.g., text, database, multimedia), applies natural language parsing techniques to understand the intention of the user, and can potentially use a user profile to enhance the original query before distributing it over the network. The “metadata” describing the information bases are spread across a network of routing and information servers and are modified as a result of search operations and the introduction of new information bases into the system.
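To make the access-by-content idea above concrete, here is a hedged sketch of routing a query along knowledge links held in a distributed metaknowledge index; the server names, topic keys, and forwarding rule are illustrative assumptions, not the MIRS algorithm itself.

# Each server's metaknowledge: topic -> neighbor believed to cover that topic.
ROUTING_KNOWLEDGE = {
    "mirs-a.example": {"genomics": "mirs-b.example", "video": "mirs-c.example"},
    "mirs-b.example": {"genomics": "mirs-b.example"},   # answers genomics itself
    "mirs-c.example": {"video": "mirs-c.example"},      # answers video itself
}

def route(query_topic, start, max_hops=5):
    """Follow knowledge links by query content until some server claims the topic."""
    server = start
    for _ in range(max_hops):
        nxt = ROUTING_KNOWLEDGE.get(server, {}).get(query_topic)
        if nxt is None:
            return None          # no knowledge link for this topic
        if nxt == server:
            return server        # this server answers the query itself
        server = nxt
    return None                  # give up after too many hops

print(route("genomics", "mirs-a.example"))   # mirs-b.example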
... In some projects, completely decentralized architectures for information search were developed. In the Ingrid system [Francis et al., 1995] ...
Article
Full-text available
The research field of information retrieval is confronted with entirely new challenges by the emergence of the Internet and the World Wide Web. In contrast to conventional data collections, the Internet is characterized by its immense size, high dynamism, the heterogeneity of its contents, and its distribution across computers all over the world. In order to search this information space precisely, efficiently, and comprehensively, this work proposes a concept for specializing search engines on individual topic areas. Such search engines recognize the documents relevant to them by means of a special filter function and can offer their users a topic-specific user interface and search functionality. To efficiently locate the documents relevant to a specialized search engine, mobile-program technology is employed. Instead of transferring all documents to be examined to the search engine, mobile filter programs are sent to the data collections, examine them "on site", and return only the relevant documents. Methods are presented with which the dispatch of the mobile programs can be coordinated so that the resulting communication costs are minimized. Since these dispatch methods require knowledge of the network distance between the participating computers, an approach is also presented that enables the estimation of arbitrary network distances in the Internet in a scalable and efficient manner. The viability of the concepts for estimating network distances and for dispatching mobile programs is evaluated by means of extensive measurements. In addition, a case study analyzes the benefit of the topic-specific search engines as well as the use of mobile filter programs in their context.
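The dispatch idea in the abstract above, sending mobile filter programs to the data collections and returning only relevant documents, can be sketched roughly as follows; the hosts, estimated distances, filter rule, and documents are made up for illustration.

def filter_relevant(doc):
    """Stand-in for the thesis's topic-specific filter function."""
    return "retrieval" in doc.lower()

def dispatch(hosts, corpora):
    """Visit hosts nearest-first (by estimated network distance) and filter on site."""
    relevant = []
    for host in sorted(hosts, key=hosts.get):        # cheapest hosts first
        relevant.extend(doc for doc in corpora[host] if filter_relevant(doc))
    return relevant

hosts = {"far.example": 120.0, "near.example": 8.5}   # estimated distances (ms)
corpora = {
    "near.example": ["Mobile code for information retrieval", "Campus workflow notes"],
    "far.example": ["Distributed retrieval with mobile filters"],
}
print(dispatch(hosts, corpora))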
... Some of these suggest reliance on some additional naming scheme, such as Uniform Resource Names (URNs) [Sollins and Masinter, 1994], handles [Kahn and Wilensky, 1995], Persistent Uniform Resource Locators (PURLs) [OCLC] or Common Names [CNRP]. Other approaches involve monitoring and notification to ensure referential integrity (e.g., [Ingham et al., 1996], [Mind-it], [Macskassy and Shklar, 1997], [Francis et al., 1995]). In this paper, we demonstrate a different approach to this problem. ...
Article
We propose robust hyperlinks as a solution to the problem of broken hyperlinks. A robust hyperlink is a URL augmented with a small "signature", computed from the referenced document. The signature can be submitted as a query to web search engines to locate the document. It turns out that very small signatures are sufficient to readily locate individual documents out of the many millions on the web. Robust hyperlinks exhibit a number of desirable qualities: They can be computed and exploited automatically, are small and cheap to compute (so that it is practical to make all hyperlinks robust), do not require new server or infrastructure support, can be rolled out reasonably well in the existing URL syntax, can be used to automatically retrofit existing links to make them robust, and are easy to understand. In particular, one can start using robust hyperlinks now, as servers and web pages are mostly compatible as is, while clients can increase their support in the future. Robust hyperlinks are one example of using the web to bootstrap new features onto itself.
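A rough, hypothetical sketch of the robust-hyperlink idea described above: pick a handful of low-frequency words from the referenced document as a signature and append them to the URL, so the signature can later be submitted to a search engine as a query. The term-selection heuristic and the query-parameter name used here are stand-ins, not the paper's exact method.

from collections import Counter
from urllib.parse import urlencode

def signature(text, k=5):
    """Pick k low-frequency words from the document to act as its signature."""
    words = [w.lower() for w in text.split() if w.isalpha() and len(w) > 4]
    counts = Counter(words)
    return sorted(counts, key=lambda w: (counts[w], w))[:k]

def robust_url(url, text):
    """Append the signature to the URL as a query parameter (name is illustrative)."""
    return url + "?" + urlencode({"lexical-signature": " ".join(signature(text))})

doc = "Ingrid places links between resources according to their topic similarity"
print(robust_url("http://example.org/ingrid.html", doc))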
... Such methods can be roughly classified as being based on database or knowledge representation techniques (e.g., [24, 6] and [3, 40, 29, 34] respectively). From the DB perspective, the Web is regarded as a federation of databases, and query answering is based on the availability of ad hoc wrappers and mediators for each specific information source. ...
Article
Full-text available
The need for friendly environments for effective information access is further reinforced by the growth of the global Internet, which is causing a dramatic change in both the kind of people who access the information and the types of information itself (ranging from unstructured multimedia data to traditional record-oriented data). To cope with these new demands, the interaction techniques traditionally offered to users have to evolve and eventually integrate into a powerful interface to the global information infrastructure. The new interaction mechanisms must be especially friendly and easy to use, since, given the enormous quantity of information sources available on the Internet, most users remain "permanent novices" with respect to each one of the sources they have access to. This tutorial offers a survey of the main approaches adopted for letting users effectively interact with the Web. Thus, it covers topics related to both extracting the information of interest spre...
... Second, a need to perform directed searches. The formation of a search network based on interests was first proposed in [15]. ...
Article
This thesis proposes a reorganization algorithm, based on the region abstraction, to exploit the natural structure in overlays that stems from common interests. Nodes selfishly adapt their connectivity within the overlay in a distributed fashion such that the topology evolves to clusters of users with shared interests. Our architecture leverages the inherent heterogeneity of users and places within the system their incentives and ability to affect the network. As such, it is not dependent on the altruism of any other nodes in the system. Of particular interest is the optimality and fairness of our design. We rigorously define ideal and fair networks and develop a continuum of optimality measures by which to evaluate our algorithm. Further, to evaluate our algorithm within a realistic context, validate assumptions and make design decisions, we capture data from a portion of a live file-sharing network. More importantly, we discover, name, quantify and solve several previously unrecognized subtle problems in a content-based self-organizing network as a direct result of simulations using the trace data. We motivate our design by examining the dependence of existing systems on benevolent Super-Peers. Through simulation we find that the current architecture is highly dependent on the filtering capability and the willingness of the SuperPeer network to absorb the majority of the query burden. The remainder of the thesis is devoted to a world in which SuperPeers no longer exist or are untenable. In our evaluation, we introduce four reasons for utility suboptimal self-reorganizing networks: anarchy (selfish behavior), indifference, myopia and ordering. We simulate the level of utility and happiness achieved in existing architectures. Then we systematically tear down implicit assumptions of altruism while showing the resulting negative impact on utility. From a selfish equilibrium, with much lower global utility, we show the ability of our algorithm to reorganize and restore the utility of individual nodes, and the system as a whole, to similar levels as realized in the SuperPeer network. Simulation of our algorithm shows that it reaches the predicted optimal utility while providing fairness not realized in other systems. Further analysis includes an epsilon equilibrium model where we attempt to more accurately represent the actual reward function of nodes. We find that by employing such a model, over 60% of the nodes are connected. In addition, this model converges to a utility 34% greater than achieved in the SuperPeer network while making no assumptions on the benevolence of nodes or centralized organization. Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 92-95).
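The reorganization idea in the abstract above, nodes selfishly re-wiring toward peers with shared interests, can be illustrated with a simplified sketch; the interest sets, overlap measure, and neighbor budget below are illustrative assumptions rather than the thesis's actual algorithm.

def overlap(a, b):
    """Shared-interest overlap between two interest sets."""
    return len(a & b)

def choose_neighbors(my_interests, candidates, k=2):
    """Selfishly keep the k peers whose interests overlap most with ours."""
    ranked = sorted(candidates, key=lambda n: overlap(my_interests, candidates[n]),
                    reverse=True)
    return ranked[:k]

peers = {
    "n1": {"jazz", "blues"},
    "n2": {"jazz", "rock"},
    "n3": {"opera"},
}
print(choose_neighbors({"jazz", "blues", "rock"}, peers))   # ['n1', 'n2']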
Article
The Digital Revolution has been spreading with the popularization of the Internet and is likely to change the social structure. Electronic Commerce (EC) is a virtual environment supporting social activities, from information navigation/advertisement to payment and settlement to distribution, via open networks and digital information. This paper reviews the shift in emphasis from "atoms" (physical objects) to "bits" (digital information) and the current problems in EC services. We then survey recent trends in technologies supporting EC: information retrieval and advertisement, payment and settlement, and information distribution.
Article
In recognition of the need to provide better access to Web resources, a number of prototypes, products and services have emerged that provide some form of automated categorisation of Net resources. Several representative current efforts that apply established as well as more innovative methods of automated classification, organisation or other forms of categorisation are profiled.