Figure 2 - uploaded by Surendar Chandra
Example of a fully connected overlay topology over a wide area network. 

Source publication
Article
Full-text available
In this poster we will present our work on the design of efficient and reliable unstructured peer-to-peer (P2P) systems. Our work focuses on creating a well-connected unstructured P2P overlay that can perform efficient searching and message routing. We show that well designed systems can tolerate high node failures (>25%) while maintaining connectiv...

Context in source publication

Context 1
... this poster we will present our work on the design of efficient and reliable unstructured peer-to-peer (P2P) systems. Our work focuses on creating well-connected unstructured P2P overlays that can perform efficient searching and message routing. We show that well designed systems can tolerate high node failures (>25%) while maintaining connectivity and still resolving searches with few messages. This is important in applications such as file sharing and content distribution, where there are many thousands of participating nodes that are widely dispersed and network conditions are highly variable. Structured P2P systems are not suitable for these applications, since such applications require multi-attribute and wild-card searching. We show that carefully constructed overlays can resolve this type of search within 4 hops for large networks (>10,000 nodes) with low object replication ratios (<1%).

Peer-to-peer (P2P) file sharing networks such as Gnutella [2], Kazaa [3] and BitTorrent [1] have become increasingly popular. The popularity of P2P networks has fueled interest in leveraging them to build large scale distributed applications such as distributed data storage [10], cooperative backup [8], and distributed multicast [6]. Consider the problem of distributing large (>10 MB) multimedia files across a wide area network to many users. A centralized approach requires the publisher of the content to have the server and network infrastructure to host the content, including enough bandwidth to handle a large number of users simultaneously downloading it. Such infrastructure does not scale well as the number of users increases, and it requires a large financial commitment that small content publishers will likely not be able to make. Additionally, such centralized infrastructures are susceptible to failure as well as targeted attacks. 
Ideally, the content publisher would place a few replicas of the content on different nodes in the network, and users would get redirected to the replica that is nearest to them. If the object becomes popular, more replicas of it should be created to limit the bandwidth consumption on wide area network links by allowing nodes to download replicas that are near them. On the user side of the application, objects and replicas in the system need to be discovered efficiently; this includes efficiently locating the nearest replica. The decentralized and self-organizing nature of P2P networks is ideally suited to solving this type of problem.

In this poster we will present our work exploring the use of unstructured P2P systems as viable platforms for distributed sharing and distribution of content. Specifically, we will present our work in exploiting the inherent flexibility of the unstructured P2P model to create well connected and fault-tolerant overlays with efficient searching and routing mechanisms. Our work shows that these efficient P2P networks can significantly outperform current unstructured P2P networks by overcoming the key limitations of these systems.

Unstructured P2P overlays are inherently flexible in their neighbor selection and routing mechanisms. They can leverage proximity information of the underlying network to localize the communication pattern of the system. They can create topologies that are resilient to random node failures and can withstand targeted malicious attacks. However, traditional unstructured P2P overlays do not exploit many of these benefits. Analysis of popular unstructured P2P networks shows that current systems create topologies [12] and utilize search mechanisms [9] that do not match the underlying network characteristics. In particular, these systems exhibit preferential connection tendencies toward highly connected nodes, as witnessed in power-law networks [4]. 
Such overlays are vulnerable to the failure of these highly connected nodes. Additionally, these overlays exhibit high communication costs, since the peer selection process ignores network proximity and thus nodes tend to select neighbors that are distant in terms of network latency.

Creating an unstructured P2P topology with desirable proximity-awareness properties poses several problems. First, transient network conditions and node lifespans require a distributed solution where each node makes independent decisions with limited dependence on global information from other nodes. Also, some nodes may naturally appear desirable to many nodes; the topology generation mechanism should balance the use of proximity information against the capacity of the node to service these neighbors. The topology should also maintain global connectivity, especially in the face of node failures.

Our aim is to examine the performance of the different algorithms used for peer selection in unstructured P2P systems and determine which algorithms yield overlays that best achieve the goals of low communication cost, good connectivity, and efficient searching. We are interested in determining whether a given overlay has desirable connectivity properties. The connectivity and compactness of the overlay affect the fault tolerance of the overlay to node failures as well as the ability to reach many nodes quickly, and thus affect the efficiency of search mechanisms. However, there is a need to balance good connectivity with network scalability. As an example, consider a ring topology as shown in Figure 1. Ring topologies are sparse (O(n) edges), so each node only needs to maintain two connections. However, the network can be easily partitioned with just two node failures. Further, because each node maintains only two connections, high capacity nodes will be underutilized. 
On the other hand, a fully connected topology as shown in Figure 2 can tolerate many faults before the network becomes partitioned. However, because such a network is dense and has many edges (O(n²) edges), each node must maintain many connections. This solution is not scalable, and it forces low capacity nodes to maintain more connections than they may be able to handle.

The process of creating and maintaining the overlay in P2P systems is decentralized and distributed: each node must make local decisions without requiring global information about the system. To create the overlay, nodes find peers that are already in the network, evaluate which peers would be better neighbors, and then connect to those peers. We can describe a P2P system with the following abstract ...
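The ring-versus-full-mesh trade-off described above (O(n) edges but fragile, versus O(n²) edges but fault-tolerant) can be checked with a small sketch. This is an illustrative exercise, not code from the poster; all function names are invented for the example.

```python
import itertools


def ring_edges(n):
    """Edges of a ring overlay: each node connects to its successor -> O(n) edges."""
    return [(i, (i + 1) % n) for i in range(n)]


def full_mesh_edges(n):
    """Edges of a fully connected overlay: every pair connected -> O(n^2) edges."""
    return list(itertools.combinations(range(n), 2))


def is_connected(n, edges, failed):
    """Depth-first search over surviving nodes to test whether the given
    node failures partition the overlay."""
    alive = set(range(n)) - failed
    if not alive:
        return True
    adj = {v: set() for v in alive}
    for a, b in edges:
        if a in alive and b in alive:
            adj[a].add(b)
            adj[b].add(a)
    start = next(iter(alive))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == alive


n = 12
print(len(ring_edges(n)))        # 12 edges: O(n)
print(len(full_mesh_edges(n)))   # 66 edges: O(n^2)
# Removing two non-adjacent nodes partitions the ring but not the full mesh:
print(is_connected(n, ring_edges(n), {0, 6}))       # False
print(is_connected(n, full_mesh_edges(n), {0, 6}))  # True
```

The sketch makes the scalability tension concrete: the ring keeps per-node state tiny but loses global connectivity after just two failures, while the full mesh survives them at a quadratic edge cost.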

Similar publications

Article
Full-text available
Current peer-to-peer (p2p) systems facilitate static file sharing, while future peer-to-peer systems will support sharing of files that are frequently modified by their users. Replication techniques are essential to reduce the load of the nodes hosting these files. Maintaining consistency between frequently updated files and their replicas is an ess...

Citations

... There is a need for an efficient searching mechanism that incurs low access delay and low overhead to successfully locate resources among the various peers in the network. Existing P2P resource sharing systems rely either on a centralized server or on various flooding algorithms, but do not perform efficient searching because of ineffective techniques to search for and retrieve content from the peers in the network [5]. In unstructured P2P networks, peers are termed blind because they do not have the capability to search for and determine neighbor peers which satisfy resource queries [6]. ...
Article
Efficient search is a challenging task in unstructured peer-to-peer networks. In this paper, Knowledge and Cache based Adaptive Query Searching (KCAQS) is proposed, which adaptively performs query searching through either directed flooding or biased random walk based on the number of hop counts in the query message. In addition, knowledge-intended forwarding is deployed for forwarding a query to high quality peers through probabilistic knowledge predicted from previously requested queries. Search results are cached in the peers along the returning path. Synchronized caching is performed to properly update the responses of each peer to its connected corresponding high-degree-connectivity peer in the overlay network. Due to caching of the same files in many peers, most of the cached responses may become redundant. In order to avoid redundant data, cache consistency is sustained through a flexible polling mechanism in which cache updates are performed through an Additive Decrease Multiplicative Increase (ADMI) algorithm based on file utility. Our experimental study shows that the proposed searching scheme significantly reduces the network search traffic and communication overhead. Performance metrics such as success rate, access latency, network traffic, response time and cache hit ratio are evaluated for the proposed scheme.
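The ADMI cache-update idea in the abstract above can be sketched as one update step on a polling interval: poll sooner (additive decrease) when a file's utility is high, back off (multiplicative increase) when it is low. The paper does not give its parameters, so every name, threshold, and constant below is an illustrative assumption, not KCAQS itself.

```python
def admi_update(interval, utility, threshold=0.5,
                step=1.0, factor=2.0, lo=1.0, hi=300.0):
    """One ADMI step on a cache-polling interval (in seconds, hypothetical units).

    High file utility  -> additively decrease the interval (poll more often).
    Low file utility   -> multiplicatively increase the interval (poll less often).
    The result is clamped to [lo, hi] so polling never stops or thrashes.
    """
    if utility >= threshold:
        interval -= step      # additive decrease: frequently used file, keep fresh
    else:
        interval *= factor    # multiplicative increase: cold file, back off quickly
    return max(lo, min(hi, interval))


# A hot file (utility 0.9) is polled sooner; a cold one (utility 0.1) much later.
print(admi_update(10.0, 0.9))  # 9.0
print(admi_update(10.0, 0.1))  # 20.0
```

The asymmetry mirrors AIMD-style congestion control in reverse: stale cold entries are aged out aggressively, while hot entries converge slowly toward the minimum interval.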
... Their work is interesting in understanding the flooding problem P2P overlays create. Acosta and Chandra [2] argue that the topological characteristics of P2P networks affect the way searching is carried out. There is a tendency to connect to nodes depending on their topological proximity, and that has a great impact on the horizon of each node. ...
Conference Paper
Full-text available
Peer-to-peer networks exist through the voluntary cooperation of various entities on the Internet. Their self-structuring nature has the important characteristic that they make no use of central entities acting as coordinators, and the benefits of this cooperation can be enjoyed equally by all members of the community, under the assumption that they all make correct use of the protocol. In this study we examine the consequences for the community of the existence of misbehaving nodes, which can abuse the network's resources for their personal benefit, and we also analyze the cost and the benefit of some proposed solutions that could be used to address this problem.
... By building structured networks we provide minimum join latency, fast failure recovery and a fast lookup mechanism. Several previous works ([9], [36]) have proposed unstructured multicast tree construction protocols that manage the peer and stream information. Unstructured networks lead to non-deterministic join latency and non-deterministic lookup time for peer information. ...
... Therefore, the network latency in DHT-based systems can be high and could adversely affect the performance of the applications running over them. An unstructured topology [22] [9] is composed of peers that join the network according to loosely defined rules. Peers do not have any prior knowledge of the topology. ...
Article
Bandwidth and resource constraints at the server side are a limitation for the deployment of streaming media applications, and often lead to saturation of resources during sudden increases in requests. End System Multicast (ESM) is used to overcome the problem of resource saturation: resources such as storage and bandwidth available at the end systems are utilized to deliver streaming media. In ESM, the end systems (also known as peers) form a network commonly known as a Peer-to-Peer (P2P) network. Peers that receive the stream in turn act as routable components and forward the stream to other requesters. These peers do not possess server-like characteristics; they differ from a server in the following ways: (a) they join and exit the system at will, and (b) unlike servers, they are not a reliable source of media. This induces instability in the network, so a streaming media solution over such an unstable peer network is a challenging task. Two kinds of media streaming are supported by ESM, namely live streaming and on-demand streaming. ESM is well studied for live streaming media. In this thesis we explore the effectiveness of using ESM to support on-demand streaming media over a P2P network. There are two major issues in supporting on-demand streaming video: (a) unlike live streaming, every request should be served from the beginning of the stream, and (b) instability in the network due to peer characteristics (particularly the transience of peers). In our work, late-arriving peers can join an existing stream if the initial segments can be served to them. In this scheme, a single stream is used to serve multiple requests and therefore the throughput increases. We propose a patching mechanism in which the initial segments of media are temporarily cached in the peers as patches. The peers, as they join, contribute storage, and this storage space is used to cache the initial segments. 
The patching mechanism is controlled by the Expanding Window Control Protocol (EWCP). EWCP defines a “virtual window” that logically represents the aggregate cache contributed by the peers. The window expands as the peers contribute more resources; the larger the window, the more clients can be served by a single stream. A GAP is formed when contiguous segments of media are lost, and it limits the expansion of the virtual window. We explore the conditions that lead to the formation of GAPs: they arise from the transience and non-cooperation of peers. The transience of peers, coupled with the real-time nature of the application, requires fast failure recovery algorithms and methods to overcome the loss of media segments. We propose an efficient peer management protocol that provides constant failure recovery time, and we explore several redundancy techniques to overcome the loss of video segments during peer transience. Peer characteristics (duration, resource contribution, etc.) have a significant impact on performance, so the design of a peer management protocol must include peer characteristics to increase its effectiveness. In this thesis we present a detailed analysis of the relationship between peer characteristics and performance. Our results indicate that peer characteristics and the real-time nature of the application control the performance of the system. Based on our study, we propose algorithms that consider these parameters and increase the performance of the system. Finally, we bring all the pieces of our work together into a comprehensive system architecture for streaming media over P2P networks. We have implemented a prototype Black-Board System (BBS), a distance program utility that reflects the main concepts of our work. We show that algorithms that exploit peer characteristics perform well in P2P networks.
... There are three search mechanisms [15] in current unstructured P2P networks: (1) flooding search, (2) random walk, and (3) identifier search. In a flooding search, when a node receives a query, it simply forwards the query to all of its neighbors. ...
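The flooding search described above (each node forwards the query to all of its neighbors) can be sketched as a TTL-limited breadth-first traversal over an adjacency list. This is a generic illustration of the mechanism, not code from any of the cited systems; all names are invented for the example.

```python
from collections import deque


def flood_search(neighbors, holdings, origin, obj, ttl=4):
    """TTL-limited flooding search.

    neighbors: dict mapping node -> list of neighbor nodes (the overlay).
    holdings:  dict mapping node -> set of objects stored at that node.
    Returns (set of nodes found holding obj, total query messages sent).
    """
    seen = {origin}
    queue = deque([(origin, ttl)])
    hits, messages = set(), 0
    while queue:
        node, budget = queue.popleft()
        if obj in holdings.get(node, ()):
            hits.add(node)
        if budget == 0:
            continue  # TTL exhausted; this node does not forward further
        for peer in neighbors.get(node, ()):
            messages += 1  # one query message per overlay link traversed
            if peer not in seen:
                seen.add(peer)
                queue.append((peer, budget - 1))
    return hits, messages


# A 4-node overlay where only node 3 holds the object:
overlay = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
stored = {3: {"song.mp3"}}
print(flood_search(overlay, stored, origin=0, obj="song.mp3", ttl=2))
```

Note how the message count grows with the edge density of the overlay even when the object sits only two hops away; this is exactly the traffic cost that biased random walks and directed flooding try to avoid.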
... Although unstructured P2P systems have the shortcomings mentioned above, they do have great flexibility and support applications that require multi-attribute and wild-card searching, for which structured P2P systems are not suitable. In [15] the authors show that a carefully constructed unstructured overlay can resolve this type of search within few hops for large networks and at low replication ratios. ...
Article
We propose a multiple-tree overlay structure for resource discovery in unstructured P2P systems. Peers that have similar interests or hold similar types of resources are grouped into a tree-like cluster. We exploit the heterogeneity of peers in each cluster by connecting peers with more capacity closer to the root of the tree. The capacity of a peer can be defined in different ways (e.g. higher network bandwidth, larger disk space, more data items of a certain type, etc.) according to the needs of users or applications.
Conference Paper
The success of content distribution oriented peer-to-peer systems heavily depends on the resource discovery mechanism. In large-scale distributed systems, this mechanism must be scalable and robust. The paper proposes a structured solution for resource discovery in decentralized peer-to-peer systems, which is guided by peer interest in collaborating with other peers. The problem of discovering peers of interest has many applications in file sharing, in data-aware scheduling, and in optimizing file and document downloads. Moreover, if trust is added as another parameter to define peers of interest, interest-based discovery is useful in trusted P2P applications. We focused on developing an overlay network that ensures a very small number of messages is required to retrieve, insert or delete a file, even in the case of a very large network containing millions of nodes. In the experimental validation we used OverSim, a simulation tool for P2P systems. The experimental results highlight the good performance obtained regarding message communication and the system's scalability.
Conference Paper
While many BSN applications require that sensor nodes be able to operate for extended periods of time, they also often require the wireless transmission of copious amounts of sensor data to a data aggregator or base station, where the raw data is processed into application-relevant information. The energy requirements of such streaming can be prohibitive, given the competing considerations of form factor and battery life requirements. Making intelligent decisions on the node about which data to store or transmit, and which to ignore, is a promising method of reducing energy consumption. Artificial neural network (ANN) classifiers are among several competitive techniques for such data selection. However, no systematic metrics exist for determining if an ANN classifier is suited for a particular resource constrained computing environment of a typical BSN node. An especially difficult task is assessing, at the design stage, which classifier architectures are feasible on a given resource-constrained node, what computational resources are required to execute a given classifier, and what classification performance might be achieved by a particular classifier on a given set of resources. This paper describes techniques for quantifying and predicting the performance of ANN classifiers on wearable sensor nodes using scalable synthetic test data. Additionally, the paper shows a comparison of synthetic data with gait data collected using an inertial BSN node, and classification results of the gait data using a cerebellar model arithmetic computer (CMAC) architecture show excellent agreement with theoretical predictions.
Conference Paper
This paper proposes SOPSys, a decentralized, self-organizing peer-to-peer architecture. The overlay network is extremely scalable, being organized as a well balanced multi-way tree according to the trust value of each peer. The root is only responsible for maintaining and publishing a list of existing nodes and does not take part in the joining or routing process. The joining algorithm preserves the balance of the tree, guaranteeing a reduced join and discovery cost. The number of messages exchanged in these phases is logarithmic: O(log_k N), where k represents the branching factor of the overlay tree, and N represents the total number of nodes. The conducted experiments have proven that the joining algorithm keeps the overlay tree well balanced and thus offers high scalability.
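To get a feel for the O(log_k N) cost claimed above, a balanced k-ary tree of N nodes has about log_k N levels, and a join or discovery that walks one root-to-leaf path touches one node per level. The sketch below only evaluates that arithmetic; the function name and parameter values are illustrative, not from SOPSys.

```python
import math


def path_length(n, k):
    """Approximate number of levels (hence messages along one root-to-leaf
    path) in a balanced k-ary tree of n nodes: ceil(log_k n)."""
    return math.ceil(math.log(n, k))


# Even a million-node overlay stays shallow with a modest branching factor:
print(path_length(1_000_000, 16))  # 5
```

Doubling the branching factor k shrinks the path only by a constant factor (log_k N = log N / log k), which is why tree overlays trade higher per-node fan-out for shorter join and discovery paths.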
Conference Paper
To realize ubiquitous computing using the huge number of computing resources on the Internet, this paper presents an effective computing resource search mechanism based on peer-to-peer (P2P) ad-hoc networking that finds appropriate resource providers for each resource user. The main idea of the proposed mechanism is to classify the attributes of computing resources into two groups: a static attribute group and a dynamic attribute group. The attributes classified into the static attribute group are managed in a centralized manner, while the attributes in the dynamic attribute group are managed in a decentralized manner. In resource searching, this hybrid-type resource management mechanism can limit the candidates based on the static attributes first, and then the appropriate resources can be searched according to the dynamic attributes only within the area restricted by the static attributes. Therefore, the mechanism can prevent the network traffic explosion due to the flooding search in native P2P systems. Performance evaluation is carried out theoretically and experimentally. The results of the theoretical evaluation indicate that the proposed model is well suited for the resource search required by ubiquitous computing, compared with both a typical server-client model and a P2P-based model. An experimental prototype of the proposed mechanism shows its feasibility.
Conference Paper
The success of a P2P file-sharing network highly depends on the scalability and versatility of its search mechanism. Two particularly desirable search features are scope (the ability to find infrequent items) and support for partial-match queries (queries that contain typos or include a subset of keywords). While centralized-index architectures (such as Napster) can support both these features, existing decentralized architectures seem to support at most one: prevailing unstructured P2P protocols (such as Gnutella and FastTrack) deploy a "blind" search mechanism where the set of peers probed is unrelated to the query; thus they support partial-match queries but have limited scope. On the other extreme, the recently proposed distributed hash tables (DHTs) such as CAN and CHORD couple index location with the item's hash value, and thus have good scope but cannot effectively support partial-match queries. Another hurdle to DHT deployment is their tight control of the overlay structure and the information (part of the index) each peer maintains, which makes them more sensitive to failures and frequent joins and disconnects. We develop a new class of decentralized P2P architectures. Our design is based on unstructured architectures such as Gnutella and FastTrack, and retains many of their appealing properties, including support for partial-match queries and relative resilience to peer failures. Yet, we obtain orders of magnitude improvement in the efficiency of locating rare items. Our approach exploits associations inherent in human selections to steer the search process to peers that are more likely to have an answer to the query. We demonstrate the potential of associative search using models, analysis, and simulations.