Table 1: Summary of Distinguishing Features

Source publication
Article
Full-text available
Distributed systems have been studied for twenty years and are now coming into wider use as fast networks and powerful workstations become more readily available. In many respects a massively parallel computer resembles a network of workstations and it is tempting to port a distributed operating system to such a machine. However, there are signific...

Context in source publication

Context 1
... if hot-swappability is available, the system does not grow beyond the configuration established at boot time. Table 1 enumerates the differences in the order they were presented in the previous three sections. In Section 2 we looked at differences that can be observed by users and applications. ...

Similar publications

Article
Full-text available
Java is emerging as a popular platform for scientific and engineering simulations. Its success can be attributed to its portable nature, good performance, and inherent support for security, threads, objects, and visualisation (graphics). In this paper, we present a message passing system called MPJ, which is an implementation of MPI in pure Java. M...
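To make the MPJ entry concrete, here is a minimal two-process exchange written against the mpiJava 1.2 style API that pure-Java MPI implementations such as MPJ follow. This is an illustrative sketch, not code from the paper, and exact class and method names may vary between releases.

```java
import mpi.MPI;

// Minimal two-process exchange in the mpiJava 1.2 style API that MPJ
// implements. Illustrative sketch; exact names may differ between releases.
public class PingPong {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.Rank();
        int[] buf = new int[1];
        if (rank == 0) {
            buf[0] = 42;
            MPI.COMM_WORLD.Send(buf, 0, 1, MPI.INT, 1, 0); // to rank 1, tag 0
        } else if (rank == 1) {
            MPI.COMM_WORLD.Recv(buf, 0, 1, MPI.INT, 0, 0); // from rank 0
            System.out.println("rank 1 received " + buf[0]);
        }
        MPI.Finalize();
    }
}
```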
Article
Full-text available
Message passing interface (MPI) is the de facto standard for writing parallel scientific applications on distributed memory systems. Performance prediction of MPI programs on current or future parallel systems can help to find system bottlenecks or optimize programs. To effectively analyze and predict performance of a large and complex MPI program,...
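The abstract does not state which prediction model the authors use, so as a hedged illustration of the simplest kind of estimate such tools build on, the sketch below implements the common latency-bandwidth ("alpha-beta") point-to-point model and a log-tree broadcast estimate. The parameter values and the broadcast formula are assumptions for this sketch, not results from the paper.

```java
// A common first-order model used in MPI performance prediction:
// time = latency + bytes / bandwidth (the "alpha-beta" model).
// Parameter values below are illustrative assumptions, not measurements.
public class AlphaBeta {
    static final double ALPHA = 2.0e-6; // per-message latency in seconds (assumed)
    static final double BETA  = 1.0e-9; // seconds per byte, i.e. 1 GB/s (assumed)

    static double pointToPoint(long bytes) {
        return ALPHA + bytes * BETA;
    }

    // A binomial-tree broadcast to p ranks costs about ceil(log2 p) p2p steps.
    static double broadcast(long bytes, int ranks) {
        if (ranks <= 1) return 0.0;
        int steps = 32 - Integer.numberOfLeadingZeros(ranks - 1); // ceil(log2 ranks)
        return steps * pointToPoint(bytes);
    }

    public static void main(String[] args) {
        System.out.printf("1 MiB point-to-point: %.6f s%n", pointToPoint(1 << 20));
        System.out.printf("1 MiB broadcast, 64 ranks: %.6f s%n", broadcast(1 << 20, 64));
    }
}
```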
Conference Paper
Full-text available
The development of a parallel batch pattern training algorithm for a deep multilayered neural network architecture, and research on its parallelization efficiency on a many-core system, are presented in this paper. The model of a deep neural network and the batch pattern training algorithm are theoretically described. The algorithmic description of the parallel b...
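The abstract only describes the algorithm at a high level, so the following is a generic sketch of data-parallel batch training: each worker thread computes a partial gradient over its slice of the batch, the partials are reduced, and one weight update is applied per batch. The toy linear model and all names here are invented for illustration; this is not the authors' algorithm or code.

```java
import java.util.Arrays;

// Generic sketch of data-parallel batch training (not the authors' code):
// each worker computes a partial gradient over its slice of the batch,
// the partials are reduced, and a single weight update is applied.
public class BatchParallelStep {

    // Toy gradient of a linear model with squared error; in each pattern the
    // last element is the target value (an assumption made for this sketch).
    static double[] gradient(double[] w, double[][] patterns) {
        double[] g = new double[w.length];
        for (double[] p : patterns) {
            double err = -p[w.length];                     // -target
            for (int j = 0; j < w.length; j++) err += w[j] * p[j];
            for (int j = 0; j < w.length; j++) g[j] += err * p[j];
        }
        return g;
    }

    static void step(double[] w, double[][] batch, int workers, double lr)
            throws InterruptedException {
        double[][] partials = new double[workers][];
        Thread[] pool = new Thread[workers];
        int chunk = (batch.length + workers - 1) / workers; // ceil split
        for (int t = 0; t < workers; t++) {
            final int id = t;
            pool[t] = new Thread(() -> {
                int lo = Math.min(batch.length, id * chunk);
                int hi = Math.min(batch.length, lo + chunk);
                partials[id] = gradient(w, Arrays.copyOfRange(batch, lo, hi));
            });
            pool[t].start();
        }
        for (Thread th : pool) th.join();                   // barrier
        for (double[] g : partials)                         // reduce + update
            for (int j = 0; j < w.length; j++) w[j] -= lr * g[j] / batch.length;
    }

    public static void main(String[] args) throws InterruptedException {
        double[] w = {0.0, 0.0};
        double[][] batch = {{1, 0, 1}, {0, 1, 2}, {1, 1, 3}, {2, 0, 2}};
        step(w, batch, 2, 0.1);
        System.out.println(Arrays.toString(w));
    }
}
```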
Conference Paper
Full-text available
This paper presents an overview of PUMA (Performance-oriented, User-managed Messaging Architecture), a message passing kernel. Message passing in PUMA is based on portals, an opening in the address space of an application process. Once an application process has established a portal, other processes can write values into the portal using a...
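As a rough, heavily simplified illustration of the portal idea (a region of an application's address space that other processes can write into directly), here is a single-JVM toy. It is emphatically not the PUMA or Portals API; the registry, method names, and semantics are invented for this sketch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy, single-JVM illustration of the portal concept: a process opens a
// region (a "portal") and other processes deposit values into it directly,
// with no intermediate buffering. Not the PUMA/Portals API; all names here
// are invented.
public class ToyPortal {
    // Registry standing in for "an opening in the address space".
    static final Map<Integer, int[]> portals = new ConcurrentHashMap<>();

    static void open(int portalId, int slots) {          // receiver publishes a portal
        portals.put(portalId, new int[slots]);
    }

    static void put(int portalId, int slot, int value) { // sender writes directly
        portals.get(portalId)[slot] = value;
    }

    public static void main(String[] args) {
        open(7, 4);    // "application process establishes a portal"
        put(7, 0, 99); // another process writes a value into it
        System.out.println("slot 0 = " + portals.get(7)[0]);
    }
}
```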
Article
Full-text available
This paper considers the suitability of SPED, a synchronous parallel discrete event simulator, for the study of message passing networks. The simulation algorithm is described, and its potential performance is assessed showing that, under some simplifying assumptions, SPED might offer speedups directly proportional to the number of processors used...
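A synchronous parallel discrete event simulator of the kind described here advances a global clock in lockstep: every worker processes the current step's events, then all meet at a barrier before the clock moves on, which is why speedup can scale with the number of processors. The skeleton below shows only that control structure; SPED's actual data structures and event model are not given in the abstract, so everything here is assumed.

```java
import java.util.concurrent.CyclicBarrier;

// Skeleton of a synchronous (time-stepped) parallel discrete-event loop:
// workers process the current step's events, then synchronize at a barrier
// before the global clock advances. Generic illustration, not SPED's code.
public class SyncPdes {
    public static void main(String[] args) {
        final int workers = 4, steps = 10;
        final CyclicBarrier barrier = new CyclicBarrier(workers,
                () -> System.out.println("clock advances"));
        for (int w = 0; w < workers; w++) {
            new Thread(() -> {
                try {
                    for (int t = 0; t < steps; t++) {
                        // process all events with timestamp t owned by this worker
                        barrier.await(); // wait for all workers before advancing
                    }
                } catch (Exception e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }
    }
}
```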

Citations

... A noted drawback is the acyclic nature of the system's operation, that is, the impossibility of accessing the original data set twice at different stages of processing. This limits the applicability of MapReduce to machine learning, although no such tasks were identified for the algorithm [18]. ...
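The acyclicity complaint in this excerpt is that a MapReduce job cannot loop back over its input, so iterative computations (common in machine learning) must be expressed as a chain of independent passes, each re-reading the previous pass's stored output. The functional-style stand-in below illustrates that shape; it is not Hadoop or any real MapReduce runtime.

```java
import java.util.List;
import java.util.stream.Collectors;

// Sketch of MapReduce's acyclic dataflow: a pass cannot revisit its input,
// so iteration becomes a chain of passes over stored intermediate output.
// Functional stand-in for illustration only.
public class IterativeChain {
    static List<Double> onePass(List<Double> data) {
        // one complete "map + reduce" pass; no way to loop back inside it
        return data.stream().map(x -> x / 2).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Double> data = List.of(8.0, 4.0);
        for (int i = 0; i < 3; i++)
            data = onePass(data); // each iteration is a new job over stored output
        System.out.println(data); // [1.0, 0.5]
    }
}
```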
... The disadvantage of the MIMD architecture is that it is costly and sometimes complex to implement [14]. Distributed systems are designed with fault tolerance as one of their core objectives, whereas parallel systems provide no built-in support for fault tolerance [15]. Much work has been done on adding fault-tolerance support to parallel systems, but no such built-in support has yet been achieved. ...
... Users can view data structures, for instance an array list, as a distributed array list. Parallel systems provide a lower level of abstraction, but some frameworks, such as Charm++ and HPX, provide high-level abstractions [15]. ...
... Nodes in distributed systems serve and respond to many interactive users. Classically, there are one or more users per node in distributed systems, whereas in parallel systems several nodes are dedicated to a single user [15]. In distributed systems, applications exploit parallelism at the program level, where programs are distributed among the set of available nodes. ...
Article
Full-text available
In the age of emerging technologies, the amount of data is increasing very rapidly, and with this massive increase of data the level of computation is increasing. A computer executes instructions sequentially, but times have changed and innovation has advanced; we now manage gigantic data centers that perform billions of executions on a consistent schedule. In truth, if we dive deep into processor architecture and mechanisms, even a sequential machine works in parallel. Parallel computing is growing quickly as a substitute for distributed computing. The performance-to-functionality ratio of parallel systems is high, and their I/O usage is lower because of the ability to perform all operations simultaneously. On the other hand, the performance-to-functionality ratio of distributed systems is low, and their I/O usage is higher because of the inability to perform all operations simultaneously. In this paper, an overview of distributed and parallel computing is given, the basic concepts of the two models are discussed, and the pros and cons of each model are described. Through many aspects, we conclude that parallel systems are better than distributed systems.
... We found no clear distinction in the literature between the concepts of parallel and distributed computing, with several studies using these definitions interchangeably. Some studies find that differences can be observed in a number of areas, such as the underlying memory-sharing architecture, the connection interfaces between multiple processes, and higher-level abstractions of resources and management, functionality, location of services, and node architecture (Riesen et al., 1998). Exact distinctions are necessary to create specific operating systems for massively parallel systems and to fully exploit their advantages. ...
Article
This paper presents the Big Data phenomenon and introduces the importance of new processing techniques for handling Big Data and Geospatial Big Data. Recently, the volume and variety of available data have been evolving as never before, exceeding the capabilities of traditional algorithms and hardware/software environments for data management and computation (Manyika et al., 2011; IDC, 2012; Evans and Hagen, 2013). Hence, improved efficiency is required to exploit the information available in Geospatial Big Data. Consequently, geospatial analysis needs to be reformed to exploit the capabilities of current and emerging computing environments via new data management and processing concepts. To understand the evolution of these techniques, their differences, and their requirements, we need to go deep into Big Data solutions (including data, analytics, infrastructure, and computing background). Existing Big Data definitions are provided and summarized within a figure to offer a comprehensive perspective. After giving a summary of existing Geospatial Big Data definitions, I provide my own synthesized version. Geospatial Big Data analytics are introduced with a focus on image processing algorithms (local, focal, zonal, and global) and their parallelization aspects. © 2018 Hungarian Society of Surveying, Mapping and Remote Sensing. All rights reserved.
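Of the four algorithm classes the abstract lists, the "local" class parallelizes most directly: each output cell depends only on the corresponding input cell, so cells can be processed concurrently with no communication. The sketch below shows that property with a per-row parallel map; it is an illustration, not code from the paper.

```java
import java.util.stream.IntStream;

// Illustration of a "local" raster operation: each output cell depends only
// on the matching input cell, so rows can be processed in parallel with no
// communication between workers. Minimal sketch, not code from the paper.
public class LocalRasterOp {
    public static void main(String[] args) {
        double[][] raster = {{1, 2}, {3, 4}};
        double[][] out = new double[raster.length][raster[0].length];
        IntStream.range(0, raster.length).parallel().forEach(r -> {
            for (int c = 0; c < raster[r].length; c++)
                out[r][c] = raster[r][c] * 2.0; // per-cell ("local") operation
        });
        System.out.println(out[1][1]); // 8.0
    }
}
```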
... That is unfortunate, but not something we can easily change. Google's use of Linux and commodity hardware is a form of distributed, rather than parallel, computing (Riesen, Brightwell, and Maccabe 1998), but the general OS community lumps distributed and parallel computing together. ...
Article
Full-text available
Running untrusted user-level code inside an operating system kernel was studied in the 1990s but has not really caught on. We believe the time has come to resurrect kernel extensions for operating systems that run on highly parallel clusters and supercomputers. The reason is that the usage model for these machines differs significantly from a desktop machine or a server. In addition, vendors are starting to add features such as floating-point accelerators, multicore processors, and reconfigurable compute elements. An operating system for such machines must be adaptable to the requirements of specific applications and provide abstractions to access next-generation hardware features, without sacrificing performance or scalability.
Chapter
Sandia National Laboratories has been engaged in operating systems research for high-performance computing for more than two decades. The focus has always been extremely parallel systems and the most efficient systems software possible to complete the scientific work demanded by the laboratories’ mission. This chapter provides a chronological overview of the operating systems developed at Sandia and the University of New Mexico. Along the way we highlight why certain design decisions were made, what we have learned from our failures, and what has worked well. We summarize these lessons at the end of the chapter, but hope that the more detailed explanations in the text may be useful to future HPC OS designers.
Article
Full-text available
Due to the rapid growth of resource sharing, distributed systems have been developed that can be used to carry out these computations. Data mining (DM) provides powerful techniques for finding meaningful and useful information in very large amounts of data and has a wide range of real-world applications. However, traditional DM algorithms assume that the data is centrally collected, memory-resident, and static, and it is challenging to manage large-scale data and process it with very limited resources. For example, large amounts of data are quickly produced and stored at multiple locations, and it becomes increasingly expensive to centralize them in a single place. Moreover, traditional DM algorithms face problems and challenges such as memory limits, low processing ability, and inadequate disk space. To solve these problems, DM in distributed computing environments [also called distributed data mining (DDM)] has been emerging as a valuable alternative in many applications. In this study, a survey of state-of-the-art DDM techniques is provided, including distributed frequent itemset mining, distributed frequent sequence mining, distributed frequent graph mining, distributed clustering, and privacy preservation in distributed data mining. We finally summarize the opportunities for data mining tasks in distributed environments. WIREs Data Mining Knowl Discov 2017, 7:e1216. doi: 10.1002/widm.1216
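As one concrete instance of the distributed frequent itemset mining the survey covers, the sketch below follows the classic count-distribution idea: each site counts items over its local partition, and local counts are summed to identify globally frequent items. The data, the minimum-support value, and all names are invented for illustration; this is not the survey's code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the count-distribution idea behind distributed frequent itemset
// mining: each site counts over its local partition, then local counts are
// summed to find globally frequent items. Illustrative data and names only.
public class CountDistribution {
    static Map<String, Integer> localCounts(List<List<String>> partition) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> txn : partition)
            for (String item : txn) counts.merge(item, 1, Integer::sum);
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> siteA = List.of(List.of("milk", "bread"), List.of("milk"));
        List<List<String>> siteB = List.of(List.of("bread"));
        Map<String, Integer> global = localCounts(siteA);
        localCounts(siteB).forEach((k, v) -> global.merge(k, v, Integer::sum));
        int minSupport = 2; // assumed threshold for the example
        global.forEach((item, n) -> {
            if (n >= minSupport) System.out.println(item + " is frequent (" + n + ")");
        });
    }
}
```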
Chapter
Information has grown large enough that traditional algorithms need to be extended to scale. Since the data cannot fit in memory and is distributed across machines, the algorithms must also work with distributed storage. This chapter introduces some of the algorithms that work on such distributed storage and scale to massive data. These algorithms, called Big Data processing algorithms, comprise random walks, distributed hash tables, streaming, bulk synchronous processing (BSP), and MapReduce paradigms. Each of these algorithms is unique in its approach and fits certain problems. The goals of the algorithms are to reduce network communication in the distributed system, minimize data movement, bring down synchronization delays, and optimize computational resources. Processing data where it resides, peer-to-peer network communication, and computational and aggregation components for synchronization are some of the techniques these algorithms use to achieve those goals. MapReduce has been widely adopted for Big Data problems. This chapter demonstrates how MapReduce enables analytics to process massive data with ease, and provides example applications and a codebase for readers to get hands-on with the algorithms.
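The canonical MapReduce example such chapters demonstrate is word count: the map phase emits (word, 1) pairs and the reduce phase sums the counts per word. The sketch below expresses that dataflow with Java streams as a single-machine stand-in for a distributed runtime; it is illustrative, not the chapter's codebase.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

// Word count, the canonical MapReduce example: the map phase emits
// (word, 1) pairs and the reduce phase sums counts per key. Java streams
// stand in here for the distributed runtime.
public class WordCount {
    public static void main(String[] args) {
        String doc = "to be or not to be";
        Map<String, Long> counts = Arrays.stream(doc.split("\\s+")) // map phase
                .collect(Collectors.groupingBy(w -> w, Collectors.counting())); // reduce
        System.out.println(counts); // {not=1, be=2, or=1, to=2}
    }
}
```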