Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons


Articles 1 - 14 of 14

Full-Text Articles in Physical Sciences and Mathematics

Stable Sparse Orthogonal Factorization Of Ill-Conditioned Banded Matrices For Parallel Computing, Qian Huang Aug 2017

Stable Sparse Orthogonal Factorization Of Ill-Conditioned Banded Matrices For Parallel Computing, Qian Huang

Dissertations - ALL

Sequential and parallel algorithms based on the LU factorization or the QR factorization have been intensely studied and widely used in computations with large-scale ill-conditioned banded matrices. Chief concerns with existing methods include ill-conditioning, sparsity of the factor matrices, computational complexity, and scalability. In this dissertation, we study a sparse orthogonal factorization of a banded matrix motivated by parallel computing. Specifically, we develop a process to factorize a banded matrix as the product of a sparse orthogonal matrix and a sparse matrix that can be transformed to an upper triangular matrix by column permutations. We prove that the …
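
The general idea of an orthogonal factorization that preserves band structure can be illustrated with a small sketch. This is not the dissertation's algorithm; it is a minimal NumPy implementation of QR for a banded matrix via Givens plane rotations, showing how rotations confined to the band keep the orthogonal factor a product of sparse rotations. The function names and the bandwidth parameter `bw` are invented for illustration.

```python
import numpy as np

def givens(a, b):
    """Return (c, s) so that [[c, s], [-s, c]] @ [a, b]^T = [r, 0]^T."""
    if b == 0.0:
        return 1.0, 0.0
    r = np.hypot(a, b)
    return a / r, b / r

def banded_qr(A, bw):
    """QR of a banded matrix A (bw sub-diagonals) using Givens rotations.

    Each rotation touches only two rows inside the band, so Q is a
    product of sparse plane rotations and R stays banded (its upper
    bandwidth grows to roughly 2*bw).
    """
    A = A.astype(float).copy()
    n = A.shape[0]
    Q = np.eye(n)
    for j in range(n - 1):
        # Only rows j+1 .. j+bw can be nonzero below the diagonal in column j.
        for i in range(min(j + bw, n - 1), j, -1):
            c, s = givens(A[j, j], A[i, j])
            G = np.array([[c, s], [-s, c]])
            A[[j, i], :] = G @ A[[j, i], :]       # zero out A[i, j]
            Q[:, [j, i]] = Q[:, [j, i]] @ G.T     # accumulate Q
    return Q, A
```

For a tridiagonal matrix (`bw = 1`), each column needs a single rotation, and independent rotations in well-separated columns are exactly the kind of work a parallel scheme can overlap.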


Common Runtime Support For High Performance Languages, Geoffrey C. Fox Jan 1998

Common Runtime Support For High Performance Languages, Geoffrey C. Fox

Northeast Parallel Architecture Center

Widespread adoption of parallel computing depends on the availability of improved software environments. An essential component of these environments will be high-level languages. Several languages for exploiting data-parallelism (or task-parallelism) have been developed, or are under development. The stated goal of this project has been to provide a public domain infrastructure for runtime support of these high-level languages. The targeted languages include parallel versions of Fortran and C++, but our intention has been to provide uniform runtime support for many source languages.


A Load Balancing Technique For Multiphase Computations, Jerrell Watts, Marc Rieffel, Stephen Taylor Jan 1997

A Load Balancing Technique For Multiphase Computations, Jerrell Watts, Marc Rieffel, Stephen Taylor

College of Engineering and Computer Science - Former Departments, Centers, Institutes and Projects

Parallel computations composed of multiple, tightly interwoven phases of computation may require a different approach to dynamic load balancing than single-phase computations. This paper presents a load sharing method based on the view of load as a vector, rather than as a scalar. This approach allows multiphase computations to achieve higher efficiency on large-scale multicomputers than is possible with traditional techniques. Results are presented for two large-scale particle simulations running on 128 nodes of an Intel Paragon and on 256 processors of a Cray T3D, respectively.
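
As a toy illustration of the scalar-versus-vector distinction (not the paper's algorithm), the sketch below greedily assigns hypothetical two-phase task loads to four processors, once minimizing the summed load per processor and once minimizing the worst single-phase load per processor. All data and parameters are invented.

```python
import random

def greedy_assign(loads, nprocs, pick):
    """Assign tasks (each a per-phase load vector) to processors greedily.

    `pick(totals)` returns the processor chosen for the next task,
    given the current per-processor, per-phase totals.
    """
    nphases = len(loads[0])
    totals = [[0.0] * nphases for _ in range(nprocs)]
    # Place the largest tasks first (measured by total load across phases).
    for load in sorted(loads, key=sum, reverse=True):
        p = pick(totals)
        for ph in range(nphases):
            totals[p][ph] += load[ph]
    return totals

random.seed(0)
loads = [(random.random(), random.random()) for _ in range(40)]  # hypothetical tasks

# Scalar view: balance the summed load; vector view: balance the worst phase.
scalar = greedy_assign(loads, 4, lambda T: min(range(4), key=lambda p: sum(T[p])))
vector = greedy_assign(loads, 4, lambda T: min(range(4), key=lambda p: max(T[p])))

for name, T in (("scalar", scalar), ("vector", vector)):
    print(name, "per-phase maxima:",
          [round(max(T[p][ph] for p in range(4)), 2) for ph in range(2)])
```

Since each phase ends in a synchronization, the per-phase maxima (not the summed totals) bound execution time, which is the motivation for treating load as a vector.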


Java For Parallel Computing And As A General Language For Scientific And Engineering Simulation And Modeling, Geoffrey C. Fox, Wojtek Furmanski Jan 1997

Java For Parallel Computing And As A General Language For Scientific And Engineering Simulation And Modeling, Geoffrey C. Fox, Wojtek Furmanski

Northeast Parallel Architecture Center

We discuss the role of Java and Web technologies for general simulation. We classify the forms of concurrency typical in problems and separately analyze the role of Java in user interfaces, coarse-grain software integration, and detailed computational kernels. We conclude that Java could become a major language for computational science, as it potentially offers good performance, excellent user interfaces, and the advantages of object-oriented structure.


Basic Issues And Current Status Of Parallel Computing -- 1995, Geoffrey C. Fox Jan 1995

Basic Issues And Current Status Of Parallel Computing -- 1995, Geoffrey C. Fox

Northeast Parallel Architecture Center

The best enterprises have both a compelling need pulling them forward and an innovative technological solution pushing them on. In high-performance computing, we have the need for increased computational power in many applications and the inevitable long-term solution is massive parallelism. In the short term, the relation between pull and push may seem unclear as novel algorithms and software are needed to support parallel computing. However, eventually parallelism will be present in all computers -- including those in your children's video game, your personal computer or workstation, and the central supercomputer.


Parallel Incremental Graph Partitioning Using Linear Programming, Chao Wei Ou, Sanjay Ranka Jan 1994

Parallel Incremental Graph Partitioning Using Linear Programming, Chao Wei Ou, Sanjay Ranka

College of Engineering and Computer Science - Former Departments, Centers, Institutes and Projects

Partitioning graphs into equally large groups of nodes while minimizing the number of edges between different groups is an extremely important problem in parallel computing. For instance, efficiently parallelizing several scientific and engineering applications requires the partitioning of data or tasks among processors such that the computational load on each node is roughly the same, while communication is minimized. Obtaining exact solutions is computationally intractable, since graph partitioning is NP-complete. For a large class of irregular and adaptive data parallel applications (such as adaptive meshes), the computational structure changes from one phase to another in an incremental fashion. In incremental …
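
The two objectives named above, edge cut and balance, can be made concrete with a small evaluation sketch. The example graph and partition are hypothetical, and this is not the paper's linear-programming method; it only shows what any partitioner is scored on.

```python
from collections import Counter

def edge_cut(edges, part):
    """Number of edges whose endpoints lie in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])

def imbalance(part, nparts):
    """Ratio of the largest partition size to the ideal (even) size."""
    sizes = Counter(part.values())
    ideal = len(part) / nparts
    return max(sizes.values()) / ideal

# A 2x3 grid graph (hypothetical example), split into two groups.
edges = [(0, 1), (1, 2), (3, 4), (4, 5), (0, 3), (1, 4), (2, 5)]
part = {0: 0, 1: 0, 3: 0, 4: 0, 2: 1, 5: 1}   # four nodes vs two

print(edge_cut(edges, part))    # edges (1,2) and (4,5) are cut -> 2
print(imbalance(part, 2))       # 4 / 3, i.e. ~1.33, so unbalanced
```

An incremental repartitioner, as studied in the paper, would try to lower the imbalance while moving as few nodes as possible from the existing partition.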


A Communication System For High-Performance Distributed Computing, Salim Hariri, Jongbaek Park, Manish Parashar, Geoffrey C. Fox Jan 1994

A Communication System For High-Performance Distributed Computing, Salim Hariri, Jongbaek Park, Manish Parashar, Geoffrey C. Fox

Northeast Parallel Architecture Center

With current advances in computer and networking technology, coupled with the availability of software tools for parallel and distributed computing, there has been increased interest in high-performance distributed computing (HPDC). We envision that HPDC environments with supercomputing capabilities will be available in the near future. However, a number of issues have to be resolved before future network-based applications can fully exploit the potential of HPDC environments. In this paper, we present an architecture for a high-speed local area network and a communication system that provides HPDC applications with high bandwidth and low latency. We also characterize the message-passing primitives …


A Message Passing Interface For Parallel And Distributed Computing, Salim Hariri, Jongbaek Park, Fang-Kuo Yu, Manish Parashar Jan 1993

A Message Passing Interface For Parallel And Distributed Computing, Salim Hariri, Jongbaek Park, Fang-Kuo Yu, Manish Parashar

Northeast Parallel Architecture Center

The proliferation of high-performance workstations and the emergence of high-speed networks have attracted a lot of interest in parallel and distributed computing (PDC). We envision that PDC environments with supercomputing capabilities will be available in the near future. However, a number of hardware and software issues have to be resolved before the full potential of these PDC environments can be exploited. The research presented here has the following objectives: (1) to characterize the message-passing primitives used in parallel and distributed computing; (2) to develop a communication protocol that supports PDC; and (3) to develop architectural support for PDC …


Complete Exchange On A Wormhole Routed Mesh, Rajeev Thakur, Alok Choudhary, Geoffrey C. Fox Jan 1993

Complete Exchange On A Wormhole Routed Mesh, Rajeev Thakur, Alok Choudhary, Geoffrey C. Fox

Northeast Parallel Architecture Center

The complete exchange (or all-to-all personalized) communication pattern occurs frequently in many important parallel computing applications. We discuss several algorithms for performing complete exchange on a two-dimensional mesh-connected computer with wormhole routing. We propose algorithms for both power-of-two and non-power-of-two meshes, as well as an algorithm that works for any arbitrary mesh. We have developed analytical models to estimate the performance of the algorithms on the basis of system parameters. These models take into account the effects of link contention and other characteristics of the communication system. Performance results on the Intel Touchstone Delta are presented …
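
To show what the complete-exchange pattern demands of the network, the sketch below simulates the classical pairwise (XOR) schedule for a power-of-two machine size: in each of P-1 steps every node swaps one personalized message with a distinct partner. This is a generic schedule offered only to illustrate the pattern; the paper's mesh algorithms additionally account for link contention under wormhole routing.

```python
def complete_exchange(msgs):
    """Simulate all-to-all personalized exchange via XOR pairwise steps.

    msgs[i][j] is the message node i holds for node j.  In step s,
    node i exchanges exactly one message with partner i ^ s, so every
    node is busy in every step.  P must be a power of two.
    """
    P = len(msgs)
    assert P & (P - 1) == 0, "P must be a power of two"
    recv = [[None] * P for _ in range(P)]
    for i in range(P):
        recv[i][i] = msgs[i][i]          # local copy, no communication
    for s in range(1, P):                # P-1 pairwise exchange steps
        for i in range(P):
            partner = i ^ s
            recv[i][partner] = msgs[partner][i]
    return recv

# 4 nodes; the message from node i to node j is the string "i->j".
P = 4
msgs = [[f"{i}->{j}" for j in range(P)] for i in range(P)]
recv = complete_exchange(msgs)
print(recv[2])   # node 2 ends up holding ['0->2', '1->2', '2->2', '3->2']
```

On a mesh, several of these pairwise transfers can share a physical link in the same step, which is exactly the contention the paper's analytical models capture.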


Software Issues And Performance Of A Parallel Model For Stock Option Pricing, Kim Mills, Gang Cheng, Michael Vinson, Sanjay Ranka Jan 1992

Software Issues And Performance Of A Parallel Model For Stock Option Pricing, Kim Mills, Gang Cheng, Michael Vinson, Sanjay Ranka

Northeast Parallel Architecture Center

The finance industry is beginning to adopt parallel computing for numerical computation, and will soon be in a position to use parallel supercomputers. This paper examines software issues and the performance of a stock option pricing model running on the Connection Machine-2 and the DECmpp-12000. Pricing models incorporating stochastic volatility with American (early exercise) call features are computationally intensive and require substantial communication. Three parallel versions of a stock option pricing model were developed, varying in data distribution, load balancing, and communication. The performance of this set of increasingly refined models ranged from no improvement, to 10 times, to 100 times faster than …
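
A lattice pricer shows why such models parallelize naturally: every time level is an independent sweep over nodes, a natural fit for data-parallel machines like those named above. The sketch below is a constant-volatility Cox-Ross-Rubinstein binomial American call, far simpler than the paper's stochastic-volatility model; all parameter values are illustrative.

```python
import math

def american_call_crr(S0, K, r, sigma, T, n):
    """Price an American call on a Cox-Ross-Rubinstein binomial tree.

    Backward induction visits each time level once; within a level,
    every node is independent, which is the data-parallel sweep.
    """
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1 / u
    p = (math.exp(r * dt) - d) / (u - d)     # risk-neutral up probability
    disc = math.exp(-r * dt)
    # Option payoffs at maturity for all n+1 terminal nodes.
    vals = [max(S0 * u**j * d**(n - j) - K, 0.0) for j in range(n + 1)]
    # Backward induction with an early-exercise check at every node.
    for step in range(n - 1, -1, -1):
        vals = [
            max(disc * (p * vals[j + 1] + (1 - p) * vals[j]),   # continue
                S0 * u**j * d**(step - j) - K)                  # exercise now
            for j in range(step + 1)
        ]
    return vals[0]

print(round(american_call_crr(100, 100, 0.05, 0.2, 1.0, 200), 2))
```

The communication the abstract mentions arises because each interior node needs its two children from the previous sweep, so a distributed tree must exchange boundary values between processors at every level.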


A Practical Hierarchical Model Of Parallel Computation Ii: Binary Tree And Fft Algorithms, Todd Heywood, Sanjay Ranka Oct 1991

A Practical Hierarchical Model Of Parallel Computation Ii: Binary Tree And Fft Algorithms, Todd Heywood, Sanjay Ranka

Electrical Engineering and Computer Science - Technical Reports

A companion paper has introduced the Hierarchical PRAM (H-PRAM) model of parallel computation, which achieves a good balance between simplicity of use and fidelity to realistic parallel computers. In this paper, we demonstrate the use of the model by designing and analyzing various algorithms for computing on the complete binary tree and the FFT/butterfly graph. By concentrating on two problems, we are able to demonstrate the results of different combinations of organizational strategies and different types of sub-models of the H-PRAM. The philosophy in algorithm design is to maximize the number of processors P that are efficiently usable with respect to …
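
For readers unfamiliar with the two example computations, a radix-2 FFT makes their structure concrete: the recursion tree is a complete binary tree, and the data-flow graph is the butterfly network. This plain-Python sketch is illustrative only and is unrelated to the H-PRAM cost analysis.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT (length must be a power of two).

    The even/odd split forms a complete binary recursion tree; the
    combine loop below is one column of butterfly operations.
    """
    n = len(x)
    if n == 1:
        return x[:]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n) * odd[k]   # twiddle factor
        out[k] = even[k] + w                             # butterfly top
        out[k + n // 2] = even[k] - w                    # butterfly bottom
    return out

print(fft([1, 2, 3, 4]))   # ≈ [10, -2+2j, -2, -2-2j], the DFT of [1, 2, 3, 4]
```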


An Evolutionary Approach To Load Balancing Parallel Computations, N. Mansouri, Geoffrey C. Fox Apr 1991

An Evolutionary Approach To Load Balancing Parallel Computations, N. Mansouri, Geoffrey C. Fox

Electrical Engineering and Computer Science - Technical Reports

We present a new approach to balancing the workload in a multicomputer when the problem is decomposed into subproblems mapped to the processors. It is based on a hybrid genetic algorithm. A number of design choices for genetic algorithms are combined in order to ameliorate the problem of premature convergence that is often encountered in the implementation of classical genetic algorithms. The algorithm is hybridized by including a hill climbing procedure which significantly improves the efficiency of the evolution. Moreover, it makes use of problem-specific information to avoid some computational costs and to reinforce favorable aspects of the genetic …
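
The hybrid scheme can be sketched generically: a genetic search over task-to-processor assignments whose offspring are improved by hill climbing before entering the population. The code below is a minimal illustration with invented parameters (population size, mutation rate, fitness as makespan), not the authors' algorithm.

```python
import random

def makespan(assign, task_loads, nprocs):
    """Maximum processor load under a task->processor assignment."""
    totals = [0.0] * nprocs
    for t, p in enumerate(assign):
        totals[p] += task_loads[t]
    return max(totals)

def hill_climb(assign, task_loads, nprocs, iters=50):
    """Local improvement: move one task to another processor if it helps."""
    best = makespan(assign, task_loads, nprocs)
    for _ in range(iters):
        t = random.randrange(len(assign))
        old = assign[t]
        assign[t] = random.randrange(nprocs)
        cost = makespan(assign, task_loads, nprocs)
        if cost < best:
            best = cost          # keep the improving move
        else:
            assign[t] = old      # revert
    return assign

def hybrid_ga(task_loads, nprocs, pop=20, gens=30):
    """Genetic search over assignments, hybridized with hill climbing."""
    n = len(task_loads)
    popn = [[random.randrange(nprocs) for _ in range(n)] for _ in range(pop)]
    for _ in range(gens):
        popn.sort(key=lambda a: makespan(a, task_loads, nprocs))
        survivors = popn[: pop // 2]                   # elitist selection
        children = []
        for _ in range(pop - len(survivors)):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, n)               # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < 0.3:                  # mutation
                child[random.randrange(n)] = random.randrange(nprocs)
            children.append(hill_climb(child, task_loads, nprocs))
        popn = survivors + children
    return min(popn, key=lambda a: makespan(a, task_loads, nprocs))

random.seed(1)
loads = [random.random() for _ in range(30)]           # hypothetical task loads
best = hybrid_ga(loads, 4)
print("makespan:", round(makespan(best, loads, 4), 3),
      "ideal:", round(sum(loads) / 4, 3))
```

The hill-climbing pass is what makes the scheme "hybrid": crossover explores the assignment space globally while local search repairs each offspring, which is one common remedy for premature convergence.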


A Practical Hierarchical Model Of Parallel Computation: The Model, Todd Heywood, Sanjay Ranka Feb 1991

A Practical Hierarchical Model Of Parallel Computation: The Model, Todd Heywood, Sanjay Ranka

Electrical Engineering and Computer Science - Technical Reports

We introduce a model of parallel computation that retains the ideal properties of the PRAM by using it as a sub-model, while simultaneously being more reflective of realistic parallel architectures by accounting for and providing abstract control over communication and synchronization costs. The Hierarchical PRAM (H-PRAM) model controls conceptual complexity in the face of asynchrony in two ways. First, it provides the simplifying assumption of synchronization for the design of individual algorithms while allowing those algorithms to run asynchronously with respect to each other, organizing this "control asynchrony" via an implicit hierarchy relation. Second, it allows the restriction of "communication asynchrony" in …


Mapping Finite Element Graphs On Hypercubes, Yeh-Ching Chung, Sanjay Ranka Jan 1990

Mapping Finite Element Graphs On Hypercubes, Yeh-Ching Chung, Sanjay Ranka

Electrical Engineering and Computer Science - Technical Reports

In parallel computing, it is important to map a parallel program onto a parallel computer such that its total execution time is minimized. In general, a parallel program and a parallel computer can be represented by a task graph (TG) and a processor graph (PG), respectively. For a TG, nodes represent tasks of a parallel program and edges denote the data communication needed between tasks. The weights associated with nodes and edges represent the computational load and communication cost, respectively. For a PG, nodes and edges denote processors and communication channels, respectively. By using the graph …
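
The cost model implied by these weighted graphs can be sketched directly: sum each processor's computational load, and weight each cut edge by the distance between the processors hosting its endpoints. The tiny example graph below is hypothetical, and the specific cost combination (max load plus total weighted communication) is one common choice, not necessarily the paper's.

```python
def mapping_cost(tg_nodes, tg_edges, mapping, dist):
    """Cost of mapping a task graph (TG) onto a processor graph (PG).

    tg_nodes: {task: computational load}       (TG node weights)
    tg_edges: {(t1, t2): communication volume} (TG edge weights)
    mapping:  {task: processor}
    dist:     {(p1, p2): hop distance in the PG}
    """
    # Computation: the most heavily loaded processor bounds execution time.
    comp = {}
    for t, load in tg_nodes.items():
        p = mapping[t]
        comp[p] = comp.get(p, 0) + load
    # Communication: volume weighted by distance between host processors.
    comm = sum(vol * dist[(mapping[a], mapping[b])]
               for (a, b), vol in tg_edges.items()
               if mapping[a] != mapping[b])
    return max(comp.values()) + comm

# Tiny example: a 4-task chain mapped onto 2 processors joined by one link.
nodes = {0: 2, 1: 1, 2: 2, 3: 1}
edges = {(0, 1): 3, (1, 2): 1, (2, 3): 3}
dist = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
print(mapping_cost(nodes, edges, {0: 0, 1: 0, 2: 1, 3: 1}, dist))
# max processor load 3 + weighted cut 1 -> 4
```

Cutting the chain at its lightest edge, as above, keeps both the load balanced and the communication term small; cutting at a heavy edge would raise the cost even with the same balance.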