Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Parallel Processing

Articles 1 - 21 of 21

Full-Text Articles in Physical Sciences and Mathematics

A Parallel Direct Method For Finite Element Electromagnetic Computations Based On Domain Decomposition, Javad Moshfegh Nov 2019

Doctoral Dissertations

High performance parallel computing and direct (factorization-based) solution methods have been the two main trends in electromagnetic computations in recent years. When time-harmonic (frequency-domain) Maxwell's equations are directly discretized with the Finite Element Method (FEM) or other Partial Differential Equation (PDE) methods, the resulting linear system of equations is sparse and indefinite, and thus harder to factorize efficiently, serially or in parallel, than the dense linear systems produced by alternative methods such as integral equation solutions. State-of-the-art sparse matrix direct solvers such as MUMPS and PARDISO do not scale favorably and have low parallel efficiency and a high memory footprint. This work introduces a new …


High-Performance Computing Frameworks For Large-Scale Genome Assembly, Sayan Goswami Jun 2019

LSU Doctoral Dissertations

Genome sequencing technology has witnessed tremendous progress in terms of throughput and cost per base pair, resulting in an explosion in the size of data. Typical de Bruijn graph-based assembly tools demand a lot of processing power and memory and cannot assemble big datasets unless running on a scaled-up server with terabytes of RAM or a scaled-out cluster with several dozen nodes. In the first part of this work, we present a distributed next-generation sequence (NGS) assembler called Lazer, which achieves both scalability and memory efficiency by using partitioned de Bruijn graphs. By enhancing the memory-to-disk swapping and reducing the …


Parallel Algorithms For Time Dependent Density Functional Theory In Real-Space And Real-Time, James Kestyn Oct 2018

Doctoral Dissertations

Density functional theory (DFT) and time dependent density functional theory (TDDFT) have had great success solving for ground-state and excited-state properties of molecules, solids and nanostructures. However, these problems are particularly hard to scale. Both the size of the discrete system and the number of needed eigenstates increase with the number of electrons. A complete parallel framework for DFT and TDDFT calculations applied to molecules and nanostructures is presented in this dissertation. This includes the development of custom numerical algorithms for eigenvalue problems and linear systems. New functionality in the FEAST eigenvalue solver presents an additional level of …


A Parallel Spectral Method Approach To Model Plasma Instabilities, Kevin S. Scheiman Jan 2018

Browse all Theses and Dissertations

The study of solar-terrestrial plasma is concerned with processes in magnetospheric, ionospheric, and cosmic-ray physics involving different particle species and even particles of different energy within a single species. Instabilities in space plasmas and the Earth's atmosphere are driven by a multitude of free energy sources such as velocity shear, gravity, temperature anisotropy, electron and ion beams and currents. Microinstabilities such as Rayleigh-Taylor and Kelvin-Helmholtz instabilities are important for the understanding of plasma dynamics in the presence of a magnetic field and velocity shear. Modeling this turbulence is a computationally demanding process that requires large amounts of memory and suffers from excessively long runtimes. Previous …


Browser Based Visualization For Parameter Spaces Of Big Data Using Client-Server Model, Kurtis M. Glendenning Jan 2015

Browse all Theses and Dissertations

Visualization is an important task in data analytics, as it allows researchers to view abstract patterns within the data instead of reading through extensive raw data. The ability to interact with the visualizations is essential, since it lets researchers explore data intuitively and find meaning and patterns more efficiently. Interactivity, however, becomes progressively more difficult as the size of the dataset increases. This project begins by leveraging existing web-based data visualization technologies and extends their functionality through the use of parallel processing. This methodology utilizes state-of-the-art techniques, such as Node.js, to split the visualization rendering …


Energy Awareness And Scheduling In Mobile Devices And High End Computing, Sachin S. Pawaskaw Jul 2013

Student Work

In the context of the big picture, as energy demands rise due to growing economies and growing populations, there will be greater emphasis on sustainable supply, conservation, and efficient usage of this vital resource. Even at a smaller level, the need for minimizing energy consumption continues to be compelling in embedded, mobile, and server systems such as handheld devices, robots, spaceships, laptops, cluster servers, sensors, etc. This is due to the direct impact of constrained energy sources such as battery size and weight, as well as cooling expenses in cluster-based systems to reduce heat dissipation. Energy management therefore plays a …


Parallel And Distributed Performance Of A Depth Estimation Algorithm, Brian R. Calder Apr 2013

Center for Coastal and Ocean Mapping

Expansion of dataset sizes and increasing complexity of processing algorithms have led to consideration of parallel and distributed implementations. The rationale for distributing the computational load may be to thin-provision computational resources, to accelerate data processing rate, or to efficiently reuse already available but otherwise idle computational resources. Whatever the rationale, an efficient solution of this type brings with it questions of data distribution, job partitioning, reliability, and robustness. This paper addresses the first two of these questions in the context of a local cluster-computing environment. Using the CHRT depth estimator, it considers active and passive data distribution and their …


Efficient And Private Processing Of Analytical Queries In Scientific Datasets, Anand Kumar Jan 2013

USF Tampa Graduate Theses and Dissertations

Large amounts of data are generated by applications used in basic-science research and development. The size of the data introduces great challenges in storage, analysis, and privacy preservation. This dissertation proposes novel techniques to efficiently analyze the data and reduce storage space requirements through a data compression technique while preserving privacy and providing data security.

We present an efficient technique to compute an analytical query called spatial distance histogram (SDH) using spatiotemporal properties of the data. Special spatiotemporal properties present in the data are exploited to process SDH efficiently on the fly. General purpose graphics processing units (GPGPU or just …


Hydrographic Data Processing On A Robust, Network-Coupled Parallel Cluster, Rohit Venugopal, Brian R. Calder Feb 2012

Center for Coastal and Ocean Mapping

Increasing data volumes and adoption of computer-assisted hydrographic data processing algorithms necessitate higher data processing rates if gains in efficiency achieved in the last decade are to be maintained and enhanced. Recent advances in desktop computer architectures have made multi-core and multi-processor systems readily available, and some advances have been made in implementing multi-threaded versions of common hydrographic data processing algorithms. In many cases, however, although the algorithms might be ideal for parallel implementation (so-called ‘embarrassingly parallel’ tasks), limitations in memory, disc and network bandwidth within a single system can significantly limit the scalability of these solutions. …


A Scalable Architecture For Simplifying Full-Range Scientific Data Analysis, Wesley James Kendall Dec 2011

Doctoral Dissertations

According to a recent exascale roadmap report, analysis will be the limiting factor in gaining insight from exascale data. Analysis problems that must operate on the full range of a dataset are among the most difficult. Some of the primary challenges in this regard come from disk access, data management, and programmability of analysis tasks on exascale architectures. In this dissertation, I have provided an architectural approach that simplifies and scales data analysis on supercomputing architectures while masking parallel intricacies from the user. My architecture has three primary general contributions: 1) a novel design pattern and implementation for reading multi-file …


Fast Parallel Machine Learning Algorithms For Large Datasets Using Graphic Processing Unit, Qi Li Nov 2011

Theses and Dissertations

This dissertation deals with developing parallel processing algorithms for the Graphic Processing Unit (GPU) in order to solve machine learning problems for large datasets. In particular, it contributes to the development of fast GPU-based algorithms for calculating the distance (i.e. similarity, affinity, closeness) matrix. It also presents the algorithm and implementation of a fast parallel Support Vector Machine (SVM) using the GPU. These application tools are developed using Compute Unified Device Architecture (CUDA), which is a popular software framework for General Purpose Computing using GPU (GPGPU). Distance calculation is the core part of all machine learning algorithms because the closer the query …


A Dynamic Energy-Aware Model For Scheduling Computationally Intensive Bioinformatics Applications, Sachin Pawaskar, Hesham Ali Jul 2010

Computer Science Faculty Proceedings & Presentations

High Performance Computing (HPC) resources are housed in large datacenters, which consume huge amounts of energy and are quickly demanding attention from businesses as they result in high operating costs. On the other hand, HPC environments have been very useful to researchers in many emerging areas in life sciences such as Bioinformatics and Medical Informatics. In this paper, we provide a dynamic model for energy aware scheduling (EAS) in an HPC environment; we use a widely used bioinformatics tool named BLAT (BLAST-like alignment tool) running in an HPC environment as our case study. Our proposed EAS model incorporates two phases: an …


Dynamic Energy Aware Task Scheduling For Periodic Tasks Using Expected Execution Time Feedback, Sachin Pawaskar, Hesham Ali Feb 2008

Computer Science Faculty Proceedings & Presentations

Scheduling dependent tasks is one of the most challenging problems in parallel and distributed systems. It is known to be computationally intractable in its general form as well as in several restricted cases. An interesting application of scheduling is in the area of energy awareness for mobile battery-operated devices, where minimizing the energy utilized is the most important scheduling policy consideration. A number of heuristics have been developed for this consideration. In this paper, we study the scheduling problem for a particular battery model. In the proposed work, we show how to enhance a well-known approach of accounting for …


On The Tradeoff Between Speedup And Energy Consumption In High Performance Computing – A Bioinformatics Case Study, Sachin Pawaskar, Hesham Ali Jan 2008

Computer Science Faculty Proceedings & Presentations

High Performance Computing has been very useful to researchers in the Bioinformatics, Medical and related fields. The bioinformatics domain is rich in applications that require extracting useful information from very large and continuously growing sequence of databases. Automated techniques such as DNA sequencers, DNA microarrays & others are continually growing the dataset that is stored in large public databases such as GenBank and Protein DataBank. Most methods used for analyzing genetic/protein data have been found to be extremely computationally intensive, providing motivation for the use of powerful computers or systems with high throughput characteristics. In this paper, we provide a …


A Distributed Reconstruction Of Ekg Signals, Gabriel Cordova Jan 2008

Open Access Theses & Dissertations

In this Thesis the parallel computing methodology is applied to an algorithm used in the reconstruction of electrocardiographic (EKG) measurements. The reconstructions are being performed to obtain a better understanding of the source and behavior of the electrical activity that generates the EKG measurements. The contribution of this Thesis is to identify and eliminate inefficiencies present in the current reconstruction algorithm. Additionally, this Thesis reduces the computation times of the EKG reconstruction by applying distributed computing through the use of Remote Procedure Calls (RPC). Lastly, it provides an analysis of the speed-up and efficiency of the distribution implemented using parallel …


Dynamic Energy Aware Task Scheduling Using Run-Queue Peek, Sachin Pawaskar, Hesham Ali Feb 2007

Computer Science Faculty Proceedings & Presentations

Scheduling dependent tasks is one of the most challenging problems in parallel and distributed systems. It is known to be computationally intractable in its general form as well as in several restricted cases. An interesting application of scheduling is in the area of energy awareness for mobile battery-operated devices, where minimizing the energy utilized is the most important scheduling policy consideration. A number of heuristics have been developed for this consideration. In this paper, we study the scheduling problem for a particular battery model. In the proposed work, we show how to enhance a well-known approach of accounting for …


A Maximal Chain Approach For Scheduling Tasks In A Multiprocessor Systems, Sachin Pawaskar, Hesham Ali Nov 2004

Computer Science Faculty Proceedings & Presentations

Scheduling dependent tasks is one of the most challenging versions of the scheduling problem in parallel and distributed systems. It is known to be computationally intractable in its general form as well as several restricted cases. As a result, researchers have studied restricted forms of the problem by constraining either the task graph representing the parallel tasks or the computer model. Also, in an attempt to solve the problem in the general case, a number of heuristics have been developed. In this paper, we study the scheduling problem for a fixed number of processors m. In the proposed work, we …


Optimal Parallel Lexicographic Sorting Using A Fine-Grained Decomposition, Ramachandran Vaidyanathan, Carlos R.P. Hartmann, Pramod Varshney Jan 1991

Electrical Engineering and Computer Science - Technical Reports

Though non-comparison based sorting techniques like radix sorting can be done with less "work" than conventional comparison-based methods, they are not used for long keys. This is because even though parallel radix sorting algorithms process the keys in parallel, the symbols in the keys are processed sequentially. In this report, we give an optimal algorithm for lexicographic sorting that can be used to sort n m-bit keys on an EREW model in Θ(log n log m) time with Θ(mn) "work". This algorithm is not only as fast as any optimal non-comparison based algorithm, but can also be executed with less …


Optimal Parallel Solutions To The Neighbor Localization Problem And Integer Sorting: A Fine Grained Approach, Ramachandran Vaidyanathan, Carlos R.P. Hartmann, Pramod K. Varshney Oct 1990

Electrical Engineering and Computer Science - Technical Reports

In this report, a fine-grained decomposition approach is used to obtain an optimal parallel solution to the Neighbor Localization Problem, which in turn is used to sort n Θ(log n)-bit numbers optimally on an EREW model. The model of computation used is the EREW Reconfigurable PRAM (R-PRAM) that permits the use of “very small” processors. The main result of this report is a parallel EREW R-PRAM algorithm that sorts n Θ(log n)-bit numbers in Θ(log n) time with Θ(n log n) “work”. The proposed algorithm is asymptotically optimal in time and efficiency. If a weaker variant of the R-PRAM …


Parallel Implementation Of A Recursive Least Squares Neural Network Training Method On The Intel Ipsc/2, James Edward Steck, Bruce M. Mcmillin, K. Krishnamurthy, M. Reza Ashouri, Gary G. Leininger Jun 1990

Computer Science Faculty Research & Creative Works

An algorithm based on the Marquardt-Levenberg least-square optimization method has been shown by S. Kollias and D. Anastassiou (IEEE Trans. on Circuits Syst. vol.36, no.8, p.1092-101, Aug. 1989) to be a much more efficient training method than gradient descent, when applied to some small feedforward neural networks. Yet, for many applications, the increase in computational complexity of the method outweighs any gain in learning rate obtained over current training methods. However, the least-squares method can be more efficiently implemented on parallel architectures than standard methods. This is demonstrated by comparing computation times and learning rates for the least-squares method implemented …


Experimentation With Large-Grained Parallelism Using Local Area Networks, Ralph W. Wilkerson, Douglas E. Meyer Jan 1990

Computer Science Faculty Research & Creative Works

HIGHLAND, a distributed-memory parallel processing environment for heterogeneous local area networks, has been developed. Designed as both a teaching and a research tool, its purpose is to provide an effective mechanism by which a number of networked UNIX workstations, dissimilar in both vendor and performance, can be directly manipulated as a single, unified, multiprocessing system. Utilizing the MIT X-windows environment, HIGHLAND supports a highly interactive graphical interface through which a programmer can create, modify, and control complex systems of communicating processes