Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 21 of 21
Full-Text Articles in Physical Sciences and Mathematics
A Parallel Direct Method For Finite Element Electromagnetic Computations Based On Domain Decomposition, Javad Moshfegh
Doctoral Dissertations
High performance parallel computing and direct (factorization-based) solution methods have been the two main trends in electromagnetic computations in recent years. When the time-harmonic (frequency-domain) Maxwell's equations are directly discretized with the Finite Element Method (FEM) or other Partial Differential Equation (PDE) methods, the resulting linear system of equations is sparse and indefinite, and thus harder to factorize efficiently, serially or in parallel, than the dense linear systems that result from alternative methods such as integral equation solutions. State-of-the-art sparse matrix direct solvers such as MUMPS and PARDISO don't scale favorably and have low parallel efficiency and a high memory footprint. This work introduces a new …
High-Performance Computing Frameworks For Large-Scale Genome Assembly, Sayan Goswami
LSU Doctoral Dissertations
Genome sequencing technology has witnessed tremendous progress in terms of throughput and cost per base pair, resulting in an explosion in the size of data. Typical de Bruijn graph-based assembly tools demand a lot of processing power and memory and cannot assemble big datasets unless running on a scaled-up server with terabytes of RAM or a scaled-out cluster with several dozen nodes. In the first part of this work, we present a distributed next-generation sequence (NGS) assembler called Lazer that achieves both scalability and memory efficiency by using partitioned de Bruijn graphs. By enhancing the memory-to-disk swapping and reducing the …
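As a rough illustration of what a partitioned de Bruijn graph looks like, the sketch below builds the graph from reads and assigns each node to a partition. It is not Lazer's implementation: the function names, the checksum-based partitioning rule, and the in-memory dictionaries are all assumptions made for this example.

```python
from collections import defaultdict


def kmers(read, k):
    """Return every length-k substring of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def partition_of(node, num_partitions):
    """Hypothetical partitioning rule: a deterministic character checksum.

    Python's built-in hash() is randomized per process for strings,
    so a simple checksum is used to keep the example reproducible.
    """
    return sum(ord(c) for c in node) % num_partitions


def build_partitioned_de_bruijn(reads, k, num_partitions):
    """Build a de Bruijn graph whose nodes are (k-1)-mers.

    Each k-mer contributes one edge from its prefix to its suffix.
    The edge is stored in the partition that owns the prefix node.
    """
    partitions = [defaultdict(set) for _ in range(num_partitions)]
    for read in reads:
        for kmer in kmers(read, k):
            prefix, suffix = kmer[:-1], kmer[1:]
            partitions[partition_of(prefix, num_partitions)][prefix].add(suffix)
    return partitions


if __name__ == "__main__":
    for index, part in enumerate(build_partitioned_de_bruijn(["ACGTAC"], 3, 2)):
        print(index, dict(part))
```

In a distributed setting each partition would live on a different node or on disk, so only the partitions being processed need to be resident in memory.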
Parallel Algorithms For Time Dependent Density Functional Theory In Real-Space And Real-Time, James Kestyn
Doctoral Dissertations
Density functional theory (DFT) and time dependent density functional theory (TDDFT) have had great success solving for ground-state and excited-state properties of molecules, solids and nanostructures. However, these problems are particularly hard to scale. Both the size of the discrete system and the number of needed eigenstates increase with the number of electrons. A complete parallel framework for DFT and TDDFT calculations applied to molecules and nanostructures is presented in this dissertation. This includes the development of custom numerical algorithms for eigenvalue problems and linear systems. New functionality in the FEAST eigenvalue solver presents an additional level of …
A Parallel Spectral Method Approach To Model Plasma Instabilities, Kevin S. Scheiman
Browse all Theses and Dissertations
The study of solar-terrestrial plasma is concerned with processes in magnetospheric, ionospheric, and cosmic-ray physics involving different particle species and even particles of different energy within a single species. Instabilities in space plasmas and the earth's atmosphere are driven by a multitude of free energy sources such as velocity shear, gravity, temperature anisotropy, electron and ion beams, and currents. Microinstabilities such as Rayleigh-Taylor and Kelvin-Helmholtz instabilities are important for the understanding of plasma dynamics in the presence of a magnetic field and velocity shear. Modeling this turbulence is a computationally demanding process that requires large amounts of memory and suffers from excessively long runtimes. Previous …
Browser Based Visualization For Parameter Spaces Of Big Data Using Client-Server Model, Kurtis M. Glendenning
Browse all Theses and Dissertations
Visualization is an important task in data analytics, as it allows researchers to view abstract patterns within the data instead of reading through extensive raw data. The ability to interact with the visualizations is essential, since it lets researchers explore data intuitively and find meaning and patterns more efficiently. Interactivity, however, becomes progressively more difficult as the size of the dataset increases. This project begins by leveraging existing web-based data visualization technologies and extends their functionality through the use of parallel processing. This methodology utilizes state-of-the-art techniques, such as Node.js, to split the visualization rendering …
Energy Awareness And Scheduling In Mobile Devices And High End Computing, Sachin S. Pawaskar
Student Work
In the context of the big picture as energy demands rise due to growing economies and growing populations, there will be greater emphasis on sustainable supply, conservation, and efficient usage of this vital resource. Even at a smaller level, the need for minimizing energy consumption continues to be compelling in embedded, mobile, and server systems such as handheld devices, robots, spaceships, laptops, cluster servers, sensors, etc. This is due to the direct impact of constrained energy sources such as battery size and weight, as well as cooling expenses in cluster-based systems to reduce heat dissipation. Energy management therefore plays a …
Parallel And Distributed Performance Of A Depth Estimation Algorithm, Brian R. Calder
Center for Coastal and Ocean Mapping
Expansion of dataset sizes and increasing complexity of processing algorithms have led to consideration of parallel and distributed implementations. The rationale for distributing the computational load may be to thin-provision computational resources, to accelerate data processing rate, or to efficiently reuse already available but otherwise idle computational resources. Whatever the rationale, an efficient solution of this type brings with it questions of data distribution, job partitioning, reliability, and robustness. This paper addresses the first two of these questions in the context of a local cluster-computing environment. Using the CHRT depth estimator, it considers active and passive data distribution and their …
Efficient And Private Processing Of Analytical Queries In Scientific Datasets, Anand Kumar
USF Tampa Graduate Theses and Dissertations
Large amounts of data are generated by applications used in basic-science research and development. The size of the data introduces great challenges in storage, analysis and preserving privacy. This dissertation proposes novel techniques to efficiently analyze the data and reduce storage space requirements through a data compression technique while preserving privacy and providing data security.
We present an efficient technique to compute an analytical query called spatial distance histogram (SDH) using spatiotemporal properties of the data. Special spatiotemporal properties present in the data are exploited to process SDH efficiently on the fly. General purpose graphics processing units (GPGPU or just …
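For readers unfamiliar with the query, a spatial distance histogram counts how many point pairs fall into each distance bucket. The sketch below is the plain quadratic baseline, not the dissertation's technique: it ignores spatiotemporal properties entirely, and the function name, parameters, and the choice to clamp long distances into the last bucket are assumptions made for illustration.

```python
import math


def spatial_distance_histogram(points, bucket_width, num_buckets):
    """Count point pairs per distance bucket by checking every pair.

    Bucket b holds distances in [b * bucket_width, (b + 1) * bucket_width).
    Distances beyond the last bucket are clamped into it (an assumption).
    """
    histogram = [0] * num_buckets
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            distance = math.dist(points[i], points[j])
            bucket = min(int(distance // bucket_width), num_buckets - 1)
            histogram[bucket] += 1
    return histogram


if __name__ == "__main__":
    sample = [(0, 0), (3, 4), (0, 1)]
    print(spatial_distance_histogram(sample, bucket_width=2, num_buckets=3))
```

Every pair is independent of every other pair, which is why this kind of query is a natural candidate for GPU execution.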
Hydrographic Data Processing On A Robust, Network-Coupled Parallel Cluster, Rohit Venugopal, Brian R. Calder
Center for Coastal and Ocean Mapping
Increasing data volumes and adoption of computer-assisted hydrographic data processing algorithms necessitate higher data processing rates if gains in efficiency achieved in the last decade are to be maintained and enhanced. Recent advances in desktop computer architectures have made multi-core and multi-processor systems readily available, and some advances have been made in implementing multi-threaded versions of common hydrographic data processing algorithms. In many cases, however, although the algorithms might be ideal for parallel implementation (so called ‘embarrassingly parallel’ tasks), limitations in memory, disc and network bandwidth within a single system can significantly limit the scalability of these solutions. …
A Scalable Architecture For Simplifying Full-Range Scientific Data Analysis, Wesley James Kendall
Doctoral Dissertations
According to a recent exascale roadmap report, analysis will be the limiting factor in gaining insight from exascale data. Analysis problems that must operate on the full range of a dataset are among the most difficult. Some of the primary challenges in this regard come from disk access, data management, and programmability of analysis tasks on exascale architectures. In this dissertation, I have provided an architectural approach that simplifies and scales data analysis on supercomputing architectures while masking parallel intricacies to the user. My architecture has three primary general contributions: 1) a novel design pattern and implementation for reading multi-file …
Fast Parallel Machine Learning Algorithms For Large Datasets Using Graphic Processing Unit, Qi Li
Theses and Dissertations
This dissertation deals with developing parallel processing algorithms for the Graphic Processing Unit (GPU) in order to solve machine learning problems for large datasets. In particular, it contributes to the development of fast GPU-based algorithms for calculating the distance (i.e. similarity, affinity, closeness) matrix. It also presents the algorithm and implementation of a fast parallel Support Vector Machine (SVM) using GPU. These application tools are developed using Compute Unified Device Architecture (CUDA), which is a popular software framework for General Purpose Computing using GPU (GPGPU). Distance calculation is the core part of all machine learning algorithms because the closer the query …
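To make the distance matrix concrete, here is a minimal sequential sketch using Euclidean distance. It is a CPU reference only and does not reproduce the dissertation's CUDA algorithms; the function names and the choice of Euclidean distance are assumptions for this example.

```python
import math


def distance_matrix(queries, references):
    """Return matrix[i][j] = Euclidean distance from queries[i] to references[j].

    Every entry is computed independently of the others, which is the
    property a GPU implementation can exploit (one thread per entry).
    """
    return [[math.dist(q, r) for r in references] for q in queries]


def nearest_reference(queries, references):
    """For each query, return the index of its closest reference point."""
    matrix = distance_matrix(queries, references)
    return [min(range(len(row)), key=row.__getitem__) for row in matrix]


if __name__ == "__main__":
    print(distance_matrix([(0, 0)], [(3, 4), (0, 1)]))
    print(nearest_reference([(0, 0), (3, 3)], [(3, 4), (0, 1)]))
```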
A Dynamic Energy-Aware Model For Scheduling Computationally Intensive Bioinformatics Applications, Sachin Pawaskar, Hesham Ali
Computer Science Faculty Proceedings & Presentations
High Performance Computing (HPC) resources are housed in large datacenters, which consume huge amounts of energy and are quickly demanding attention from businesses as they result in high operating costs. On the other hand, HPC environments have been very useful to researchers in many emerging areas in life sciences such as Bioinformatics and Medical Informatics. In this paper, we provide a dynamic model for energy aware scheduling (EAS) in an HPC environment; we use a widely used bioinformatics tool named BLAT (BLAST-like alignment tool) running in an HPC environment as our case study. Our proposed EAS model incorporates two phases: an …
Dynamic Energy Aware Task Scheduling For Periodic Tasks Using Expected Execution Time Feedback, Sachin Pawaskar, Hesham Ali
Computer Science Faculty Proceedings & Presentations
Scheduling dependent tasks is one of the most challenging problems in parallel and distributed systems. It is known to be computationally intractable in its general form as well as in several restricted cases. An interesting application of scheduling is in the area of energy awareness for mobile battery-operated devices, where minimizing the energy utilized is the most important scheduling policy consideration. A number of heuristics have been developed for this consideration. In this paper, we study the scheduling problem for a particular battery model. In the proposed work, we show how to enhance a well-known approach of accounting for …
On The Tradeoff Between Speedup And Energy Consumption In High Performance Computing – A Bioinformatics Case Study, Sachin Pawaskar, Hesham Ali
Computer Science Faculty Proceedings & Presentations
High Performance Computing has been very useful to researchers in the Bioinformatics, Medical and related fields. The bioinformatics domain is rich in applications that require extracting useful information from very large and continuously growing sequence databases. Automated techniques such as DNA sequencers, DNA microarrays and others are continually growing the datasets stored in large public databases such as GenBank and Protein DataBank. Most methods used for analyzing genetic/protein data have been found to be extremely computationally intensive, providing motivation for the use of powerful computers or systems with high throughput characteristics. In this paper, we provide a …
A Distributed Reconstruction Of Ekg Signals, Gabriel Cordova
Open Access Theses & Dissertations
In this Thesis the parallel computing methodology is applied to an algorithm used in the reconstruction of electrocardiographic (EKG) measurements. The reconstructions are being performed to obtain a better understanding of the source and behavior of the electrical activity that generates the EKG measurements. The contribution of this Thesis is to identify and eliminate inefficiencies present in the current reconstruction algorithm. Additionally, this Thesis reduces the computation times of the EKG reconstruction by applying distributed computing through the use of Remote Procedure Calls (RPC). Lastly, it provides an analysis of the speed-up and efficiency of the distribution implemented using parallel …
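The speed-up and efficiency mentioned in this abstract are conventionally defined as S = T1 / Tp and E = S / p, where T1 is the serial run time, Tp the run time on p workers, and p the worker count. The sketch below applies those standard definitions; the function names and the sample timings are hypothetical and are not taken from the thesis.

```python
def speedup(serial_time, parallel_time):
    """Speed-up S = T1 / Tp: how many times faster the parallel run is."""
    if serial_time <= 0 or parallel_time <= 0:
        raise ValueError("run times must be positive")
    return serial_time / parallel_time


def efficiency(serial_time, parallel_time, workers):
    """Efficiency E = S / p: the fraction of ideal linear speed-up achieved."""
    if workers <= 0:
        raise ValueError("worker count must be positive")
    return speedup(serial_time, parallel_time) / workers


if __name__ == "__main__":
    # Hypothetical timings: 100 s serially, 25 s when distributed over 8 workers.
    print(speedup(100.0, 25.0))
    print(efficiency(100.0, 25.0, 8))
```

An efficiency well below 1 usually points to communication overhead or load imbalance, which matters particularly for RPC-based distribution.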
Dynamic Energy Aware Task Scheduling Using Run-Queue Peek, Sachin Pawaskar, Hesham Ali
Computer Science Faculty Proceedings & Presentations
Scheduling dependent tasks is one of the most challenging problems in parallel and distributed systems. It is known to be computationally intractable in its general form as well as in several restricted cases. An interesting application of scheduling is in the area of energy awareness for mobile battery-operated devices, where minimizing the energy utilized is the most important scheduling policy consideration. A number of heuristics have been developed for this consideration. In this paper, we study the scheduling problem for a particular battery model. In the proposed work, we show how to enhance a well-known approach of accounting for …
A Maximal Chain Approach For Scheduling Tasks In A Multiprocessor Systems, Sachin Pawaskar, Hesham Ali
Computer Science Faculty Proceedings & Presentations
Scheduling dependent tasks is one of the most challenging versions of the scheduling problem in parallel and distributed systems. It is known to be computationally intractable in its general form as well as several restricted cases. As a result, researchers have studied restricted forms of the problem by constraining either the task graph representing the parallel tasks or the computer model. Also, in an attempt to solve the problem in the general case, a number of heuristics have been developed. In this paper, we study the scheduling problem for a fixed number of processors m. In the proposed work, we …
Optimal Parallel Lexicographic Sorting Using A Fine-Grained Decomposition, Ramachandran Vaidyanathan, Carlos R.P. Hartmann, Pramod Varshney
Electrical Engineering and Computer Science - Technical Reports
Though non-comparison based sorting techniques like radix sorting can be done with less "work" than conventional comparison-based methods, they are not used for long keys. This is because even though parallel radix sorting algorithms process the keys in parallel, the symbols in the keys are processed sequentially. In this report, we give an optimal algorithm for lexicographic sorting that can be used to sort n m-bit keys on an EREW model in Θ(log n log m) time with Θ(mn) "work". This algorithm is not only as fast as any optimal non-comparison based algorithm, but can also be executed with less …
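The sequential dependence on symbol position that the abstract describes is easy to see in a textbook least-significant-digit radix sort, sketched below. This is the conventional baseline, not the report's fine-grained algorithm; the function name and the fixed-length string keys are assumptions for this example.

```python
def lsd_radix_sort(keys, key_length):
    """Sort equal-length string keys with least-significant-digit radix sort.

    The outer loop visits symbol positions one at a time, from last to first.
    Pass p cannot start until pass p + 1 has finished, so the number of
    sequential passes grows with the key length.
    """
    for position in range(key_length - 1, -1, -1):
        buckets = {}
        for key in keys:  # stable: keys keep their relative order within a bucket
            buckets.setdefault(key[position], []).append(key)
        # Ordering the few distinct symbols stands in for a fixed alphabet scan.
        keys = [key for symbol in sorted(buckets) for key in buckets[symbol]]
    return keys


if __name__ == "__main__":
    print(lsd_radix_sort(["cab", "abc", "bca", "aab"], 3))
```

Within one pass the keys can be distributed in parallel, but the passes themselves still run one after another, which is the bottleneck for long keys.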
Optimal Parallel Solutions To The Neighbor Localization Problem And Integer Sorting: A Fine Grained Approach, Ramachandran Vaidyanathan, Carlos R.P. Hartmann, Pramod K. Varshney
Electrical Engineering and Computer Science - Technical Reports
In this report, a fine-grained decomposition approach is used to obtain an optimal parallel solution to the Neighbor Localization Problem, which in turn is used to sort n Θ(log n)-bit numbers optimally on an EREW model. The model of computation used is the EREW Reconfigurable PRAM (R-PRAM) that permits the use of “very small” processors. The main result of this report is a parallel EREW R-PRAM algorithm that sorts n Θ(log n)-bit numbers in Θ(log n) time with Θ(n log n) “work”. The proposed algorithm is asymptotically optimal in time and efficiency. If a weaker variant of the R-PRAM …
Parallel Implementation Of A Recursive Least Squares Neural Network Training Method On The Intel iPSC/2, James Edward Steck, Bruce M. McMillin, K. Krishnamurthy, M. Reza Ashouri, Gary G. Leininger
Computer Science Faculty Research & Creative Works
An algorithm based on the Marquardt-Levenberg least-square optimization method has been shown by S. Kollias and D. Anastassiou (IEEE Trans. on Circuits Syst. vol.36, no.8, p.1092-101, Aug. 1989) to be a much more efficient training method than gradient descent, when applied to some small feedforward neural networks. Yet, for many applications, the increase in computational complexity of the method outweighs any gain in learning rate obtained over current training methods. However, the least-squares method can be more efficiently implemented on parallel architectures than standard methods. This is demonstrated by comparing computation times and learning rates for the least-squares method implemented …
Experimentation With Large-Grained Parallelism Using Local Area Networks, Ralph W. Wilkerson, Douglas E. Meyer
Computer Science Faculty Research & Creative Works
HIGHLAND, a distributed-memory parallel processing environment for heterogeneous local area networks, has been developed. Designed as both a teaching and a research tool, its purpose is to provide an effective mechanism by which a number of networked UNIX workstations, dissimilar in both vendor and performance, can be directly manipulated as a single, unified, multiprocessing system. Utilizing the MIT X-windows environment, HIGHLAND supports a highly interactive graphical interface through which a programmer can create, modify, and control complex systems of communicating processes