Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 30 of 37
Full-Text Articles in Physical Sciences and Mathematics
HPC-Enabled Fast And Configurable Dynamic Simulation, Analysis, And Learning For Complex Power System Adaptation And Control, Cong Wang
All Dissertations
This dissertation presents an HPC-enabled fast and configurable dynamic simulation, analysis, and learning framework for complex power system adaptation and control. Dynamic simulation for a large transmission system comprising thousands of buses and branches implies the latency of complicated numerical computations. However, faster-than-real-time execution is often required to provide timely support for power system planning and operation. The traditional approaches for speeding up the simulation demand extensive computing facilities such as CPU-based multi-core supercomputers, resulting in heavily resource-dependent solutions. In this work, by coupling the Message Passing Interface (MPI) protocol with an advanced heterogeneous programming environment, further acceleration can be …
UnoAPI: Balancing Performance, Portability, And Productivity (P3) In HPC Education, Konstantin Laufer, George K. Thiruvathukal
Computer Science: Faculty Publications and Other Works
oneAPI is a major initiative by Intel aimed at making it easier to program heterogeneous architectures used in high-performance computing using a unified application programming interface (API). While raising the abstraction level via a unified API represents a promising step for the current generation of students and practitioners to embrace high-performance computing, we argue that a curriculum of well-developed software engineering methods and well-crafted exemplars will be necessary to ensure interest by this audience and those who teach them. We aim to bridge the gap by developing a curriculum, codenamed UnoAPI, that takes a more holistic approach by looking …
Novel Hybrid Resampling Algorithms For Parallel/Distributed Particle Filters, Xudong Zhang
Dissertations, Theses, and Capstone Projects
Particle filters, also known as sequential Monte Carlo (SMC) methods, use Bayesian inference and stochastic sampling to estimate the states of dynamic systems from given observations. Parallel/distributed particle filters were introduced to improve the performance of sequential particle filters by using multiple processing units (PUs). The classical resampling algorithm used in parallel/distributed particle filters is a centralized scheme, called centralized resampling, which needs a central unit (CU) to serve as a hub for data transfers. As a result, the centralized resampling procedures produce extra communication costs, which lower the speedup factors in parallel computing. Even though some …
Optimal Communication Structures For Concurrent Computing, Andrii Berdnikov
Doctoral Dissertations
This research focuses on communicative solvers that run concurrently and exchange information to improve performance. This “team of solvers” enables individual algorithms to communicate information regarding their progress and intermediate solutions, and allows them to synchronize memory structures with more “successful” counterparts. The result is that fewer nodes spend computational resources on “struggling” processes. The research is focused on optimization of communication structures that maximize algorithmic efficiency using the theoretical framework of Markov chains. Existing research addressing communication between the cooperative solvers on parallel systems lacks generality: Most studies consider a limited number of communication topologies and strategies, while the …
GraphMP: I/O-Efficient Big Graph Analytics On A Single Commodity Machine, Peng Sun, Yonggang Wen, Nguyen Binh Duong Ta, Xiaokui Xiao
Research Collection School Of Computing and Information Systems
Recent studies showed that single-machine graph processing systems can be as competitive as cluster-based approaches on large-scale problems. While several out-of-core graph processing systems and computation models have been proposed, the high disk I/O overhead could significantly reduce performance in many practical cases. In this paper, we propose GraphMP to tackle big graph analytics on a single machine. GraphMP achieves low disk I/O overhead with three techniques. First, we design a vertex-centric sliding window (VSW) computation model to avoid reading and writing vertices on disk. Second, we propose a selective scheduling method to skip loading and processing unnecessary edge …
System Support Of Concurrent Database Query Processing On A GPU, Hao Li
USF Tampa Graduate Theses and Dissertations
The unrivaled computing capabilities of modern GPUs meet the demand of processing massive amounts of data seen in many application domains. While traditional HPC systems support applications as standalone entities that occupy the entire GPU, we propose a GPU-based DBMS (G-DBMS) that can run multiple tasks concurrently. To that end, system-level management mechanisms like resource allocation and buffer manager are needed to build such a concurrent database query processing system and fully unleash the GPUs’ computing power. However, CUDA does not provide enough OS-level functionalities to support it. Thus our research is focusing on implementing the optimization of resource allocation …
A Generic Implementation Of Fast Fourier Transforms For The BPAS Library, Colin S. Costello
Electronic Thesis and Dissertation Repository
In this thesis we seek to realize an efficient implementation of a generic parallel fast Fourier transform (FFT). The FFT will be used in support of fast multiplication of polynomials with coefficients in a finite field. Our goal is to obtain a relatively high performing parallel implementation that will run over a variety of finite fields with different sized characteristic primes. To this end, we implement and compare two Cooley-Tukey Six-Step fast Fourier transforms and a Cooley-Tukey Four-Step variant against a high performing specialized FFT already implemented in the Basic Polynomial Algebra Subprograms (BPAS) library. We use optimization techniques found …
Advanced Parallel Algorithms In Computational Electromagnetics, Shu Wang
Electrical and Computer Engineering ETDs
The rapid development of high performance computing has pushed computational electromagnetics (CEM) towards high accuracy, high fidelity and extreme computational scales. There is a great need for existing CEM solvers to have enhanced parallelism and scaling capability. The purpose of this dissertation is to investigate advanced parallel algorithms for both frequency and time domain solvers.
In the frequency domain, this work first develops the underpinnings of a parallel preconditioning technique and a high-order transmission condition in the context of a multi-solver scheme. The result is a computing-resource-aware and implementation-wise compact solver. This work then targets developing efficient algorithms for cases where …
Relational Joins On GPUs For In-Memory Database Query Processing, Ran Rui
USF Tampa Graduate Theses and Dissertations
Relational join processing is one of the core functionalities in database management systems. Implementing join algorithms on parallel platforms, especially modern GPUs, has gained a lot of momentum in the past decade. This dissertation addresses the following issues on GPU join algorithms. First, we present empirical evaluations of a state-of-the-art work on GPU-based join processing. Since 2008, the compute capabilities of GPUs have increased at a pace faster than that of multi-core CPUs. We run a comprehensive set of experiments to study how join operations can benefit from such rapid expansion of GPU capabilities. We also present improved GPU …
Algorithms And Framework For Computing 2-Body Statistics On Graphics Processing Units, Napath Pitaksirianan
USF Tampa Graduate Theses and Dissertations
Various types of two-body statistics (2-BS) are regarded as essential components of low-level data analysis in scientific database systems. In relational algebraic terms, a 2-BS is essentially a Cartesian product between two datasets (or two instances of the same dataset) followed by a user-defined aggregate. The quadratic complexity of these computations hinders the timely processing of data. Thus using modern parallel hardware has become an obvious solution to meet such challenges. This dissertation presents our recent work in designing and optimizing parallel algorithms for 2-BS computation on Graphics Processing Units (GPUs). The unique architecture, however, provides abundant opportunities for optimizing …
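The relational view of a 2-BS given in the abstract above (a Cartesian product followed by a user-defined aggregate) can be illustrated with a minimal sequential sketch. This is not the dissertation's GPU algorithm: the function name `two_body_histogram`, the choice of a distance histogram as the aggregate, and the parameters `bucket_width` and `num_buckets` are all assumptions made here for illustration.

```python
from itertools import product
from math import dist, floor


def two_body_histogram(points_a, points_b, bucket_width, num_buckets):
    """Count point pairs per distance bucket (hypothetical example aggregate)."""
    counts = [0] * num_buckets
    # Cartesian product: every point of A is paired with every point of B,
    # so the loop body runs len(points_a) * len(points_b) times.
    for p, q in product(points_a, points_b):
        bucket = floor(dist(p, q) / bucket_width)
        # Aggregate step: increment the bucket that this pair's distance falls in.
        # Pairs farther apart than the last bucket are ignored.
        if bucket < num_buckets:
            counts[bucket] += 1
    return counts


# Two instances of the same dataset, as the abstract allows.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
histogram = two_body_histogram(points, points, bucket_width=5.0, num_buckets=3)
print(histogram)
```

The nested pairing is what makes the cost quadratic in the dataset size; each pair is independent of the others, which is the property a parallel implementation would presumably exploit.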
A Bayesian Framework For Estimating Seismic Wave Arrival Time, Hua Zhong
Graduate Theses and Dissertations
Because earthquakes have a large impact on human society, statistical methods for better studying earthquakes are required. One characteristic of earthquakes is the arrival time of seismic waves at a seismic signal sensor. Once we can estimate the earthquake arrival time accurately, the earthquake location can be triangulated, and assistance can be sent to that area correctly. This study presents a Bayesian framework to predict the arrival time of seismic waves with associated uncertainty. We use a change point framework to model the different conditions before and after the seismic wave arrives. To evaluate the performance of the model, we …
Three Environmental Fluid Dynamics Papers, Eden Furtak-Cole
All Graduate Theses and Dissertations, Spring 1920 to Summer 2023
Three papers are presented, applying computational fluid dynamics methods to fluid flows in the geosciences. In the first paper, a numerical method is developed for single phase potential flow in the subsurface. For a class of monotonically advancing flows, the method provides computational savings as compared to classical methods and can be applied to problems such as forced groundwater recharge. The second paper investigates the shear stress reducing action of an erosion control roughness array. Incompressible Navier-Stokes simulations are performed for multiple wind angles to understand the changing aerodynamics of individual and grouped roughness elements. In the third paper, …
Parallelizing Tabu Search Based Optimization Algorithm On GPUs, Vinaya Malleypally
USF Tampa Graduate Theses and Dissertations
Many combinatorial optimization problems, such as the traveling salesman problem, the quadratic assignment problem, and flow shop scheduling, are computationally intractable. Tabu search based simulated annealing is a stochastic search algorithm that is widely used to solve combinatorial optimization problems. Due to excessive run time, there is a strong demand for a parallel version that can be applied to any problem with minimal modifications. Existing advanced and/or parallel versions of tabu search algorithms are specific to the problem at hand, so their optimizations apply only to that particular problem. In this work, we propose a parallel version of …
GraphMP: An Efficient Semi-External-Memory Big Graph Processing System On A Single Machine, Peng Sun, Yonggang Wen, Nguyen Binh Duong Ta, Xiaokui Xiao
Research Collection School Of Computing and Information Systems
Recent studies showed that single-machine graph processing systems can be as competitive as cluster-based approaches on large-scale problems. While several out-of-core graph processing systems and computation models have been proposed, the high disk I/O overhead could significantly reduce performance in many practical cases. In this paper, we propose GraphMP to tackle big graph analytics on a single machine. GraphMP achieves low disk I/O overhead with three techniques. First, we design a vertex-centric sliding window (VSW) computation model to avoid reading and writing vertices on disk. Second, we propose a selective scheduling method to skip loading and processing unnecessary edge …
Accelerating The Discontinuous Galerkin Cell-Vertex Scheme (DG-CVS) Solver On CPU-GPU Heterogeneous Systems, Xiaoqi Hu
Electronic Theses and Dissertations
DG-CVS (Discontinuous Galerkin Cell-Vertex Scheme) is an efficient, accurate and robust numerical solver for general hyperbolic conservation laws. It can solve a broad range of conservation laws such as the shallow water equations and the magnetohydrodynamics equations. DG-CVS is a Riemann-solver-free high-order space-time method for arbitrary space conservation laws. It fuses the discontinuous Galerkin (DG) method and the conservation element/solution element (CE/SE) method to take advantage of the best features of both methods. Thanks to the CE/SE method, the time derivative of the solution is treated as an independent unknown, which is amenable to the GPU's parallel execution. In this thesis, …
Definition Of A Method For The Formulation Of Problems To Be Solved With High Performance Computing, Ramya Peruri
Master of Science in Computer Science Theses
Computational power made available by current technology has been continuously increasing; however, today’s problems are larger and more complex and demand even more computational power. Interest in computational problems has also been increasing, and they are an important research area in computer science. These complex problems are solved with computational models that build on an underlying mathematical model, are simulated on computer resources, and are run with High Performance Computing. For such computations, parallel computing has been employed to achieve high performance. This thesis identifies families of problems that can best be solved using modelling and implementation techniques of parallel …
Efficient Algorithms And Applications In Topological Data Analysis, Junyi Tu
USF Tampa Graduate Theses and Dissertations
Topological Data Analysis (TDA) is a new and fast-growing research field developed over the last two decades. TDA finds many applications in computer vision, computer graphics, scientific visualization, molecular biology, and material science, to name a few. In this dissertation, we make algorithmic and application contributions to three data structures in TDA: contour trees, Reeb graphs, and Mapper. From the algorithmic perspective, we design a parallel algorithm for contour tree construction and implement it in OpenCL. We also design and implement critical point pairing algorithms to compute persistence diagrams directly from contour trees, Reeb graphs, and Mapper. In terms of …
A Dynamic Run-Profile Energy-Aware Approach For Scheduling Computationally Intensive Bioinformatics Applications, Sachin Pawaskar, Hesham Ali
Computer Science Faculty Proceedings & Presentations
High Performance Computing (HPC) resources are housed in large datacenters, which consume exorbitant amounts of energy and are quickly demanding attention from businesses as they result in high operating costs. On the other hand, HPC environments have been very useful to researchers in many emerging areas in life sciences such as Bioinformatics and Medical Informatics. In an earlier work, we introduced a dynamic model for energy aware scheduling (EAS) in an HPC environment; the model is domain agnostic and incorporates both the deadline parameter and energy parameters for computationally intensive applications. Our proposed EAS model incorporates two phases. In …
Enforcing Security Policies On GPU Computing Through The Use Of Aspect-Oriented Programming Techniques, Bader Albassam
USF Tampa Graduate Theses and Dissertations
This thesis presents a new security policy enforcer designed for securing parallel computation on CUDA GPUs. We show how the very features that make a GPGPU desirable have already been utilized in existing exploits, fortifying the need for security protections on a GPGPU. An aspect weaver was designed for CUDA with the goal of utilizing aspect-oriented programming for security policy enforcement. Empirical testing verified the ability of our aspect weaver to enforce various policies. Furthermore, a performance analysis was performed to demonstrate that using this policy enforcer provides no significant performance impact over manual insertion of policy code. Finally, future …
Large-Scale Spatial Data Management On Modern Parallel And Distributed Platforms, Simin You
Dissertations, Theses, and Capstone Projects
The rapidly growing volume of spatial data has made it desirable to develop efficient techniques for managing large-scale spatial data. Traditional spatial data management techniques cannot meet the requirements of efficiency and scalability for large-scale spatial data processing. In this dissertation, we have developed new data-parallel designs for large-scale spatial data management that can better utilize modern inexpensive commodity parallel and distributed platforms, including multi-core CPUs, many-core GPUs and computer clusters, to achieve both efficiency and scalability. After introducing background on spatial data management and modern parallel and distributed systems, we present our parallel designs for spatial indexing and spatial join query …
Towards The Scalability And Hybrid Parallelization Of A Spatially Variant Lattice Algorithm, Henry Roger Moncada Lopez
Open Access Theses & Dissertations
The purpose of this research is to design a faster implementation of the spatially variant algorithm that improves its performance when it is running on a parallel computer system.
The spatially variant algorithm is used to synthesize a spatially variant lattice for a periodic electromagnetic structure. The algorithm has the ability to spatially vary the unit cell orientation and exploit its directional dependencies. The algorithm produces a lattice that is smooth, continuous and free of defects. The lattice spacing remains strikingly uniform when the unit cell orientation, lattice spacing, fill fraction and more are spatially varied. This is important for …
Sparse Matrix Diagonalization In The NRLMOL Electronic Structure Code, Md Mahmudulla Hassan
Open Access Theses & Dissertations
Density functional theory (DFT) based simulations are playing a major role in quantum mechanical studies of materials ranging from molecules and nanoparticles to biological systems, as they offer insights that are not directly accessible from experiments and can make sufficiently accurate predictions. The DFT implementation in the NRLMOL electronic structure code employs Gaussian basis sets to express the Kohn-Sham orbitals. A major computationally demanding task in the electronic structure calculations is the solution of the generalized eigenvalue problem, that is, the determination of nontrivial solutions (λ, c) of Hc = λOc where H and O are …
Towards Real-Time, On-Board, Hardware-Supported Sensor And Software Health Management For Unmanned Aerial Systems, Johann M. Schumann, Kristin Y. Rozier, Thomas Reinbacher, Ole J. Mengshoel, Timmy Mbaya, Corey Ippolito
Ole J Mengshoel
Parallel Design Patterns And Program Performance, Yu Zhao
Mathematics, Statistics, and Computer Science Honors Projects
With the rapid advancement of parallel and distributed computing (PDC), three types of hardware and their corresponding software (hardware-software pairs) are becoming more and more popular: Distributed Memory Systems with the Message Passing Interface (MPI) library, Shared Memory Systems with the OpenMP library and Co-processor Systems with a general purpose parallel computing library. Alongside the development of both hardware and software aspects of PDC, the process of designing parallel programs has also improved significantly over the years. A consequence of this is that researchers have been able to describe many parallel design patterns, which are recurring solutions to well-known problems …
The Design, Analysis, & Application Of Multi-Modal Real-Time Embedded Systems, Masud Ahmed
Wayne State University Dissertations
For many hand-held computing devices (e.g., smartphones), multiple operational modes are preferred because of their flexibility. In addition to their designated purposes, some of these devices provide a platform for different types of services, which include rendering of high-quality multimedia. Upon such devices, temporal isolation among co-executing applications is very important to ensure that each application receives an acceptable level of quality-of-service. In order to provide strong guarantees on services, multimedia applications and real-time control systems maintain timing constraints in the form of deadlines for recurring tasks. A flexible real-time multi-modal system will ideally provide system designers the option to …
Towards Real-Time, On-Board, Hardware-Supported Sensor And Software Health Management For Unmanned Aerial Systems, Johann Schumann, Kristin Y. Rozier, Thomas Reinbacher, Ole J. Mengshoel, Timmy Mbaya, Corey Ippolito
Ole J Mengshoel
OpenCUDA+MPI, Kenny Ballou, Nilab Mohammad Mousa
Student Research Initiative
The introduction and rise of General Purpose Graphics Computing has significantly impacted parallel and high-performance computing. It has introduced challenges when it comes to distributed computing with GPUs. Current solutions target specifics: specific hardware, a specific network topology, a specific level of processing. Those restrictions on GPU computing limit scientists and researchers in various ways. The goal of the OpenCUDA+MPI project is to develop a framework that allows researchers and scientists to write a general algorithm without the overhead of worrying about the specifics of the hardware and the cluster it will run against while taking full advantage of parallel and distributed …
Optimizing Parallel Belief Propagation In Junction Trees Using Regression, Lu Zheng, Ole J. Mengshoel
Ole J Mengshoel
Exploring Multiple Dimensions Of Parallelism In Junction Tree Message Passing, Lu Zheng, Ole J. Mengshoel
Ole J Mengshoel
Scaling Bayesian Network Parameter Learning With Expectation Maximization Using MapReduce, Erik B. Reed, Ole J. Mengshoel