Open Access. Powered by Scholars. Published by Universities.®

Systems Architecture Commons

Numerical Analysis and Scientific Computing

Theses/Dissertations

Articles 1 - 26 of 26

Full-Text Articles in Systems Architecture

Dynamically Finding Optimal Kernel Launch Parameters For Cuda Programs, Taabish Jeshani Apr 2023

Electronic Thesis and Dissertation Repository

In this thesis, we present KLARAPTOR (Kernel LAunch parameters RAtional Program estimaTOR), a freely available tool to dynamically determine the values of kernel launch parameters of a CUDA kernel. We describe a technique for building a helper program, at the compile-time of a CUDA program, that is used at run-time to determine near-optimal kernel launch parameters for the kernels of that CUDA program. This technique leverages the MWP-CWP performance prediction model, runtime data parameters, and runtime hardware parameters to dynamically determine the launch parameters for each kernel invocation. This technique is implemented within the KLARAPTOR tool, utilizing the LLVM Pass …


Modeling Repairable System Failure Data Using Nhpp Reliability Growth Model, Eunice Ofori-Addo Jan 2023

EWU Masters Thesis Collection

Stochastic point processes have been widely used to describe the behaviour of repairable systems. The Crow nonhomogeneous Poisson process (NHPP), often known as the Power Law model, is regarded as one of the best models for repairable systems. The goodness-of-fit test rejects the intensity function of the power law model, and so the log-linear model was fitted and tested for goodness-of-fit. The Weibull Time to Failure recurrent neural network (WTTE-RNN) framework, a probabilistic deep learning model for failure data, is also explored. However, we find that the WTTE-RNN framework is only appropriate for failure data with independent and identically distributed interarrival …


Exploring High Performance And Energy Efficient Graph Processing On Gpu, Robert P. Watling Jan 2023

Dissertations, Master's Theses and Master's Reports

Parallel graph processing is central to analytical computer science applications, and GPUs have proven to be an ideal platform for parallel graph processing. Existing GPU graph processing frameworks present performance improvements but often neglect two issues: the unpredictability of a given input graph and the energy consumption of the graph processing. Our prototype software, EEGraph (Energy Efficiency of Graph processing), is a flexible system consisting of several graph processing algorithms with configurable parameters for vertex update synchronization, vertex activation, and memory management along with a lightweight software-based GPU energy measurement scheme. We observe relationships between different configurations of our software, …


Detecting Selfish Mining Attacks Against A Blockchain Using Machine Learning, Matthew A. Peterson Dec 2022

Theses and Dissertations

Selfish mining is an attack against a blockchain where miners hide newly discovered blocks instead of publishing them to the rest of the network. Selfish mining has been a potential issue for blockchains since it was first discovered by Eyal and Sirer. It can be used by malicious miners to earn a disproportionate share of the mining rewards or in conjunction with other attacks to steal money from network users. Several of these attacks were launched in 2018, 2019, and 2020, with the attackers stealing as much as $18 million. Developers made several different attempts to fix this issue, but …


Gpgpu Microbenchmarking For Irregular Application Optimization, Dalton R. Winans-Pruitt Aug 2022

Theses and Dissertations

Irregular applications, such as unstructured mesh operations, do not easily map onto the typical GPU programming paradigms endorsed by GPU manufacturers, which mostly focus on maximizing concurrency for latency hiding. In this work, we show how alternative techniques focused on latency amortization can be used to control overall latency while requiring less concurrency. We used a custom-built microbenchmarking framework to test several GPU kernels and show how the GPU behaves under relevant workloads. We demonstrate that coalescing is not required for efficacious performance; an uncoalesced access pattern can achieve high bandwidth - even over 80% of the theoretical global memory …


Smart Decision-Making Via Edge Intelligence For Smart Cities, Nathaniel Hudson Jan 2022

Theses and Dissertations--Computer Science

Smart cities are an ambitious vision for future urban environments. The ultimate aim of smart cities is to use modern technology to optimize city resources and operations while improving overall quality-of-life of its citizens. Realizing this ambitious vision will require embracing advancements in information communication technology, data analysis, and other technologies. Because smart cities naturally produce vast amounts of data, recent artificial intelligence (AI) techniques are of interest due to their ability to transform raw data into insightful knowledge to inform decisions (e.g., using live road traffic data to control traffic lights based on current traffic conditions). However, training and …


A Practical Approach To Automated Software Correctness Enhancement, Aleksandr Zakharchenko Dec 2021

Dissertations

To repair an incorrect program does not mean to make it correct; it only means to make it more-correct, in some sense, than it is. In the absence of a concept of relative correctness, i.e. the property of a program to be more-correct than another with respect to a specification, the discipline of program repair has resorted to various approximations of absolute (traditional) correctness, with varying degrees of success. This shortcoming is concealed by the fact that most program repair tools are tested on basic cases, whence making them absolutely correct is not clearly distinguishable from making them relatively more-correct. …


Component Damage Source Identification For Critical Infrastructure Systems, Nathan Davis Dec 2021

Graduate Theses and Dissertations

Cyber-Physical Systems (CPS) are becoming increasingly prevalent for both Critical Infrastructure and the Industry 4.0 initiative. Bad values within components of the software portion of CPS, or the computer systems, have the potential to cause major damage if left unchecked, and so detecting and locating where these occur is vital. We further define features of these computer systems and create a use-based system topology. We then introduce a function to monitor system integrity and the presence of bad values as well as an algorithm to locate them. We then show an improved version, taking advantage of several system properties …


Using High-Performance Computing Profilers To Understand The Performance Of Graph Algorithms, Costain Nachuma Aug 2020

University of New Orleans Theses and Dissertations

An algorithm designer working with parallel computing systems should know how the characteristics of their implemented algorithm affect various performance aspects of their parallel program. It would be beneficial to these designers if each algorithm came with a specific set of standards that identified which algorithms worked better for a specified system. Therefore, the goal of this paper is to take implementations of four graph algorithms and extract their features, such as memory consumption and scalability, using profilers (VTune/TAU) to determine which algorithms work to their fullest potential in one of the three systems: GPU, shared memory system, or distributed memory …


Investigating Single Precision Floating General Matrix Multiply In Heterogeneous Hardware, Steven Harris Aug 2020

McKelvey School of Engineering Theses & Dissertations

The fundamental operation of matrix multiplication is ubiquitous across a myriad of disciplines. Yet, the identification of new optimizations for matrix multiplication remains relevant for emerging hardware architectures and heterogeneous systems. Frameworks such as OpenCL enable computation orchestration on existing systems, and its availability using the Intel High Level Synthesis compiler allows users to architect new designs for reconfigurable hardware using C/C++. Using the HARPv2 as a vehicle for exploration, we investigate the utility of several of the most notable matrix multiplication optimizations to better understand the performance portability of OpenCL and the implications for such optimizations on this and …


Edge-Cloud Iot Data Analytics: Intelligence At The Edge With Deep Learning, Ananda Mohon M. Ghosh May 2020

Electronic Thesis and Dissertation Repository

Rapid growth in numbers of connected devices, including sensors, mobile, wearable, and other Internet of Things (IoT) devices, is creating an explosion of data that are moving across the network. To carry out machine learning (ML), IoT data are typically transferred to the cloud or another centralized system for storage and processing; however, this causes latencies and increases network traffic. Edge computing has the potential to remedy those issues by moving computation closer to the network edge and data sources. On the other hand, edge computing is limited in terms of computational power and thus is not well suited for …


Nonlinear Least Squares 3-D Geolocation Solutions Using Time Differences Of Arrival, Michael V. Bredemann Apr 2020

Mathematics & Statistics ETDs

This thesis uses a geometric approach to derive and solve nonlinear least squares minimization problems to geolocate a signal source in three dimensions using time differences of arrival at multiple sensor locations. There is no restriction on the maximum number of sensors used. Residual errors reach the numerical limits of machine precision. Symmetric sensor orientations are found that prevent closed form solutions of source locations lying within the null space. Maximum uncertainties in relative sensor positions and time differences of arrival, required to locate a source within a maximum specified error, are found from these results. Examples illustrate potential requirements …


A Visual Analytics System For Making Sense Of Real-Time Twitter Streams, Amir Haghighatimaleki Jan 2020

Electronic Thesis and Dissertation Repository

Through social media platforms, massive amounts of data are being produced. Twitter, as one such platform, enables users to post “tweets” on an unprecedented scale. Once analyzed by machine learning (ML) techniques and in aggregate, Twitter data can be an invaluable resource for gaining insight. However, when applied to real-time data streams, due to covariate shifts in the data (i.e., changes in the distributions of the inputs of ML algorithms), existing ML approaches result in different types of biases and provide uncertain outputs. This thesis describes a visual analytics system (i.e., a tool that combines data visualization, human-data interaction, and …


Contrasting Geometric Variations Of Mathematical Models Of Self-Assembling Systems, Michael Sharp Dec 2019

Graduate Theses and Dissertations

Self-assembly is the process by which complex systems are formed and behave due to the interactions of relatively simple units. In this thesis, we explore multiple augmentations of well known models of self-assembly to gain a better understanding of the roles that geometry and space play in their dynamics. We begin in the abstract Tile Assembly Model (aTAM) with some examples and a brief survey of previous results to provide a foundation. We then introduce the Geometric Thermodynamic Binding Network model, a model that focuses on the thermodynamic stability of its systems while utilizing geometrically rigid components (dissimilar to other …


A Method Of Evaluation Of High-Performance Computing Batch Schedulers, Jeremy Stephen Futral Jan 2019

UNF Graduate Theses and Dissertations

According to Sterling et al., a batch scheduler, also called workload management, is an application or set of services that provide a method to monitor and manage the flow of work through the system [Sterling01]. The purpose of this research was to develop a method to assess the execution speed of workloads that are submitted to a batch scheduler. While previous research exists, this research is different in that more complex jobs were devised that fully exercised the scheduler with established benchmarks. This research is important because the reduction of latency, even if it is minuscule, can lead to massive …


Adaptive Parallelism For Coupled, Multithreaded Message-Passing Programs, Samuel K. Gutiérrez Dec 2018

Computer Science ETDs

Hybrid parallel programming models that combine message passing (MP) and shared-memory multithreading (MT) are becoming more popular, especially with applications requiring higher degrees of parallelism and scalability. Consequently, coupled parallel programs, those built via the integration of independently developed and optimized software libraries linked into a single application, increasingly comprise message-passing libraries with differing preferred degrees of threading, resulting in thread-level heterogeneity. Retroactively matching threading levels between independently developed and maintained libraries is difficult, and the challenge is exacerbated because contemporary middleware services provide only static scheduling policies over entire program executions, necessitating suboptimal, over-subscribed or under-subscribed, configurations. In …


Empathetic Computing For Inclusive Application Design, Kenny Choo Tsu Wei Dec 2018

Dissertations and Theses Collection (Open Access)

The explosive growth of the ecosystem of personal and ambient computing devices coupled with the proliferation of high-speed connectivity has enabled extremely powerful and varied mobile computing applications that are used everywhere. While such applications have tremendous potential to improve the lives of impaired users, most mobile applications have impoverished designs to be inclusive, lacking support for users with specific disabilities. Mobile app designers today have inadequate support to design existing classes of apps to support users with specific disabilities, and more so, lack the support to design apps that specifically target these users. One way to resolve …


Developing A Cyberterrorism Policy: Incorporating Individual Values, Osama Bassam J. Rabie Jan 2018

Theses and Dissertations

Preventing cyberterrorism is becoming a necessity for individuals, organizations, and governments. However, current policies focus on technical and managerial aspects without asking for experts' and non-experts' values and preferences for preventing cyberterrorism. This study employs value focused thinking and public value forum to bare strategic measures and alternatives for complex policy decisions for preventing cyberterrorism. The strategic measures and alternatives are per socio-technical process.


Programming Models' Support For Heterogeneous Architecture, Wei Wu May 2017

Doctoral Dissertations

Accelerator-enhanced computing platforms have drawn a lot of attention due to their massive peak computational capacity. Heterogeneous systems equipped with accelerators such as GPUs have become the most prominent components of High Performance Computing (HPC) systems. Even at the node level, the significant heterogeneity of CPU and GPU, i.e. hardware and memory space differences, leads to challenges for fully exploiting such complex architectures. Extending beyond the node scope only escalates such challenges.

Conventional programming models such as data-flow and message passing have been widely adopted in HPC communities. When moving towards heterogeneous systems, the lack of GPU integration causes …


Improving The Performance Of Ice Sheet Modeling Through Embedded Simulation, Christopher G. Dufour Aug 2016

Electronic Theses and Dissertations

Understanding the impact of global climate change is a critical concern for society at large. One important piece of the climate puzzle is how large-scale ice sheets, such as those covering Greenland and Antarctica, respond to a warming climate. Given such ice sheets are under constant change, developing models that can accurately capture their dynamics represents a significant challenge to researchers. The problem, however, is that properly capturing the dynamics of an ice sheet model requires a high model resolution, and simulating these models is intractable even for state-of-the-art supercomputers.

This thesis presents a revolutionary approach to accurately capture ice sheet …


Interactive Feature Selection And Visualization For Large Observational Data, Jingyuan Wang Dec 2014

Doctoral Dissertations

Data can create enormous value in both scientific and industrial fields, especially for access to new knowledge and inspiration of innovation. With the massive increases in computing power, data storage capacity, and the capability of data generation and collection, the scientific research communities are confronting a transformation in exploiting the advanced uses of large-scale, complex, and high-resolution data sets in situation awareness and decision-making projects. Comprehensively analyzing big data problems requires analyses aimed at various aspects, which involve effective selection of static and time-varying feature patterns that fulfill the interests of domain users. …


Data Analytics Of University Student Records, Mark Blaise Decotes Aug 2014

Masters Theses

Understanding the proper navigation of a college curriculum is a daunting task for students, faculty, and staff. Collegiate courses offer enough intellectual challenge without the unnecessary confusion caused by course scheduling issues. Administrative faculty who execute curriculum changes need both quantitative data and empirical evidence to support their notions about which courses are cornerstone. Students require clear understanding of paths through their courses and majors that give them the optimal chance of success. In this work, we re-envision the analysis of student records from several decades by opening up these datasets to new ways of interactivity. We represent curricula through …


Proton Computed Tomography: Matrix Data Generation Through General Purpose Graphics Processing Unit Reconstruction, Micah Witt Mar 2014

Electronic Theses, Projects, and Dissertations

Proton computed tomography (pCT) is an imaging modality that will improve treatment planning for patients receiving proton radiation therapy compared with the current techniques, which are based on X-ray CT. Images are reconstructed in pCT by solving a large and sparse system of linear equations. The size of the system necessitates matrix-partitioning and parallel reconstruction algorithms to be implemented across some sort of cluster computing architecture. The prototypical algorithm to solve the pCT system is the algebraic reconstruction technique (ART) that has been modified into parallel versions called block-iterative-projection (BIP) methods and string-averaging-projection (SAP) methods. General purpose graphics processing units …


Dynamic Task Execution On Shared And Distributed Memory Architectures, Asim Yarkhan Dec 2012

Doctoral Dissertations

Multicore architectures with high core counts have come to dominate the world of high performance computing, from shared memory machines to the largest distributed memory clusters. The multicore route to increased performance has a simpler design and better power efficiency than the traditional approach of increasing processor frequencies. But standard programming techniques are not well adapted to this change in computer architecture design.

In this work, we study the use of dynamic runtime environments executing data driven applications as a solution to programming multicore architectures. The goals of our runtime environments are productivity, scalability and performance. We demonstrate productivity by …


Development Of A Systems Engineering Model For Chemical Separation Process, Lijian Sun Dec 2003

UNLV Theses, Dissertations, Professional Papers, and Capstones

This thesis is concerned with the efforts to develop a general-purpose systems engineering model software, TRPSEMPro, that can be used to improve productivity in the design process. Different features of TRPSEMPro will be presented in this thesis. First, Systems Engineering technology is presented, followed by the exposition of different numerical optimization technologies and DOE (Design of Experiments) study technologies. Second, the detailed software process, Object-Oriented Analysis and Design (OOA&D) for the TRPSEMPro is presented. All the design data models are expressed by using Unified Modeling Language (UML).

AMUSESimulator is another software package which has been designed and implemented in order …


Real Time Texture Analysis From The Parallel Computation Of Fractal Dimension, Halford I. Hayes Jr. Jul 1993

Computer Science Theses & Dissertations

The discrimination of texture features in an image has many important applications: from detection of man-made objects from a surrounding natural background to identification of cancerous from healthy tissue in X-ray imagery. The fractal structure in an image has been used with success to identify these features but requires unacceptable processing time if executed sequentially.

The paradigm of data parallelism is presented as the best method for applying massively parallel processing to the computation of fractal dimension of an image. With this methodology, and sufficient numbers of processors, this computation can reach real time speeds necessary for many applications. A …