Open Access. Powered by Scholars. Published by Universities.®
- Institution
- University of Louisville (3)
- University of Massachusetts Amherst (3)
- West Virginia University (3)
- California Polytechnic State University, San Luis Obispo (2)
- New Jersey Institute of Technology (2)
- Southern Methodist University (2)
- University of Tennessee, Knoxville (2)
- Western University (2)
- Bowling Green State University (1)
- California State University, San Bernardino (1)
- Kennesaw State University (1)
- Marshall University (1)
- Nova Southeastern University (1)
- Union College (1)
- University of Kentucky (1)
- University of Nevada, Las Vegas (1)
- Virginia Commonwealth University (1)
- Keyword
- Machine learning (7)
- Machine Learning (6)
- Statistics (5)
- Bayesian (2)
- Data mining (2)
- Poisson (2)
- "hot hand" (1)
- Cyberterrorism; Data mining – Statistical methods; Data mining – Implements; Support vector machines; Decision trees; Machine learning; Neural networks (computer science) – Research (1)
- Accessed (1)
- Aging (1)
- Appalachian basin (1)
- Applied sciences (1)
- Artificial Intelligence (1)
- BMI (1)
- Basketball (1)
- Bayesian shrinkage priors (1)
- Bayesian statistics (1)
- Big data (1)
- Binding Sites (1)
- Biological Age (1)
- Body-shape (1)
- Bootstrap resampling (1)
- China (1)
- Classification (1)
- Clinical trials (1)
- Clojure (Computer program language) (1)
- Community detections (1)
- Computational complexity (1)
- Computer Science (1)
- Computer Science Education (1)
- Publication
- Doctoral Dissertations (4)
- Electronic Theses and Dissertations (3)
- Graduate Theses, Dissertations, and Problem Reports (3)
- Dissertations (2)
- Electronic Thesis and Dissertation Repository (2)
- Master's Theses (2)
- Statistical Science Theses and Dissertations (2)
- All HCAS Student Capstones, Theses, and Dissertations (1)
- Doctor of Data Science and Analytics Dissertations (1)
- Electronic Theses, Projects, and Dissertations (1)
- Honors Projects (1)
- Honors Theses (1)
- Masters Theses (1)
- Theses and Dissertations (1)
- Theses and Dissertations--Computer Science (1)
- Theses, Dissertations and Capstones (1)
- UNLV Theses, Dissertations, Professional Papers, and Capstones (1)
Articles 1 - 28 of 28
Full-Text Articles in Statistical Methodology
Code For Care: Hypertension Prediction In Women Aged 18-39 Years, Kruti Sheth
Electronic Theses, Projects, and Dissertations
The longstanding prevalence of hypertension, often undiagnosed, poses significant risks of severe chronic and cardiovascular complications if left untreated. This study investigated the causes and underlying risks of hypertension in females aged 18-39 years. The research questions were: (Q1.) What factors affect the occurrence of hypertension in females aged 18-39 years? (Q2.) What machine learning algorithms are suited for effectively predicting hypertension? (Q3.) How can SHAP values be leveraged to analyze the factors from model outputs? The findings are: (Q1.) Feature selection using a binary logistic regression classifier reveals the 30 most influential factors at an …
A Data-Driven Multi-Regime Approach For Predicting Real-Time Energy Consumption Of Industrial Machines., Abdulgani Kahraman
Electronic Theses and Dissertations
This thesis focuses on methods for improving energy consumption prediction performance in complex industrial machines. Working with real-world industrial machines brings several challenges, including data access, algorithmic bias, data privacy, and the interpretation of machine learning algorithms. To effectively manage energy consumption in the industrial sector, it is essential to develop a framework that enhances prediction performance, reduces energy costs, and mitigates air pollution in heavy industrial machine operations. This study aims to assist managers in making informed decisions and driving the transition towards green manufacturing. The energy consumption of industrial machinery is substantial, and the recent increase in CO2 …
Optimizing Tumor Xenograft Experiments Using Bayesian Linear And Nonlinear Mixed Modelling And Reinforcement Learning, Mary Lena Bleile
Statistical Science Theses and Dissertations
Tumor xenograft experiments are a popular tool of cancer biology research. In a typical such experiment, one implants a set of animals with an aliquot of the human tumor of interest, applies various treatments of interest, and observes the subsequent response. Efficient analysis of the data from these experiments is therefore of utmost importance. This dissertation proposes three methods for optimizing cancer treatment and data analysis in the tumor xenograft context. The first of these is applicable to tumor xenograft experiments in general, and the second two seek to optimize the combination of radiotherapy with immunotherapy in the tumor xenograft …
Examining The Relationship Between Stomiiform Fish Morphology And Their Ecological Traits, Mikayla L. Twiss
All HCAS Student Capstones, Theses, and Dissertations
Trait-based ecology characterizes individuals’ functional attributes to better understand and predict their interactions with other species and their environments. Utilizing morphological traits to describe functional groups has helped group species with similar ecological niches that are not necessarily taxonomically related. Within the deep-pelagic fishes, the Order Stomiiformes exhibits high morphological and species diversity, and many species undertake diel vertical migration (DVM). While the morphology and behavior of stomiiform fishes have been extensively studied and described through taxonomic assessments, the connection between their form and function regarding their DVM types, morphotypes, and daytime depth distributions is not well known. Here, three …
A Predictive Model To Predict Cyberattack Using Self-Normalizing Neural Networks, Oluwapelumi Eniodunmo
Theses, Dissertations and Capstones
Cyberattacks are a never-ending war that greatly threatens secure information systems. The development of automated and intelligent systems gives hackers more computing power to steal information and destroy data or system resources, and has raised global security issues. Statistical and data mining tools have received continuous research and improvement. These tools have been adopted to create sophisticated intrusion detection systems that help information systems mitigate and defend against cyberattacks. However, advances in technology and the accessibility of information expose more identifiable elements that can be used to gain unauthorized access to systems and resources. Data mining and classification tools …
Exploring Cyberterrorism, Topic Models And Social Networks Of Jihadists Dark Web Forums: A Computational Social Science Approach, Vivian Fiona Guetler
Graduate Theses, Dissertations, and Problem Reports
This three-article dissertation focuses on cyber-related topics on terrorist groups, specifically Jihadists’ use of technology, the application of natural language processing, and social networks in analyzing text data derived from terrorists' Dark Web forums. The first article explores cybercrime and cyberterrorism. As technology progresses, it facilitates new forms of behavior, including tech-related crimes known as cybercrime and cyberterrorism. In this article, I provide an analysis of the problems of cybercrime and cyberterrorism within the field of criminology by reviewing existing literature focusing on (a) the issues in defining terrorism, cybercrime, and cyberterrorism, (b) ways that cybercriminals commit a crime in …
Parameter Estimation And Inference Of Spatial Autoregressive Model By Stochastic Gradient Descent, Gan Luan
Dissertations
Stochastic gradient descent (SGD) is a popular iterative method for model parameter estimation in large-scale data and online learning settings since it goes through the data in only one pass. While SGD has been well studied for independent data, its application to spatially-correlated data largely remains unexplored. This dissertation develops SGD-based parameter estimation and statistical inference algorithms for the spatial autoregressive (SAR) model, a common model for spatial lattice data.
This research contains three parts. (I) The first part concerns SGD estimation and inference for the SAR mean regression model. A new SGD algorithm based on maximum likelihood estimator (MLE) …
Modeling And Solving The Outsourcing Risk Management Problem In Multi-Echelon Supply Chains, Arian A. Nahangi
Master's Theses
Worldwide globalization has made supply chains more vulnerable to risk factors, increasing the associated costs of outsourcing goods. Outsourcing is highly beneficial for any company that values building upon its core competencies, but the emergence of the COVID-19 pandemic and other crises has exposed significant vulnerabilities within supply chains. These disruptions forced a shift in the production of goods from outsourcing to domestic methods.
This paper considers a multi-echelon supply chain model with global and domestic raw material suppliers, manufacturing plants, warehouses, and markets. All levels within the supply chain network are evaluated from a holistic perspective, calculating a total …
Random Search Plus: A More Effective Random Search For Machine Learning Hyperparameters Optimization, Bohan Li
Masters Theses
Machine learning hyperparameter optimization has always been key to improving model performance. There are many methods of hyperparameter optimization; popular ones include grid search, random search, manual search, Bayesian optimization, and population-based optimization. Random search requires fewer computations than grid search, but at a cost in accuracy. This thesis proposes a more effective random search method, named random search plus, based on traditional random search and hyperparameter space separation. It shows empirically that random search plus is more effective than random search. There are some case …
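For orientation, the baseline that the abstract compares against can be sketched as below. This is a minimal illustration of plain random search only, not the thesis's random search plus method, whose details are not given in the listing. The function names, the toy objective, and the hyperparameter ranges are all hypothetical.

```python
import random


def random_search(objective, space, n_trials, seed=0):
    """Sample n_trials configurations uniformly from `space`; keep the best.

    `space` maps a hyperparameter name to a (low, high) range.
    """
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw every hyperparameter independently and uniformly.
        cfg = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


def toy_objective(cfg):
    # Hypothetical stand-in for a validation score; it peaks at
    # lr = 0.1 and reg = 0.01 and is never positive.
    return -((cfg["lr"] - 0.1) ** 2) - ((cfg["reg"] - 0.01) ** 2)


space = {"lr": (0.0, 1.0), "reg": (0.0, 0.1)}
best_cfg, best_score = random_search(toy_objective, space, n_trials=200)
```

A grid search over the same space would evaluate every combination of a fixed set of values per hyperparameter, so its cost grows multiplicatively with the number of hyperparameters, whereas the number of random-search trials is chosen directly.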
Bayesian Topological Machine Learning, Christopher A. Oballe
Doctoral Dissertations
Topological data analysis encompasses a broad set of ideas and techniques that address 1) how to rigorously define and summarize the shape of data, and 2) how to use these constructs for inference. This dissertation addresses the second problem by developing new inferential tools for topological data analysis and applying them to solve real-world data problems. First, a Bayesian framework to approximate probability distributions of persistence diagrams is established. The key insight underpinning this framework is that persistence diagrams may be viewed as Poisson point processes with prior intensities. With this assumption in hand, one may compute posterior intensities by adopting techniques …
Novel Inference Methods For Generalized Linear Models Using Shrinkage Priors And Data Augmentation., Arinjita Bhattacharyya
Electronic Theses and Dissertations
Generalized linear models have broad applications in biostatistics and sociology. In a regression setup, the main target is to find a relevant set of predictors out of a large collection of covariates. Sparsity is the assumption that only a few of these covariates in a regression setup have a meaningful correlation with an outcome variate of interest. Sparsity is incorporated by regularizing the irrelevant slopes towards zero without changing the relevant predictors and keeping the resulting inferences intact. Frequentist variable selection and sparsity are addressed by popular techniques such as the Lasso and Elastic Net. Bayesian penalized regression can tackle the curse of …
Data-Driven Investment Decisions In P2p Lending: Strategies Of Integrating Credit Scoring And Profit Scoring, Yan Wang
Doctor of Data Science and Analytics Dissertations
In this dissertation, we develop and discuss several loan evaluation methods to guide investment decisions for peer-to-peer (P2P) lending. In evaluating loans, credit scoring and profit scoring are the two widely utilized approaches. Credit scoring aims at minimizing the risk while profit scoring aims at maximizing the profit. This dissertation addresses the strengths and weaknesses of each scoring method by integrating them in various ways in order to provide the optimal investment suggestions for different investors. Before developing the methods for loan evaluation at the individual level, we applied the state-of-the-art method called Long Short-Term Memory (LSTM) …
Process Based Analysis Of Fluvial Stratigraphic Record: Middle Pennsylvanian Allegheny Formation, North-Central Wv, Oluwasegun O. Abatan
Graduate Theses, Dissertations, and Problem Reports
Fluvial deposits represent some of the best hydrocarbon reservoirs, but the quality of fluvial reservoirs varies depending on the reservoir architecture, which is controlled by allogenic and autogenic processes. Allogenic controls, including paleoclimate, tectonics, and glacio-eustasy, have long been debated as dominant controls in the deposition of fluvial strata. However, recent research has questioned the validity of this cyclicity and may indicate major influence from autogenic controls. To further investigate allogenic controls on stratal order, I analyzed the facies architecture, geomorphology, paleohydrology, and the stratigraphic framework of the Middle Pennsylvanian Allegheny Formation (MPAF), a fluvial depositional system in the Appalachian …
Allocative Poisson Factorization For Computational Social Science, Aaron Schein
Doctoral Dissertations
Social science data often comes in the form of high-dimensional discrete data such as categorical survey responses, social interaction records, or text. These data sets exhibit high degrees of sparsity, missingness, overdispersion, and burstiness, all of which present challenges to traditional statistical modeling techniques. The framework of Poisson factorization (PF) has emerged in recent years as a natural way to model high-dimensional discrete data sets. This framework assumes that each observed count in a data set is a Poisson random variable $y \sim \mathrm{Pois}(\mu)$ whose rate parameter $\mu$ is a function of shared model parameters. This thesis examines a specific …
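The Poisson factorization assumption above can be made concrete with a short sketch. The abstract says only that the rate is "a function of shared model parameters". The bilinear form used here, where the rate for cell (i, j) is the inner product of a row factor and a column factor, is one common choice and is an assumption of this example, as are all function and variable names.

```python
import math


def pf_rate(theta_i, phi_j):
    """Assumed rate: inner product of a row factor and a column factor."""
    return sum(t * p for t, p in zip(theta_i, phi_j))


def poisson_log_pmf(y, mu):
    """log Pois(y | mu) = y*log(mu) - mu - log(y!), for mu > 0."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)


def pf_log_likelihood(counts, theta, phi):
    """Sum the Poisson log-probabilities over every cell of a count matrix."""
    total = 0.0
    for i, row in enumerate(counts):
        for j, y in enumerate(row):
            total += poisson_log_pmf(y, pf_rate(theta[i], phi[j]))
    return total


# Hypothetical 2 x 2 count matrix with two latent components.
counts = [[3, 0], [1, 4]]
theta = [[1.0, 0.5], [0.2, 2.0]]  # one factor vector per row
phi = [[2.0, 0.5], [0.1, 1.5]]    # one factor vector per column
log_lik = pf_log_likelihood(counts, theta, phi)
```

Every cell's rate is built from the same small set of factors, which is what lets the model share statistical strength across a sparse count matrix.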
Statistical Machine Learning Methods For Mining Spatial And Temporal Data, Fei Tan
Dissertations
Spatial and temporal dependencies are ubiquitous properties of data in numerous domains. The popularity of spatial and temporal data mining has thus grown with the increasing prevalence of massive data. The presence of spatial and temporal attributes not only provides complementary useful perspectives, but also poses new challenges to the representation and integration into the learning procedure. In this dissertation, the involved spatial and temporal dependencies are explored with three genres: sample-wise, feature-wise, and target-wise. A family of novel methodologies is developed accordingly for the dependency representation in respective scenarios.
First, dependencies among discrete, continuous and repeated observations are studied …
Modeling Stochastically Intransitive Relationships In Paired Comparison Data, Ryan Patrick Alexander Mcshane
Statistical Science Theses and Dissertations
If the Warriors beat the Rockets and the Rockets beat the Spurs, does that mean that the Warriors are better than the Spurs? Sophisticated fans would argue that the Warriors are better by the transitive property, but could Spurs fans make a legitimate argument that their team is better despite this chain of evidence?
We first explore the nature of intransitive (rock-scissors-paper) relationships with a graph theoretic approach to the method of paired comparisons framework popularized by Kendall and Smith (1940). Then, we focus on the setting where all pairs of items, teams, players, or objects have been compared to …
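As a small illustration of what an intransitive (rock-scissors-paper) relationship looks like in paired comparison data, the sketch below counts circular triads, that is, triples in which A beats B, B beats C, and C beats A. It is a generic check written for this listing, not the dissertation's method. The function name and the input format (a set of winner/loser pairs) are assumptions.

```python
from itertools import combinations


def count_circular_triads(beats):
    """Count triples whose outcomes form a cycle.

    `beats` is a set of (winner, loser) pairs, assumed to hold one
    outcome for every pair of items.
    """
    items = sorted({x for pair in beats for x in pair})
    count = 0
    for a, b, c in combinations(items, 3):
        # Check both orientations of the possible cycle.
        forward = (a, b) in beats and (b, c) in beats and (c, a) in beats
        backward = (b, a) in beats and (c, b) in beats and (a, c) in beats
        if forward or backward:
            count += 1
    return count


# Transitive chain from the abstract's example: no cycle.
transitive = {("Warriors", "Rockets"), ("Rockets", "Spurs"), ("Warriors", "Spurs")}
# Same first two results, but the Spurs beat the Warriors: one cycle.
cyclic = {("Warriors", "Rockets"), ("Rockets", "Spurs"), ("Spurs", "Warriors")}
```

A count of zero means the observed results are consistent with a single ranking; any positive count means no ranking can agree with every result.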
Quantifying Human Biological Age: A Machine Learning Approach, Syed Ashiqur Rahman
Graduate Theses, Dissertations, and Problem Reports
Quantifying human biological age is an important and difficult challenge. Different biomarkers and numerous approaches have been studied for biological age prediction, each with its advantages and limitations. In this work, we first introduce a new anthropometric measure (called Surface-based Body Shape Index, SBSI) that accounts for both body shape and body size, and evaluate its performance as a predictor of all-cause mortality. We analyzed data from the National Health and Nutrition Examination Survey (NHANES). Based on the analysis, we introduce a new body shape index constructed from four important anthropometric determinants of body shape and body size: body …
Bayesian Analytical Approaches For Metabolomics : A Novel Method For Molecular Structure-Informed Metabolite Interaction Modeling, A Novel Diagnostic Model For Differentiating Myocardial Infarction Type, And Approaches For Compound Identification Given Mass Spectrometry Data., Patrick J. Trainor
Electronic Theses and Dissertations
Metabolomics, the study of small molecules in biological systems, has enjoyed great success in enabling researchers to examine disease-associated metabolic dysregulation and has been utilized for the discovery of biomarkers of disease and phenotypic states. In spite of recent technological advances in the analytical platforms utilized in metabolomics and the proliferation of tools for the analysis of metabolomics data, significant challenges in metabolomics data analyses remain. In this dissertation, we present three of these challenges and Bayesian methodological solutions for each. In the first part we develop a new methodology to serve as a basis for making higher order inferences in metabolomics, …
Analysis Challenges For High Dimensional Data, Bangxin Zhao
Electronic Thesis and Dissertation Repository
In this thesis, we propose new methodologies targeting the areas of high-dimensional variable screening, influence measures, and post-selection inference. We propose a new estimator for the correlation between the response and high-dimensional predictor variables, and based on the estimator we develop a new screening technique termed Dynamic Tilted Current Correlation Screening (DTCCS) for high-dimensional variable screening. DTCCS is capable of picking up the relevant predictor variables within a finite number of steps. The DTCCS method includes the widely used sure independence screening (SIS) method and the high-dimensional ordinary least squares projection (HOLP) approach as special cases.
Two methods …
Computational Modelling Of Human Transcriptional Regulation By An Information Theory-Based Approach, Ruipeng Lu
Electronic Thesis and Dissertation Repository
ChIP-seq experiments can identify the genome-wide binding site motifs of a transcription factor (TF) and determine its sequence specificity. Multiple algorithms were developed to derive TF binding site (TFBS) motifs from ChIP-seq data, including the entropy minimization-based Bipad that can derive both contiguous and bipartite motifs. Prior studies applying these algorithms to ChIP-seq data only analyzed a small number of top peaks with the highest signal strengths, biasing their resultant position weight matrices (PWMs) towards consensus-like, strong binding sites; nor did they derive bipartite motifs, disabling the accurate modelling of binding behavior of dimeric TFs.
This thesis presents a novel …
Estimating The Respiratory Lung Motion Model Using Tensor Decomposition On Displacement Vector Field, Kingston Kang
Theses and Dissertations
Modern big data often emerge as tensors. Standard statistical methods are inadequate to deal with datasets of large volume, high dimensionality, and complex structure. Therefore, it is important to develop algorithms such as low-rank tensor decomposition for data compression, dimensionality reduction, and approximation.
With the advancement in technology, high-dimensional images are becoming ubiquitous in the medical field. In lung radiation therapy, the respiratory motion of the lung introduces variabilities during treatment as the tumor inside the lung is moving, which brings challenges to the precise delivery of radiation to the tumor. Several approaches to quantifying this uncertainty propose using a …
Statistical Analysis Of Momentum In Basketball, Mackenzi Stump
Honors Projects
The “hot hand” in sports has been debated for as long as sports have been around. The debate involves whether streaks and slumps in sports are true phenomena or just simply perceptions in the mind of the human viewer. This statistical analysis of momentum in basketball analyzes the distribution of time between scoring events for the BGSU Women’s Basketball team from 2011-2017. We discuss how the distribution of time between scoring events changes with normal game factors such as location of the game, game outcome, and several other factors. If scoring events during a game were always randomly distributed, or …
Intrinsic Functions For Securing Cmos Computation: Variability, Modeling And Noise Sensitivity, Xiaolin Xu
Doctoral Dissertations
A basic premise behind modern secure computation is the demand for lightweight cryptographic primitives, such as identifier or key generators. From a circuit perspective, the development of cryptographic modules has also been driven by the aggressive scalability of complementary metal-oxide-semiconductor (CMOS) technology. As CMOS design advances into the nanometer regime, one significant characteristic is the random nature of process variability, which limits the nominal circuit design. With the continuous scaling of CMOS technology, leveraging the physical variability instead of mitigating it has become a promising approach. One of the famous products adhering to this double-edged sword philosophy is the Physically …
Computing The (Un)Computable: A Computationally-Augmented Perspective On The Yasukuni Shrine Controversy, Ryan Muther
Honors Theses
Computational methods have been used with increasing frequency in the social sciences and humanities, due to the availability of digital sources and computing power to study everything from changes in the meanings of words in Latin texts to how knowledge was categorized in eighteenth-century encyclopedias. Recent trends in the fields of digital humanities and computational social science include statistical methods like machine learning, which require large pre-tagged and annotated sets of documents and in turn necessitate a great deal of prior work to create data to use with such methods. This reliance on large corpora of annotated data limits the …
Threat Analysis, Countermeasures And Design Strategies For Secure Computation In Nanometer Cmos Regime, Raghavan Kumar
Doctoral Dissertations
Advancements in CMOS technologies have led to an era of the Internet of Things (IoT), where devices have the ability to communicate with each other in addition to their computational power. As more and more sensitive data is processed by embedded devices, the trend towards lightweight and efficient cryptographic primitives has gained significant momentum. Achieving perfect security in silicon is extremely difficult, as traditional cryptographic implementations are vulnerable to various active and passive attacks. There is also a threat in the form of "hardware Trojans" inserted into the supply chain by untrusted third-party manufacturers for economic incentives. Apart …
A Fault-Based Model Of Fault Localization Techniques, Mark A. Hays
Theses and Dissertations--Computer Science
Every day, ordinary people depend on software working properly. We take it for granted; from banking software, to railroad switching software, to flight control software, to software that controls medical devices such as pacemakers or even gas pumps, our lives are touched by software that we expect to work. It is well known that the main technique/activity used to ensure the quality of software is testing. Often it is the only quality assurance activity undertaken, making it that much more important.
In a typical experiment studying these techniques, a researcher will intentionally seed a fault (intentionally breaking the functionality of …
Automating Construction And Selection Of A Neural Network Using Stochastic Optimization, Jason Lee Hurt
UNLV Theses, Dissertations, Professional Papers, and Capstones
An artificial neural network can be used to solve various statistical problems by approximating a function that provides a mapping from input to output data. No universal method exists for architecting an optimal neural network. Training one with a low error rate is often a manual process requiring the programmer to have specialized knowledge of the domain for the problem at hand.
A distributed architecture is proposed and implemented for generating a neural network capable of solving a particular problem without specialized knowledge of the problem domain. The only knowledge the application needs is a training set that the network …
Software Internationalization: A Framework Validated Against Industry Requirements For Computer Science And Software Engineering Programs, John Huân Vũ
Master's Theses
View John Huân Vũ's thesis presentation at http://youtu.be/y3bzNmkTr-c.
In 2001, the ACM and IEEE Computing Curriculum stated that it was necessary to address "the need to develop implementation models that are international in scope and could be practiced in universities around the world." With increasing connectivity through the internet, the move towards a global economy and the growing use of technology make software internationalization a more important concern for developers. However, there has been a "clear shortage in terms of numbers of trained persons applying for entry-level positions" in this area. Eric Brechner, Director of Microsoft Development Training, suggested …