Full-Text Articles in Theory and Algorithms
Towards Long-Term Fairness In Sequential Decision Making, Yaowei Hu
Graduate Theses and Dissertations
With the development of artificial intelligence, automated decision-making systems are increasingly integrated into various applications, such as hiring, loans, education, recommendation systems, and more. These machine learning algorithms are expected to facilitate faster, more accurate, and impartial decision-making compared to human judgments. Nevertheless, these expectations are not always met in practice due to biased training data, leading to discriminatory outcomes. In contemporary society, countering discrimination has become a consensus among people, leading the EU and the US to enact laws and regulations that prohibit discrimination based on factors such as gender, age, race, and religion. Consequently, addressing algorithmic discrimination has …
A Novel Approach To Extending Music Using Latent Diffusion, Keon Roohparvar, Franz J. Kurfess
Master's Theses
Using deep learning to synthetically generate music is a research domain that has gained more attention from the public in the past few years. A subproblem of music generation is music extension, or the task of taking existing music and extending it. This work proposes the Continuer Pipeline, a novel technique that uses deep learning to take music and extend it in 5-second increments. It does this by treating the musical generation process as an image generation problem; we utilize latent diffusion models (LDMs) to generate spectrograms, which are image representations of music. The Continuer Pipeline is able to …
A Machine Learning Approach For Predicting Clinical Trial Patient Enrollment In Drug Development Portfolio Demand Planning, Ahmed Shoieb
Masters Theses
One of the biggest challenges the clinical research industry currently faces is the accurate forecasting of patient enrollment (namely if and when a clinical trial will achieve full enrollment), as the stochastic behavior of enrollment can significantly contribute to delays in the development of new drugs, increases in duration and costs of clinical trials, and the over- or under-estimation of clinical supply. This study proposes a Machine Learning model using a Fully Convolutional Network (FCN) that is trained on a dataset of 100,000 patient enrollment data points including patient age, patient gender, patient disease, investigational product, study phase, blinded …
Liquid Tab, Nathan Hulet
Williams Honors College, Honors Research Projects
Guitar transcription is a complex task requiring significant time, skill, and musical knowledge to achieve accurate results. Since most music is recorded and processed digitally, it would seem that many tools for digitally analyzing and transcribing audio would be available. However, the problem of automatic transcription presents many more difficulties than are initially evident. There are multiple ways to play a guitar, many diverse styles of playing, and every guitar sounds different. These problems become even more difficult considering the varying qualities of recordings and levels of background noise.
Machine learning has proven itself to be a flexible tool …
Machine Learning With Topological Data Analysis, Ephraim Robert Love
Doctoral Dissertations
Topological Data Analysis (TDA) is a relatively new focus in the fields of statistics and machine learning. Methods of exploiting the geometry of data, such as clustering, have proven theoretically and empirically invaluable. TDA provides a general framework within which to study topological invariants (shapes) of data, which are more robust to noise and can recover information on higher-dimensional features than immediately apparent in the data. A common tool for conducting TDA is persistent homology, which measures the significance of these invariants. Persistent homology has prominent realizations in methods of data visualization, statistics and machine learning. Extending ML with …
Machine Learning Approaches To Dribble Hand-Off Action Classification With SportVU NBA Player Coordinate Data, Dembe Stephanos
Electronic Theses and Dissertations
Recently, strategies of National Basketball Association teams have evolved with the skillsets of players and the emergence of advanced analytics. One of the most effective actions in dynamic offensive strategies in basketball is the dribble hand-off (DHO). This thesis proposes an architecture for a classification pipeline for detecting DHOs in an accurate and automated manner. This pipeline consists of a combination of player tracking data and event labels, a rule set to identify candidate actions, manually reviewing game recordings to label the candidates, and embedding player trajectories into hexbin cell paths before passing the completed training set to the classification …
Random Search Plus: A More Effective Random Search For Machine Learning Hyperparameters Optimization, Bohan Li
Masters Theses
Machine learning hyperparameter optimization has always been key to improving model performance. There are many methods of hyperparameter optimization; popular ones include grid search, random search, manual search, Bayesian optimization, and population-based optimization. Random search requires fewer computations than grid search, but at a cost in accuracy. This thesis proposes a more effective random search method, named random search plus, based on traditional random search and hyperparameter space separation. The thesis shows empirically that random search plus is more effective than random search. There are some case …
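The trade-off between grid search and plain random search described in this abstract can be illustrated with a minimal sketch. It is not the thesis's random search plus method, whose details the excerpt does not give. The objective `toy_loss`, the parameter names, and the search ranges are hypothetical and chosen only for illustration.

```python
import itertools
import random


def toy_loss(learning_rate, regularization):
    """Hypothetical validation loss with its minimum at (0.1, 0.01)."""
    return (learning_rate - 0.1) ** 2 + (regularization - 0.01) ** 2


def grid_search(lr_values, reg_values):
    """Evaluate every grid combination: len(lr_values) * len(reg_values) calls."""
    candidates = itertools.product(lr_values, reg_values)
    return min(candidates, key=lambda c: toy_loss(*c))


def random_search(n_trials, lr_range, reg_range, seed=0):
    """Evaluate n_trials points drawn uniformly from the search box."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    candidates = [
        (rng.uniform(*lr_range), rng.uniform(*reg_range))
        for _ in range(n_trials)
    ]
    return min(candidates, key=lambda c: toy_loss(*c))


grid_axis = [0.0, 0.25, 0.5, 0.75, 1.0]
grid_best = grid_search(grid_axis, grid_axis)            # 25 evaluations
random_best = random_search(10, (0.0, 1.0), (0.0, 1.0))  # 10 evaluations

print("grid search:", grid_best, toy_loss(*grid_best))
print("random search:", random_best, toy_loss(*random_best))
```

Grid search cost grows multiplicatively with each added hyperparameter, whereas the random search budget `n_trials` is set independently of the number of dimensions. The price is that a small budget may miss the best region.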
Achieving Causal Fairness In Machine Learning, Yongkai Wu
Graduate Theses and Dissertations
Fairness is a social norm and a legal requirement in today's society. Many laws and regulations (e.g., the Equal Credit Opportunity Act of 1974) have been established to prohibit discrimination and enforce fairness on several grounds, such as gender, age, sexual orientation, race, and religion, referred to as sensitive attributes. Nowadays machine learning algorithms are extensively applied to make important decisions in many real-world applications, e.g., employment, admission, and loans. Traditional machine learning algorithms aim to maximize predictive performance, e.g., accuracy. Consequently, certain groups may get unfairly treated when those algorithms are applied for decision-making. Therefore, it is an imperative …
Dish: Democracy In State Houses, Nicholas A. Russo
Master's Theses
In our current political climate, state level legislators have become increasingly important. Due to cuts in funding and growing focus at the national level, public oversight for these legislators has drastically decreased. This makes it difficult for citizens and activists to understand the relationships and commonalities between legislators. This thesis provides three contributions to address this issue. First, we created a data set containing over 1200 features focused on a legislator’s activity on bills. Second, we created embeddings that represented a legislator’s level of activity and engagement for a given bill using a custom model called Democracy2Vec. Third, we …
Quantifying Human Biological Age: A Machine Learning Approach, Syed Ashiqur Rahman
Graduate Theses, Dissertations, and Problem Reports
Quantifying human biological age is an important and difficult challenge. Different biomarkers and numerous approaches have been studied for biological age prediction, each with its advantages and limitations. In this work, we first introduce a new anthropometric measure (called Surface-based Body Shape Index, SBSI) that accounts for both body shape and body size, and evaluate its performance as a predictor of all-cause mortality. We analyzed data from the National Health and Nutrition Examination Survey (NHANES). Based on the analysis, we introduce a new body shape index constructed from four important anthropometric determinants of body shape and body size: body …
Offline And Online Density Estimation For Large High-Dimensional Data, Aref Majdara
Dissertations, Master's Theses and Master's Reports
Density estimation has wide applications in machine learning and data analysis techniques including clustering, classification, multimodality analysis, bump hunting and anomaly detection. In high-dimensional space, the sparsity of data in local neighborhoods makes many parametric and nonparametric density estimation methods inefficient.
This work presents the development of computationally efficient algorithms for high-dimensional density estimation, based on Bayesian sequential partitioning (BSP). A copula transform is used to separate the estimation of marginal and joint densities, with the purpose of reducing the computational complexity and estimation error. Using this separation, a parallel implementation of the density estimation algorithm on a 4-core CPU is …
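The idea of separating marginal from joint structure can be sketched with a rank-based (empirical) copula transform. This is a minimal sketch that assumes the empirical variant; the excerpt does not say which transform the dissertation uses, and the sketch does not implement BSP. The function names and sample data are hypothetical.

```python
def copula_transform(column):
    """Map values to pseudo-observations rank / (n + 1), which lie in (0, 1).

    Ties are broken by order of appearance because sorted() is stable.
    """
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    ranks = [0] * n
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return [r / (n + 1) for r in ranks]


def transform_dataset(rows):
    """Apply the transform to each column (feature) independently."""
    columns = list(zip(*rows))
    transformed = [copula_transform(list(col)) for col in columns]
    return [list(row) for row in zip(*transformed)]


# Three samples with two features on very different scales.
data = [[2.0, 30.0], [0.5, 10.0], [9.0, 20.0]]
uniform_data = transform_dataset(data)
print(uniform_data)
```

After the transform each feature is approximately uniform on (0, 1), so the marginal distributions can be estimated separately, one dimension at a time. What remains in the transformed data is the dependence between features.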
AdaFT: A Resource-Efficient Framework For Adaptive Fault-Tolerance In Cyber-Physical Systems, Ye Xu
Doctoral Dissertations
Cyber-physical systems frequently have to use massive redundancy to meet application requirements for high reliability. While such redundancy is required, it can be activated adaptively, based on the current state of the controlled plant. Most of the time the physical plant is in a state that allows for a lower level of fault-tolerance. Avoiding the continuous deployment of massive fault-tolerance will greatly reduce the workload of CPSs. In this dissertation, we demonstrate a software simulation framework (AdaFT) that can automatically generate the sub-spaces within which our adaptive fault-tolerance can be applied. We also show the theoretical benefits of AdaFT, and …
Graph-Based Regularization In Machine Learning: Discovering Driver Modules In Biological Networks, Xi Gao
Theses and Dissertations
Human curiosity drives us to explore the origins of what makes each of us different. From ancient legends and mythology, through Mendel's laws and the Punnett square, to modern genetic research, we keep returning to this old but eternal question. Thanks to the technological revolution, today's scientists try to answer this question using easily measurable gene expression and other profiling data. However, the exploration can easily get lost in data of growing volume, dimension, noise and complexity. This dissertation is aimed at developing new machine learning methods that take data from different classes as input, augment them with knowledge of feature relationships, …