Open Access. Powered by Scholars. Published by Universities.®

Theory and Algorithms Commons

Articles 1 - 20 of 20

Full-Text Articles in Theory and Algorithms

Towards Long-Term Fairness In Sequential Decision Making, Yaowei Hu Dec 2023

Graduate Theses and Dissertations

With the development of artificial intelligence, automated decision-making systems are increasingly integrated into applications such as hiring, loans, education, and recommendation systems. These machine learning algorithms are expected to make decisions faster, more accurately, and more impartially than human judgment. Nevertheless, these expectations are not always met in practice because biased training data can lead to discriminatory outcomes. There is now broad social consensus on countering discrimination, and the EU and the US have enacted laws and regulations that prohibit discrimination based on factors such as gender, age, race, and religion. Consequently, addressing algorithmic discrimination has …


A Novel Approach To Extending Music Using Latent Diffusion, Keon Roohparvar, Franz J. Kurfess Jun 2023

Master's Theses

Using deep learning to synthetically generate music is a research domain that has gained more public attention in the past few years. A subproblem of music generation is music extension, the task of taking existing music and extending it. This work proposes the Continuer Pipeline, a novel technique that uses deep learning to take music and extend it in 5-second increments. It does this by treating the musical generation process as an image generation problem; we utilize latent diffusion models (LDMs) to generate spectrograms, which are image representations of music. The Continuer Pipeline is able to …


A Machine Learning Approach For Predicting Clinical Trial Patient Enrollment In Drug Development Portfolio Demand Planning, Ahmed Shoieb May 2023

Masters Theses

One of the biggest challenges the clinical research industry currently faces is the accurate forecasting of patient enrollment (namely, if and when a clinical trial will achieve full enrollment), as the stochastic behavior of enrollment can significantly contribute to delays in the development of new drugs, increases in the duration and cost of clinical trials, and the over- or underestimation of clinical supply. This study proposes a Machine Learning model using a Fully Convolutional Network (FCN) that is trained on a dataset of 100,000 patient enrollment data points including patient age, patient gender, patient disease, investigational product, study phase, blinded …


ChatGPT As Metamorphosis Designer For The Future Of Artificial Intelligence (AI): A Conceptual Investigation, Amarjit Kumar Singh (Library Assistant), Dr. Pankaj Mathur (Deputy Librarian) Mar 2023

Library Philosophy and Practice (e-journal)

Abstract

Purpose: This research paper explores ChatGPT’s potential as an innovative design tool for the future development of artificial intelligence. Specifically, this conceptual investigation analyzes ChatGPT’s capabilities as a tool for designing and developing near-human intelligent systems for future use in the field of Artificial Intelligence (AI). The paper also analyzes the strengths and weaknesses of ChatGPT as a tool and identifies possible areas for improvement in its development and implementation. This investigation focused on the various features and functions of ChatGPT that …


Liquid Tab, Nathan Hulet Jan 2023

Williams Honors College, Honors Research Projects

Guitar transcription is a complex task requiring significant time, skill, and musical knowledge to achieve accurate results. Since most music is recorded and processed digitally, it would seem like many tools to digitally analyze and transcribe the audio would be available. However, the problem of automatic transcription presents many more difficulties than are initially evident. There are multiple ways to play a guitar, many diverse styles of playing, and every guitar sounds different. These problems become even more difficult considering the varying qualities of recordings and levels of background noise.

Machine learning has proven itself to be a flexible tool …


A Machine Learning And Deep Learning Framework For Binary, Ternary, And Multiclass Emotion Classification Of COVID-19 Vaccine-Related Tweets, Aditya Dubey May 2022

Honors Scholar Theses

My research mines public emotion toward the Covid-19 vaccine based on Twitter data collected over the past 6-12 months. This project is centered around building and developing machine learning and deep learning models to perform natural language processing of short-form text, which in our case means tweets. All of these tweets are vaccine-related, and the goal of the classification task is for our models to accurately classify a tweet into one of four emotion groups: Apprehension/Anticipation, Sadness/Anger/Frustration, Joy/Humor/Sarcasm, and Gratitude/Relief. Given this data and the goal of the paper, we aim to answer the following questions: (1) Can a framework be …


The Executive’s Guide To Getting AI Wrong, Jerrold Soh May 2022

Asian Management Insights

It’s all math. Really.


Information Extraction And Classification On Journal Papers, Lei Yu Nov 2021

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

The importance of journals for diffusing the results of scientific research has increased considerably. In the digital era, Portable Document Format (PDF) became the established format of electronic journal articles. This structured form, combined with regular and wide dissemination, spreads scientific advancements easily and quickly. However, the rapidly increasing number of published scientific articles means that systematic literature reviews, searches, and screening require more time and effort. Comprehending and extracting useful information from these digital documents is also challenging because of the complex structure of PDF.

To help a soil science team from the United States …


Machine Learning With Topological Data Analysis, Ephraim Robert Love May 2021

Doctoral Dissertations

Topological Data Analysis (TDA) is a relatively new focus in the fields of statistics and machine learning. Methods of exploiting the geometry of data, such as clustering, have proven theoretically and empirically invaluable. TDA provides a general framework within which to study topological invariants (shapes) of data, which are more robust to noise and can recover information on higher dimensional features than immediately apparent in the data. A common tool for conducting TDA is persistent homology, which measures the significance of these invariants. Persistent homology has prominent realizations in methods of data visualization, statistics and machine learning. Extending ML with …


Machine Learning Approaches To Dribble Hand-Off Action Classification With SportVU NBA Player Coordinate Data, Dembe Stephanos May 2021

Electronic Theses and Dissertations

Recently, strategies of National Basketball Association teams have evolved with the skillsets of players and the emergence of advanced analytics. One of the most effective actions in dynamic offensive strategies in basketball is the dribble hand-off (DHO). This thesis proposes an architecture for a classification pipeline for detecting DHOs in an accurate and automated manner. This pipeline consists of a combination of player tracking data and event labels, a rule set to identify candidate actions, manually reviewing game recordings to label the candidates, and embedding player trajectories into hexbin cell paths before passing the completed training set to the classification …


Using Torchattacks To Improve The Robustness Of Models With Adversarial Training, William S. Matos Díaz Jan 2021

Cybersecurity: Deep Learning Driven Cybersecurity Research in a Multidisciplinary Environment

Adversarial training has proven to be one of the most successful ways to defend models against adversarial examples. This process consists of training a model with adversarial examples to improve the robustness of the model. In this experiment, Torchattacks, a PyTorch library made for importing adversarial examples more easily, was used to determine which attack was the strongest. Later on, the strongest attack was used to train the model and make it more robust against adversarial examples. The datasets used to perform the experiments were MNIST and CIFAR-10. Both datasets were put to the test using PGD, FGSM, and …
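
As a rough illustration of one attack named in this abstract, the sketch below applies a single FGSM step to a tiny logistic-regression model in plain Python. It does not use Torchattacks or PyTorch, and the weights, input, and `eps` value are hypothetical numbers chosen only for the example. FGSM perturbs the input by `eps` in the direction of the sign of the loss gradient with respect to the input; adversarial training would then add such perturbed inputs to the training data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    # Returns -1, 0, or 1.
    return (v > 0) - (v < 0)


def predict(w, b, x):
    # Probability of class 1 under a logistic-regression model.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def loss(w, b, x, y):
    # Binary cross-entropy for a single example with label y in {0, 1}.
    p = predict(w, b, x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def fgsm(w, b, x, y, eps):
    # For logistic regression, d(loss)/dx = (p - y) * w.
    p = predict(w, b, x)
    grad_x = [(p - y) * wi for wi in w]
    # One FGSM step: move each coordinate by eps along the gradient sign.
    return [xi + eps * sign(g) for xi, g in zip(x, grad_x)]


# Hypothetical model and input, for illustration only.
w, b = [2.0, -1.0], 0.0
x, y = [1.0, 0.5], 1
x_adv = fgsm(w, b, x, y, eps=0.25)
```

In this toy setting the perturbed input `x_adv` has a higher loss than `x`, which is the property an attack is meant to have.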


Random Search Plus: A More Effective Random Search For Machine Learning Hyperparameters Optimization, Bohan Li Dec 2020

Masters Theses

Machine learning hyperparameter optimization has always been key to improving model performance. There are many methods of hyperparameter optimization; popular ones include grid search, random search, manual search, Bayesian optimization, and population-based optimization. Random search requires less computation than grid search, but at a cost in accuracy. This thesis proposes a more effective random search method, named random search plus, based on traditional random search and hyperparameter space separation. It shows empirically that random search plus is more effective than random search. There are some case …
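
For orientation, the traditional random search baseline mentioned in this abstract can be sketched as follows. This is not the thesis's random search plus method, whose details are not given here; the objective function, parameter names, and ranges are hypothetical stand-ins for a real validation score.

```python
import random


def random_search(objective, space, n_trials, seed=0):
    """Plain random search: sample every hyperparameter uniformly from its range."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    # Hypothetical stand-in for validation accuracy; peaks at lr=0.1, reg=0.01.
    return -((params["lr"] - 0.1) ** 2) - ((params["reg"] - 0.01) ** 2)


# Hypothetical search space: (low, high) bounds per hyperparameter.
space = {"lr": (0.0, 1.0), "reg": (0.0, 0.1)}
best_params, best_score = random_search(toy_objective, space, n_trials=200)
```

Each trial costs one evaluation of the objective, so the budget is set directly by `n_trials` rather than by the size of a grid.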


Achieving Causal Fairness In Machine Learning, Yongkai Wu May 2020

Graduate Theses and Dissertations

Fairness is a social norm and a legal requirement in today's society. Many laws and regulations (e.g., the Equal Credit Opportunity Act of 1974) have been established to prohibit discrimination and enforce fairness on several grounds, such as gender, age, sexual orientation, race, and religion, referred to as sensitive attributes. Nowadays machine learning algorithms are extensively applied to make important decisions in many real-world applications, e.g., employment, admission, and loans. Traditional machine learning algorithms aim to maximize predictive performance, e.g., accuracy. Consequently, certain groups may get unfairly treated when those algorithms are applied for decision-making. Therefore, it is an imperative …


Machine Learning Pipeline For Exoplanet Classification, George Clayton Sturrock, Brychan Manry, Sohail Rafiqi May 2019

SMU Data Science Review

Planet identification has typically been a task performed exclusively by teams of astronomers and astrophysicists using methods and tools accessible only to those with years of academic education and training. NASA’s Exoplanet Exploration program has introduced modern satellites capable of capturing a vast array of data regarding celestial objects of interest to assist with researching these objects. The availability of satellite data has opened up the task of planet identification to individuals capable of writing and interpreting machine learning models. In this study, several classification models and datasets are utilized to assign a probability of an observation being an exoplanet. …


DISH: Democracy In State Houses, Nicholas A. Russo Feb 2019

Master's Theses

In our current political climate, state level legislators have become increasingly important. Due to cuts in funding and growing focus at the national level, public oversight for these legislators has drastically decreased. This makes it difficult for citizens and activists to understand the relationships and commonalities between legislators. This thesis provides three contributions to address this issue. First, we created a data set containing over 1200 features focused on a legislator’s activity on bills. Second, we created embeddings that represented a legislator’s level of activity and engagement for a given bill using a custom model called Democracy2Vec. Third, we …


Improving VIX Futures Forecasts Using Machine Learning Methods, James Hosker, Slobodan Djurdjevic, Hieu Nguyen, Robert Slater Jan 2019

SMU Data Science Review

The problem of forecasting market volatility is a difficult task for most fund managers. Volatility forecasts are used for risk management, alpha (risk) trading, and the reduction of trading friction. Improving the forecasts of future market volatility assists fund managers in adding or reducing risk in their portfolios as well as in increasing hedges to protect their portfolios in anticipation of a market sell-off event. Our analysis compares three existing financial models that forecast future market volatility using the Chicago Board Options Exchange Volatility Index (VIX) to six machine/deep learning supervised regression methods. This analysis determines which models provide best …


Quantifying Human Biological Age: A Machine Learning Approach, Syed Ashiqur Rahman Jan 2019

Graduate Theses, Dissertations, and Problem Reports

Quantifying human biological age is an important and difficult challenge. Different biomarkers and numerous approaches have been studied for biological age prediction, each with its advantages and limitations. In this work, we first introduce a new anthropometric measure (called Surface-based Body Shape Index, SBSI) that accounts for both body shape and body size, and evaluate its performance as a predictor of all-cause mortality. We analyzed data from the National Health and Nutrition Examination Survey (NHANES). Based on the analysis, we introduce a new body shape index constructed from four important anthropometric determinants of body shape and body size: body …


Offline And Online Density Estimation For Large High-Dimensional Data, Aref Majdara Jan 2018

Dissertations, Master's Theses and Master's Reports

Density estimation has wide applications in machine learning and data analysis techniques including clustering, classification, multimodality analysis, bump hunting and anomaly detection. In high-dimensional space, the sparsity of data in local neighborhoods makes many parametric and nonparametric density estimation methods inefficient.

This work presents the development of computationally efficient algorithms for high-dimensional density estimation, based on Bayesian sequential partitioning (BSP). A copula transform is used to separate the estimation of marginal and joint densities, with the purpose of reducing the computational complexity and estimation error. Using this separation, a parallel implementation of the density estimation algorithm on a 4-core CPU is …


AdaFT: A Resource-Efficient Framework For Adaptive Fault-Tolerance In Cyber-Physical Systems, Ye Xu Nov 2017

Doctoral Dissertations

Cyber-physical systems frequently have to use massive redundancy to meet application requirements for high reliability. While such redundancy is required, it can be activated adaptively, based on the current state of the controlled plant. Most of the time the physical plant is in a state that allows for a lower level of fault-tolerance. Avoiding the continuous deployment of massive fault-tolerance will greatly reduce the workload of CPSs. In this dissertation, we demonstrate a software simulation framework (AdaFT) that can automatically generate the sub-spaces within which our adaptive fault-tolerance can be applied. We also show the theoretical benefits of AdaFT, and …


Graph-Based Regularization In Machine Learning: Discovering Driver Modules In Biological Networks, Xi Gao Jan 2015

Theses and Dissertations

The curiosity of human nature drives us to explore the origins of what makes each of us different. From ancient legends and mythology, through Mendel's laws and the Punnett square, to modern genetic research, we carry on this old but eternal question. Thanks to the technological revolution, today's scientists try to answer this question using easily measurable gene expression and other profiling data. However, the exploration can easily get lost in data of growing volume, dimension, noise and complexity. This dissertation is aimed at developing new machine learning methods that take data from different classes as input, augment them with knowledge of feature relationships, …