Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Machine Learning

Articles 1 - 30 of 578

Full-Text Articles in Physical Sciences and Mathematics

Leveraging A Machine Learning Based Predictive Framework To Study Brain-Phenotype Relationships, Sage Hahn Jan 2023

Graduate College Dissertations and Theses

An immense collective effort has been put towards the development of methods for quantifying brain activity and structure. In parallel, a similar effort has focused on collecting experimental data, resulting in ever-growing data banks of complex human in vivo neuroimaging data. Machine learning, a broad set of powerful and effective tools for identifying multivariate relationships in high-dimensional problem spaces, has proven to be a promising approach toward better understanding the relationships between the brain and different phenotypes of interest. However, applied machine learning within a predictive framework for the study of neuroimaging data introduces several domain-specific problems and considerations, leaving the …


Machine Learning Predictions Of Electricity Capacity, Marcus Harris, Elizabeth Kirby, Ameeta Agrawal, Rhitabrat Pokharel, Francis Puyleart, Martin Zwick Jan 2023

Systems Science Faculty Publications and Presentations

This research applies machine learning methods to build predictive models of Net Load Imbalance for the Resource Sufficiency Flexible Ramping Requirement in the Western Energy Imbalance Market. Several methods are used in this research, including Reconstructability Analysis, developed in the systems community, and more well-known methods such as Bayesian Networks, Support Vector Regression, and Neural Networks. The aims of the research are to identify predictive variables and obtain a new stand-alone model that improves prediction accuracy and reduces the INC (ability to increase generation) and DEC (ability to decrease generation) Resource Sufficiency Requirements for Western Energy Imbalance Market participants. This …


Application Of Sentiment Analysis And Machine Learning Techniques To Predict Daily Cryptocurrency Price Returns, Edward Wu Jan 2023

CMC Senior Theses

This paper examines the effects of social media sentiment relating to Bitcoin on the daily price returns of Bitcoin and other popular cryptocurrencies by utilizing sentiment analysis and machine learning techniques to predict daily price returns. Many investors think that social media sentiment affects cryptocurrency prices. However, the results of this paper find that social media sentiment relating to Bitcoin does not add significant predictive value to forecasting daily price returns for each of the six cryptocurrencies used for analysis and that machine learning models that do not assume linearity between the current day price return and previous daily price …


Integrated Machine Learning And Optimization Approaches, Dogacan Yilmaz Dec 2022

Dissertations

This dissertation focuses on the integration of machine learning and optimization. Specifically, novel machine learning-based frameworks are proposed to help solve a broad range of well-known operations research problems to reduce the solution times. The first study presents a bidirectional Long Short-Term Memory framework to learn optimal solutions to sequential decision-making problems. Computational results show that the framework significantly reduces the solution time of benchmark capacitated lot-sizing problems without much loss in feasibility and optimality. Also, models trained using shorter planning horizons can successfully predict the optimal solution of the instances with longer planning horizons. For the hardest data set, …


Investigating Applications Of Deep Learning For Diagnosis Of Post Traumatic Elbow Disease, Hugh James Dec 2022

McKelvey School of Engineering Theses & Dissertations

Traumatic events such as dislocation, breaks, and arthritis of musculoskeletal joints can cause the development of post-traumatic joint contracture (PTJC). Clinically, noninvasive techniques such as Magnetic Resonance Imaging (MRI) scans are used to analyze the disease. Such procedures require a patient to sit sedentary for long periods of time and can be expensive as well. Additionally, years of practice and experience are required for clinicians to accurately recognize the diseased anterior capsule region and make an accurate diagnosis. Manual tracing of the anterior capsule is done to help with diagnosis but is subjective and time-consuming. As a result, there is …


Artificial Intelligence In The Medical Field: Medical Review Sentiment Analysis, Nicholas Podlesak Dec 2022

Honors Capstones

In this research project, natural language processing techniques’ ability to accurately classify medical text was measured to reinforce the relevance of artificial intelligence in the medical field. Sentiment analyses (analyses to determine whether the text was positive or negative) were performed on the prescription drug reviews in an open-source dataset using four different models: lexical, a neural network, a support vector machine, and a logistic regression model. Each model’s effectiveness was gauged by its ability to correctly classify unlabeled drug reviews (i.e., a percentage representing accuracy). The machine learning models were able to accurately classify the text, while the lexical …
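As a rough illustration of the evaluation described above, the sketch below pairs a toy lexical classifier with an accuracy percentage. The word lists, the sample reviews, and the function names `lexical_sentiment` and `accuracy_percent` are hypothetical and are not taken from the project or its dataset.

```python
# Hypothetical word lists; a real lexicon would be far larger.
POSITIVE = {"effective", "relief", "great", "helped"}
NEGATIVE = {"worse", "nausea", "terrible", "useless"}


def lexical_sentiment(review: str) -> str:
    """Label a review by counting positive and negative lexicon hits."""
    words = review.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    # Ties are treated as positive in this sketch (an arbitrary choice).
    return "positive" if score >= 0 else "negative"


def accuracy_percent(predictions: list[str], labels: list[str]) -> float:
    """Share of predictions that match the true labels, as a percentage."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return 100.0 * correct / len(labels)


# Made-up reviews with made-up labels, for illustration only.
reviews = [
    "great relief after one week",
    "terrible nausea and felt worse",
    "it helped but caused nausea",
]
labels = ["positive", "negative", "negative"]

preds = [lexical_sentiment(r) for r in reviews]
acc = accuracy_percent(preds, labels)
print(preds, acc)
```

The third review shows a typical weakness of pure word counting: mixed wording cancels out, so the tie-breaking rule decides the label.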


LearnFCA: A Fuzzy FCA And Probability Based Approach For Learning And Classification, Suraj Ketan Samal Dec 2022

Computer Science and Engineering: Theses, Dissertations, and Student Research

Formal concept analysis (FCA) is a mathematical theory based on lattice and order theory used for data analysis and knowledge representation. Over the past several years, many of its extensions have been proposed and applied in several domains including data mining, machine learning, knowledge management, semantic web, software development, chemistry, biology, medicine, data analytics, and ontology engineering.

This thesis reviews the state of the art of the theory of Formal Concept Analysis (FCA) and its various extensions that have been developed and well studied in the past several years. We discuss their historical roots and reproduce the original definitions and derivations with illustrative examples. Further, we provide …


Data From: Machine Learning Predictions Of Electricity Capacity, Marcus Harris, Elizabeth Kirby, Ameeta Agrawal, Rhitabrat Pokharel, Francis Puyleart, Martin Zwick Dec 2022

Systems Science Faculty Datasets

This research applies machine learning methods to build predictive models of Net Load Imbalance for the Resource Sufficiency Flexible Ramping Requirement in the Western Energy Imbalance Market. Several methods are used in this research, including Reconstructability Analysis, developed in the systems community, and more well-known methods such as Bayesian Networks, Support Vector Regression, and Neural Networks. The aims of the research are to identify predictive variables and obtain a new stand-alone model that improves prediction accuracy and reduces the INC (ability to increase generation) and DEC (ability to decrease generation) Resource Sufficiency Requirements for Western Energy Imbalance Market participants. This …


Transition Metal Computational Catalysis: Mechanistic Approaches And Development Of Novel Performance Metrics, Brett Anthony Smith Dec 2022

Doctoral Dissertations

Computational catalysis is an ever-growing field, thanks in part to the incredible progression of computational power and the efficiency offered by our current methodologies. Additionally, the accuracy of computation and the emergence of new methods that can decompose energetics and sterics into quantitative descriptors have allowed researchers to begin to identify important structure-function relationships that predict the properties of unexplored subspaces within the overall chemical space. Catalytic descriptors have been used frequently in data-driven high-throughput computational screenings. With the use of machine learning, a large portion of the chemical space can be predicted in a matter of minutes or …


Enhancing The Performance Of The MTCNN For The Classification Of Cancer Pathology Reports: From Data Annotation To Model Deployment, Kevin De Angeli Dec 2022

Doctoral Dissertations

Information contained in electronic health records (EHR), combined with the latest advances in machine learning (ML), has the potential to revolutionize the medical sciences. In particular, information contained in cancer pathology reports is essential to investigate cancer trends across the country. Unfortunately, large parts of information in EHRs are stored in the form of unstructured free text, which limits their usability and research potential. To overcome this accessibility barrier, cancer registries depend on expert personnel who read, interpret, and extract relevant information. Naturally, as the number of stored pathology reports increases every day, depending on human experts presents scalability challenges. Recently, …


Identity Term Sampling For Measuring Gender Bias In Training Data, Nasim Sobhani, Sarah Jane Delany Prof Dec 2022

Conference Papers

Predictions from machine learning models can reflect biases in the data on which they are trained. Gender bias has been identified in natural language processing systems such as those used for recruitment. The development of approaches to mitigate gender bias in training data typically needs to be able to isolate the effect of gender on the output to see the impact of gender. While it is possible to isolate and identify gender for some types of training data, e.g. CVs in recruitment, for most textual corpora there is no obvious gender label. This paper proposes a general approach to measure …


Design Of Environment Aware Planning Heuristics For Complex Navigation Objectives, Carter D. Bailey Dec 2022

All Graduate Theses and Dissertations

A heuristic is a simplified approximation that helps guide a planner in deducing the best way to move forward. Heuristics are valued in many modern AI algorithms and decision-making architectures due to their ability to drastically reduce computation time. Particularly in robotics, path planning heuristics are widely leveraged to aid in navigation and exploration. As the robotic platform explores and navigates, information about the world can and should be used to augment and update the heuristic to guide solutions. Complex heuristics that can account for environmental factors, robot capabilities, and desired actions provide optimal results with little wasted exploration, but …


Adaptive Fairness Improvement Based Causality Analysis, Mengdi Zhang, Jun Sun Nov 2022

Research Collection School Of Computing and Information Systems

Given a discriminating neural network, the problem of fairness improvement is to systematically reduce discrimination without significantly sacrificing its performance (i.e., accuracy). Multiple categories of fairness improving methods have been proposed for neural networks, including pre-processing, in-processing and post-processing. Our empirical study however shows that these methods are not always effective (e.g., they may improve fairness by paying the price of a huge accuracy drop) or even not helpful (e.g., they may even worsen both fairness and accuracy). In this work, we propose an approach which adaptively chooses the fairness improving method based on causality analysis. That is, we choose the …


Overview Of The CLPsych 2022 Shared Task: Capturing Moments Of Change In Longitudinal User Posts, Adam Tsakalidis, Jenny Chim, Iman Munire Bilal, Ayah Zirikly, Dana Atzil-Slonim, Federico Nanni, Philip Resnik, Manas Gaur, Kaushik Roy, Becky Inkster, Jeff Leintz, Maria Liakata Oct 2022

Publications

We provide an overview of the CLPsych 2022 Shared Task, which focusses on the automatic identification of Moments of Change in longitudinal posts by individuals on social media and its connection with information regarding mental health. This year's task introduced the notion of longitudinal modelling of the text generated by an individual online over time, along with appropriate temporally sensitive evaluation metrics. The Shared Task consisted of two subtasks: (a) the main task of capturing changes in an individual's mood (drastic changes, 'Switches', and gradual changes, 'Escalations') on the basis of textual content shared online; and subsequently (b) the sub-task …


Machine Learning To Predict Warhead Fragmentation In-Flight Behavior From Static Data, Katharine Larsen Oct 2022

Doctoral Dissertations and Master's Theses

Accurate characterization of fragment fly-out properties from high-speed warhead detonations is essential for estimation of collateral damage and lethality for a given weapon. Real warhead dynamic detonation tests are rare, costly, and often unrealizable with current technology, leaving fragmentation experiments limited to static arena tests and numerical simulations. Stereoscopic imaging techniques can now provide static arena tests with time-dependent tracks of individual fragments, each with characteristics such as fragment IDs and their respective position vector. Simulation methods can account for the dynamic case but can exclude relevant dynamics experienced in real-life warhead detonations. This research leverages machine learning methodologies to …


Predicting Mental Health Crisis In Veterans: Early Warning Signs, Precursors And Protective Factors, Priyanka Annapureddy Oct 2022

Dissertations (1934 -)

Mental Health (MH) conditions have recently increased to a large extent due to socio-demographic changes. Posttraumatic Stress Disorder (PTSD) is one of the most common mental health disorders in the US. More troubling still, PTSD occurs at double the rate in combat veterans leaving their service compared to the general population. Severity of PTSD is associated with risk-taking behaviors such as substance abuse, non-suicidal self-injury, and sexual risk behaviors. Psychological disorders are often preceded by early warning signs, and recognizing the early warning signs of PTSD will help in preventing the return or worsening of PTSD symptoms. Ecological momentary assessment …


Classification Of Pixel Tracks To Improve Track Reconstruction From Proton-Proton Collisions, Kebur Fantahun, Jobin Joseph, Halle Purdom, Nibhrat Lohia Sep 2022

SMU Data Science Review

In this paper, machine learning techniques are used to reconstruct particle collision pathways. CERN (Conseil européen pour la recherche nucléaire) uses a massive underground particle collider, called the Large Hadron Collider or LHC, to produce particle collisions at extremely high speeds. There are several layers of detectors in the collider that track the pathways of particles as they collide. The data produced from collisions contains an excessive amount of background noise, i.e., decays from known particle collisions produce fake signals. Particularly, in the first layer of the detector, the pixel tracker, there is an overwhelming amount of background noise that …


A GPU-Based Machine Learning Approach For Detection Of Botnet Attacks, Michal Motylinski, Áine Macdermott, Farkhund Iqbal, Babar Shah Sep 2022

All Works

Rapid development and adoption of the Internet of Things (IoT) has created new problems for securing these interconnected devices and networks. There are hundreds of thousands of IoT devices with underlying security vulnerabilities, such as insufficient device authentication/authorisation, making them vulnerable to malware infection. IoT botnets are designed to grow and compete with one another over unsecure devices and networks. Once infected, the device will monitor a Command-and-Control (C&C) server indicating the target of an attack via Distributed Denial of Service (DDoS) attack. These security issues, coupled with the continued growth of IoT, present a much larger attack surface for …


Data Preprocessing For Machine Learning Modules, Rawan El Moghrabi Aug 2022

Undergraduate Student Research Internships Conference

Data preprocessing is an essential step when building machine learning solutions. It significantly impacts the success of machine learning modules and the output of these algorithms. Typically, data preprocessing is made up of data sanitization, feature engineering, normalization, and transformation. This paper outlines the data preprocessing methodology implemented for a data-driven predictive maintenance solution. The above-mentioned project entails acquiring historical electrical data from industrial assets and creating a health index indicating each asset's remaining useful life. This solution is built using machine learning algorithms and requires several data processing steps to increase the solution's accuracy and efficiency. In this project, the …
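Two of the preprocessing stages named above, sanitization and normalization, can be sketched in a few lines. This is a minimal illustration under assumed inputs: the sample voltage readings and the helper names `sanitize` and `min_max_normalize` are hypothetical, and the paper's actual pipeline may differ.

```python
from typing import Optional


def sanitize(readings: list[Optional[float]]) -> list[float]:
    """Drop missing (None) and NaN readings; NaN is the only float not equal to itself."""
    return [r for r in readings if r is not None and r == r]


def min_max_normalize(values: list[float]) -> list[float]:
    """Rescale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sensor readings with gaps, for illustration only.
raw = [230.0, None, 240.0, float("nan"), 250.0]
clean = sanitize(raw)
scaled = min_max_normalize(clean)
print(clean, scaled)
```

Min-max scaling is only one possible normalization; standardization to zero mean and unit variance is a common alternative when outliers are a concern.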


Cyberbullying Detection Using Weakly Supervised And Fully Supervised Learning, Abhinav Abhishek Aug 2022

ETD Archive

Machine learning is a very useful tool to solve issues in multiple domains such as sentiment analysis, fake news detection, facial recognition, and cyberbullying. In this work, we have leveraged its ability to understand the nuances of natural language to detect cyberbullying. We have further utilized it to detect the subject of cyberbullying such as age, gender, ethnicity, and religion. Further, we have built another layer to detect the cases of misogyny in cyberbullying. In one of our experiments, we created a three-layered architecture to detect cyberbullying, then to detect if it is gender based and finally if it …


State-Based Biological Communication, Nathan Clement Aug 2022

All Theses

Allostery (1) is the process through which proteins self-regulate in response to various stimuli. Allosteric interactions occur between nonadjacent spatially distant residues (1), and they are exhibited through the correlated motions (2) and momenta of participating residues. The location of allosteric sites in proteins can be determined experimentally but computational methods to predict the location of allosteric sites are being developed as well (2-4, 10). Experimental and computational methodologies for locating allosteric sites can be used to design specific targeted drug delivery (5-6, 19), but these methods have not yet …


Predicting Order Status Using XGBoost, Kegan J. Penovich Aug 2022

All Graduate Plan B and other Reports

Invista, a Koch subsidiary, is a multinational producer of fibers, resins, and intermediaries, particularly nylon. Keeping the company operating required it to take over 1.5 million orders over the course of - years, less than a third of which arrived on-time. Orders arriving other than when expected can cause many problems for any company. While arriving late is a clear problem, it is also troublesome for them to arrive early. In the face of this, it becomes important to be able to tell a priori if an order will arrive on-time or not.

To address this problem, we made use of …


Deep Learning For Detecting Trees In The Urban Environment From LIDAR, Julian R. Rice Aug 2022

Master's Theses

Cataloguing and classifying trees in the urban environment is a crucial step in urban and environmental planning. However, manual collection and maintenance of this data is expensive and time-consuming. Algorithmic approaches that rely on remote sensing data have been developed for tree detection in forests, though they generally struggle in the more varied urban environment. This work proposes a novel method for the detection of trees in the urban environment that applies deep learning to remote sensing data. Specifically, we train a PointNet-based neural network to predict tree locations directly from LIDAR data augmented with multi-spectral imaging. We compare this …


Computational Models To Detect Radiation In Urban Environments: An Application Of Signal Processing Techniques And Neural Networks To Radiation Data Analysis, Jose Nicolas Gachancipa Jul 2022

Beyond: Undergraduate Research Journal

Radioactive sources, such as uranium-235, are nuclides that emit ionizing radiation, and which can be used to build nuclear weapons. In public areas, the presence of a radioactive nuclide can present a risk to the population, and therefore, it is imperative that threats are identified by radiological search and response teams in a timely and effective manner. In urban environments, such as densely populated cities, radioactive sources may be more difficult to detect, since background radiation produced by surrounding objects and structures (e.g., buildings, cars) can hinder the effective detection of unnatural radioactive material. This article presents a computational model …


An Empirical Study Towards An Automatic Phishing Attack Detection Using Ensemble Stacking Model, Mahmoud Othman, Hesham Hassan Jul 2022

Future Computing and Informatics Journal

Phishing attacks have become one of the most common attacks facing internet users, especially after the COVID-19 pandemic, as most organizations have moved part or most of their work and communication online using well-known tools, like email, Zoom, WebEx, etc. Therefore, cyber phishing attacks have become progressively recent, directly and frankly reflecting the designated website, allowing the attacker to observe everything while the victim is exploring Webpages. Hence, utilizing Artificial Intelligence (AI) techniques has become a necessary approach that could be used to detect such attacks automatically. In this paper, we introduce an empirical analysis for automatic phishing detection …


Development Of Software Tools For Efficient And Sustainable Process Development And Improvement, Jake P. Stengel Jun 2022

Theses and Dissertations

Infrastructure is a key component in the well-being of our society that leads to its growth, development, and productive operations. A well-built infrastructure allows the community to be more competitive and promotes economic advancement. In 2021, the ASCE (American Society of Civil Engineers) ranked the American infrastructure as substandard, with an overall grade of C-. The overall ranking suffers when key infrastructure categories are not maintained according to the needs of the population. Therefore, there is a need to consider alternative methods to improve our infrastructure and make it more sustainable to enhance the overall grade. One of the challenges …


Machine Learning With Kay, Lasith Niroshan, James Carswell Jun 2022

Conference Papers

Computational power is very important when training Deep Learning (DL) models with large amounts of data (Wooldridge, 2021). Hence, High-Performance Computing (HPC) can be leveraged to reduce computational cost, and the Irish Centre for High-End Computing (ICHEC) provides significant infrastructure and services for research and development to both academia and industry. A portion of ICHEC's HPC system has been allocated for institutional access, and this paper presents a case study of how to use Kay (Ireland's national supercomputer) in the remote sensing domain. Specifically, this study uses clusters of Kay Graphics Processing Units (GPUs) for training DL models to extract …


An Empirical Study On Sampling Approaches For 3D Image Classification Using Deep Learning, Nicholas Michelette Jun 2022

Theses and Dissertations

A 3D classification method requires more training data than a 2D image classification method to achieve good performance. These training data usually come in the form of multiple 2D images (e.g., slices in a CT scan) or point clouds (e.g., 3D CAD modeling) for volumetric object representation. The amount of data required to complete this higher dimension problem comes with the cost of requiring more processing time and space. This problem can be mitigated with data size reduction (i.e., sampling). In this thesis, we empirically study and compare the classification performance and deep learning training time of PointNet utilizing uniform …
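To make the idea of data size reduction by sampling concrete, here is a minimal sketch that keeps a fixed number of points from a point cloud by uniform random selection. The function name `uniform_sample`, the synthetic cloud, and the seed are assumptions made for illustration; the sampling schemes actually compared in the thesis are not specified in the truncated abstract.

```python
import random

Point = tuple[float, float, float]


def uniform_sample(points: list[Point], k: int, seed: int = 0) -> list[Point]:
    """Return k points drawn uniformly at random without replacement.

    If the cloud already has k or fewer points, it is returned unchanged.
    """
    if k >= len(points):
        return list(points)
    rng = random.Random(seed)  # local generator keeps the result reproducible
    return rng.sample(points, k)


# Synthetic 100-point cloud along a diagonal, for illustration only.
cloud = [(float(i), float(i) * 0.5, float(i) * 0.25) for i in range(100)]
reduced = uniform_sample(cloud, 10)
print(len(cloud), len(reduced))
```

Uniform random sampling is cheap but ignores geometry; schemes such as farthest point sampling trade extra computation for more even spatial coverage.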


A Machine Learning Approach To Revenue Generation Within The Professional Hair Care Industry, Alexander K. Sepenu, Linda Eliasen Jun 2022

SMU Data Science Review

The cosmetic and beauty industry continues to grow and evolve to satisfy its patrons. In the United States, the industry is heavily science-driven, innovative, and fast-paced, suggesting that to remain productive and profitable, companies must seek smart alternatives to their current modus operandi or risk losing out on this multi-billion-dollar industry to fierce competition. In this paper, the authors seek to utilize machine learning models such as clustering and regression to improve the efficiency of current sales and customer segmentation models to help HairCo (pseudonym for confidentiality), a professional hair products manufacturer, strategize their marketing and sales efforts for revenue …


Analysis Of The Electric Power Outage Data And Prediction Of Electric Power Outage For Major Metropolitan Areas In Texas Using Machine Learning And Time Series Methods, Renfeng Wang, Venkata Leela 'Mg' Vanga, Zachary B. Zaiken, Jonathan Bennett Jun 2022

SMU Data Science Review

With growing energy usage, power outages affect millions of households. This case study focuses on gathering power outage historical data, modifying the data to attach weather attributes, and gathering ERCOT energy market conditions for Dallas-Fort Worth and Houston metropolitan areas of Texas. The transformed data is then analyzed using machine learning algorithms including, but not limited to, Regression, Random Forests and XGBoost to consider current weather and ERCOT features and predict power outage percentage for locations. The transformed data is also trained using time series models and serially correlated models including Autoregression and Vector Autoregression. This study also focuses on …