Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Engineering

PDF

University of Massachusetts Amherst

Theses/Dissertations

Machine learning

Articles 1 - 8 of 8

Full-Text Articles in Entire DC Network

Predicting Water Quality Vulnerability Under Climate Change With Machine Learning, Khanh Thi Nhu Nguyen Oct 2022

Doctoral Dissertations

Water quality deterioration is a global and pervasive issue due to pollution caused by industrialization, urbanization, agricultural expansion, and human population growth in the modern era. This issue is even more challenging in the context of climate change due to warming temperatures and the intensification of precipitation. Therefore, assessing the potential impacts of climate change on water quality is a concern. Assessment is necessary so that planners can prepare for and reduce the negative impacts on water quality. At present, climate change impact assessment frameworks are relatively immature. Most studies rely on climate projections from General Circulation Models for simulations of …


Data Parallel Frameworks For Training Machine Learning Models, Guoyi Zhao Jun 2022

Doctoral Dissertations

Machine learning is the study of computer algorithms that focuses on analyzing and interpreting patterns and structures in data. It has been successfully applied to many areas in computer science and has achieved state-of-the-art results, enabling learning, reasoning, and decision-making without human interaction. This research aims to develop innovative data parallel frameworks that accommodate the available computing resources, parallelize different machine learning and deep learning algorithms, and speed up training. To achieve that, we explore three interesting frameworks in this dissertation: (1) Sync-on-the-fly framework for gradient descent algorithms on transient resources; (2) Asynchronous Proactive Data Parallel framework for both …


Models And Machine Learning Techniques For Improving The Planning And Operation Of Electricity Systems In Developing Regions, Santiago Correa Cardona Jun 2022

Doctoral Dissertations

The enormous innovation in computational intelligence has disrupted the traditional ways we solve the main problems of our society and allowed us to make more data-informed decisions. Energy systems and the ways we deliver electricity are no exception to this trend: cheap and pervasive sensing systems and new communication technologies have enabled the collection of large amounts of data that are being used to monitor and predict the behavior of this infrastructure in real time. Bringing intelligence to the power grid creates many opportunities to integrate new renewable energy sources more efficiently, facilitate grid planning and expansion, improve reliability, optimize electricity …


Benchmarking Small-Dataset Structure-Activity-Relationship Models For Prediction Of Wnt Signaling Inhibition, Mahtab Kokabi Oct 2021

Masters Theses

Quantitative structure-activity relationship (QSAR) models based on machine learning algorithms are powerful tools to expedite drug discovery processes and therapeutics development. Given the cost of acquiring large training datasets, it is useful to examine whether QSAR analysis can reasonably predict drug activity with only a small dataset (size < 100) and to benchmark these small-dataset QSAR models in application-specific studies. To this end, here we present a systematic benchmarking study on small-dataset QSAR models built for prediction of effective Wnt signaling inhibitors, which are essential to therapeutics development in prevalent human diseases (e.g., cancer). Specifically, we examined a total of 72 two-dimensional (2D) QSAR models based on 4 best-performing algorithms, 6 commonly used molecular fingerprints, and 3 typical fingerprint lengths. We trained these models using a training dataset (56 compounds), benchmarked their performance on 4 figures-of-merit (FOMs), and examined their prediction accuracy using an external validation dataset (14 compounds). Our data show that model performance is maximized when: 1) molecular fingerprints are selected to provide sufficient, unique, and not overly detailed representations of the chemical structures of drug compounds; 2) algorithms are selected to reduce the number of false predictions due to class imbalance in the dataset; and 3) models are selected to reach balanced performance on all 4 FOMs. These results may provide general guidelines in developing high-performance small-dataset QSAR models for drug activity prediction.
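
The model count quoted in the abstract above follows from crossing the three design factors. The short Python sketch below only checks that arithmetic. It is not the author's pipeline. The abstract names none of the algorithms, fingerprints, or fingerprint lengths, so the labels are hypothetical placeholders.

```python
from itertools import product

# Placeholder labels: the abstract gives only the counts, not the names.
algorithms = [f"algorithm_{i}" for i in range(1, 5)]      # 4 algorithms
fingerprints = [f"fingerprint_{i}" for i in range(1, 7)]  # 6 molecular fingerprints
lengths = [f"length_{i}" for i in range(1, 4)]            # 3 fingerprint lengths

# One candidate model per (algorithm, fingerprint, length) combination.
model_grid = list(product(algorithms, fingerprints, lengths))
n_models = len(model_grid)

# Dataset sizes as stated in the abstract.
n_train, n_validation = 56, 14
n_compounds = n_train + n_validation
validation_fraction = n_validation / n_compounds

print(n_models, n_compounds, validation_fraction)
```

Under these counts the grid contains 72 configurations, and the 14 external validation compounds make up one fifth of the 70 compounds in total.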


Sundown: Model-Driven Per-Panel Solar Anomaly Detection For Residential Arrays, Menghong Feng Jul 2020

Masters Theses

There has been significant growth in both utility-scale and residential-scale solar installations in recent years, driven by rapid technology improvements and falling prices. Unlike utility-scale solar farms that are professionally managed and maintained, smaller residential-scale installations often lack sensing and instrumentation for performance monitoring and fault detection. As a result, faults may go undetected for long periods of time, resulting in generation and revenue losses for the homeowner. In this thesis, we present SunDown, a sensorless approach designed to detect per-panel faults in residential solar arrays. SunDown does not require any new sensors for its fault detection and …


Autoplug: An Automated Metadata Service For Smart Outlets, Lurdh Pradeep Reddy Ambati Oct 2017

Masters Theses

Low-cost network-connected smart outlets are now available for monitoring, controlling, and scheduling the energy usage of electrical devices. As a result, such smart outlets are being integrated into automated home management systems, which remotely control them by analyzing and interpreting their data. However, to effectively interpret data and control devices, the system must know the type of device that is plugged into each smart outlet. Existing systems require users to manually input and maintain the outlet metadata that associates a device type with a smart outlet. Such manual operation is time-consuming and error-prone: users must initially inventory all outlet-to-device mappings, …


Automatic Development And Adaptation Of Concise Nonlinear Models For System Identification, William G. La Cava Nov 2016

Doctoral Dissertations

Mathematical descriptions of natural and man-made processes are the bedrock of science, used by humans to understand, estimate, predict and control the natural and built world around them. The goal of system identification is to enable the inference of mathematical descriptions of the true behavior and dynamics of processes from their measured observations. The crux of this task is the identification of the dynamic model form (topology) in addition to its parameters. Model structures must be concise to offer insight to the user about the process in question. To that end, this dissertation proposes three methods to improve the ability …


Universal Schema For Knowledge Representation From Text And Structured Data, Limin Yao Mar 2015

Doctoral Dissertations

In data integration we transform information from a source into a target schema. A general problem in this task is loss of fidelity and coverage: the source expresses more knowledge than can fit into the target schema, or knowledge that is hard to fit into any schema at all. This problem is taken to an extreme in information extraction (IE), where the source is natural language, one of the most expressive forms of knowledge representation. To address this issue, one can either automatically learn a latent schema emergent in text (a brittle and ill-defined task), or manually define schemas. …