Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 18 of 18
Full-Text Articles in Physical Sciences and Mathematics
Flexible Attenuation Fields: Tomographic Reconstruction From Heterogeneous Datasets, Clifford S. Parker
Theses and Dissertations--Computer Science
Traditional reconstruction methods for X-ray computed tomography (CT) are highly constrained in the variety of input datasets they admit. Many of the imaging settings -- the incident energy, field-of-view, effective resolution -- remain fixed across projection images, and the only real variance is in the detector's position and orientation with respect to the scene. In contrast, methods for 3D reconstruction of natural scenes are extremely flexible to the geometric and photometric properties of the input datasets, readily accepting and benefiting from images captured under varying lighting conditions, with different cameras, and at disparate points in time and space. Extending CT …
Advanced Mathematical Graph-Based Machine Learning And Deep Learning Models For Drug Design, Farjana Tasnim Mukta
Theses and Dissertations--Mathematics
Drug discovery is a highly complicated and time-consuming process. One of the main challenges in drug development is predicting whether a drug-like molecule will interact with a specific target protein. This prediction accelerates target validation and drug development. Recent research in biomolecular sciences has shown significant interest in algebraic graph-based models for representing molecular complexes and predicting drug-target binding affinity. In this thesis, we present algebraic graph-based molecular representations to create data-driven scoring functions (SF) using extended atom types to capture wide-range interactions between targets and drug candidates. Our model employs multiscale weighted colored subgraphs for the protein-ligand complex, colored …
Machine Learning Framework For Real-World Electronic Health Records Regarding Missingness, Interpretability, And Fairness, Jing Lucas Liu
Theses and Dissertations--Computer Science
Machine learning (ML) and deep learning (DL) techniques have shown promising results in healthcare applications using Electronic Health Records (EHRs) data. However, their adoption in real-world healthcare settings is hindered by three major challenges. Firstly, real-world EHR data typically contains numerous missing values. Secondly, traditional ML/DL models are typically considered black-boxes, whereas interpretability is required for real-world healthcare applications. Finally, differences in data distributions may lead to unfairness and performance disparities, particularly in subpopulations.
This dissertation proposes methods to address missing data, interpretability, and fairness issues. The first work proposes an ensemble prediction framework for EHR data with large missing …
Symbolic Computation Of Squared Amplitudes In High Energy Physics With Machine Learning, Abdulhakim Alnuqaydan
Theses and Dissertations--Physics and Astronomy
The calculation of particle interaction squared amplitudes is a key step in the calculation of cross sections in high-energy physics. These complex calculations are currently performed using domain-specific symbolic algebra tools, where the computational time escalates rapidly with an increase in the number of loops and final state particles. This dissertation introduces an innovative approach: employing a transformer-based sequence-to-sequence model capable of accurately predicting squared amplitudes of Standard Model processes up to one-loop order when trained on symbolic sequence pairs. The primary objective of this work is to significantly reduce the computational time and, more importantly, develop a model that …
Developing And Deploying Data-Driven Tools For Accelerated Design Of Organic Semiconductors, Vinayak Bhat
Theses and Dissertations--Chemistry
Organic semiconductors have gained widespread attention due to their potential applications in flexible, low-cost, lightweight electronics, energy storage and generation technologies, and sensing applications. However, developing new organic semiconductors with improved performance remains a significant challenge due to the vast chemical space of possible molecular and materials structures. Furthermore, the high cost and time-consuming nature of experimental synthesis and characterization hinder the rapid discovery of new materials. To overcome these challenges, this dissertation presents a data-driven approach to organic semiconductor discovery. The primary focus of this work is the development of data-driven tools, namely machine learning models, to predict critical …
Normalization Techniques For Sequential And Graphical Data, Cole Pospisil
Theses and Dissertations--Mathematics
Normalization methods have proven to be an invaluable tool in the training of deep neural networks. In particular, Layer and Batch Normalization are commonly used to mitigate the risks of exploding and vanishing gradients. This work presents two methods which are related to these normalization techniques. The first method is Batch Normalized Preconditioning (BNP) for recurrent neural networks (RNN) and graph convolutional networks (GCN). BNP has been suggested as a technique for Fully Connected and Convolutional networks for achieving similar performance benefits to Batch Normalization by controlling the condition number of the Hessian through preconditioning on the gradients. We extend …
Batch Normalization Preconditioning For Neural Network Training, Susanna Luisa Gertrude Lange
Theses and Dissertations--Mathematics
Batch normalization (BN) is a popular and ubiquitous method in deep learning that has been shown to decrease training time and improve generalization performance of neural networks. Despite its success, BN is not theoretically well understood. It is not suitable for use with very small mini-batch sizes or online learning. In this work, we propose a new method called Batch Normalization Preconditioning (BNP). Instead of applying normalization explicitly through a batch normalization layer as is done in BN, BNP applies normalization by conditioning the parameter gradients directly during training. This is designed to improve the Hessian matrix of the loss …
Development Of Accurate And Efficient Computational Methodologies For Predicting Protein-Ligand And Protein-Protein Binding Free Energies, Alexander Hamilton Williams
Theses and Dissertations--Pharmacy
Computational modeling is an invaluable tool in the drug discovery process for either small-ligand or protein therapeutics. The widespread availability of protein X-Ray Crystal and Cryo-Electron Microscopy (Cryo-EM) structures has allowed for more accurate molecular dynamics (MD) simulations that are not reliant on methods such as homology modeling, which may produce structures that require significant computational time to demonstrate their stability. In this thesis we describe several novel methodologies for the computationally efficient modeling of protein/ligand and protein/protein complexes that may be employed within both large-scale virtual screenings and lead compound optimization. These methodologies may also be utilized in …
Weakly Supervised Learning For Multi-Image Synthesis, Muhammad Usman Rafique
Theses and Dissertations--Electrical and Computer Engineering
Machine learning-based approaches have been achieving state-of-the-art results on many computer vision tasks. While deep learning and convolutional networks have been incredibly popular, these approaches come at the expense of huge amounts of labeled data required for training. Manually annotating large amounts of data, often millions of images in a single dataset, is costly and time-consuming. To deal with the problem of data annotation, the research community has been exploring approaches that require less labeled data.
The central problem that we consider in this research is image synthesis without any manual labeling. Image synthesis is a classic …
Deep Neural Architectures For End-To-End Relation Extraction, Tung Tran
Theses and Dissertations--Computer Science
The rapid pace of scientific and technological advancements has led to a meteoric growth in knowledge, as evidenced by a sharp increase in the number of scholarly publications in recent years. PubMed, for example, archives more than 30 million biomedical articles across various domains and covers a wide range of topics including medicine, pharmacy, biology, and healthcare. Social media and digital journalism have similarly experienced their own accelerated growth in the age of big data. Hence, there is a compelling need for ways to organize and distill the vast, fragmented body of information (often unstructured in the form of natural …
Orthogonal Recurrent Neural Networks And Batch Normalization In Deep Neural Networks, Kyle Eric Helfrich
Theses and Dissertations--Mathematics
Despite the recent success of various machine learning techniques, there are still numerous obstacles that must be overcome. One obstacle is known as the vanishing/exploding gradient problem. This problem refers to gradients that become either zero or unbounded. This is a well-known problem that commonly occurs in Recurrent Neural Networks (RNNs). In this work we describe how this problem can be mitigated, establish three different architectures that are designed to avoid this issue, and derive update schemes for each architecture. Another portion of this work focuses on the often-used technique of batch normalization. Although found to be successful …
Rule Mining And Sequential Pattern Based Predictive Modeling With EMR Data, Orhan Abar
Theses and Dissertations--Computer Science
Electronic medical record (EMR) data is collected on a daily basis at hospitals and other healthcare facilities to track patients’ health situations including conditions, treatments (medications, procedures), diagnostics (labs) and associated healthcare operations. Besides being useful for individual patient care and hospital operations (e.g., billing, triaging), EMRs can also be exploited for secondary data analyses to glean discriminative patterns that hold across patient cohorts for different phenotypes. These patterns in turn can yield high-level insights into disease progression with interventional potential. In this dissertation, using a large-scale realistic EMR dataset of over one million patients visiting University of …
Lattice Simplices: Sufficiently Complicated, Brian Davis
Theses and Dissertations--Mathematics
Simplices are the "simplest" examples of polytopes, and yet they exhibit much of the rich and subtle combinatorics and commutative algebra of their more general cousins. In this way they are sufficiently complicated: insights gained from their study can inform broader research in Ehrhart theory and associated fields.
In this dissertation we consider two previously unstudied properties of lattice simplices: one algebraic and one combinatorial. The first is the Poincaré series of the associated semigroup algebra, which is substantially more complicated than the Hilbert series of that same algebra. The second is the partial ordering of the elements of …
Relation Prediction Over Biomedical Knowledge Bases For Drug Repositioning, Mehmet Bakal
Theses and Dissertations--Computer Science
Identifying new potential treatment options for medical conditions that cause human disease burden is a central task of biomedical research. Since not all candidate drugs can be tested with animal and clinical trials, in vitro approaches are first attempted to identify promising candidates. Likewise, identifying other essential relations (e.g., causation, prevention) between biomedical entities is also critical to understand biomedical processes. Hence, it is crucial to develop automated relation prediction systems that can yield plausible biomedical relations to expedite the discovery process. In this dissertation, we demonstrate three approaches to predict treatment relations between biomedical entities for the drug repositioning task …
Deep Neural Networks For Multi-Label Text Classification: Application To Coding Electronic Medical Records, Anthony Rios
Theses and Dissertations--Computer Science
Coding Electronic Medical Records (EMRs) with diagnosis and procedure codes is an essential task for billing, secondary data analyses, and monitoring health trends. Both speed and accuracy of coding are critical. While coding errors could lead to more patient-side financial burden and misinterpretation of a patient’s well-being, timely coding is also needed to avoid backlogs and additional costs for the healthcare facility. Therefore, it is necessary to develop automated diagnosis and procedure code recommendation methods that can be used by professional medical coders.
The main difficulty with developing automated EMR coding methods is the nature of the label space. The …
Scalable Feature Selection And Extraction With Applications In Kinase Polypharmacology, Derek Jones
Theses and Dissertations--Computer Science
In order to reduce the time associated with and the costs of drug discovery, machine learning is being used to automate much of the work in this process. However, the size and complex nature of molecular data make the application of machine learning especially challenging. Much work must go into the process of engineering features that are then used to train machine learning models, costing considerable amounts of time and requiring the knowledge of domain experts to be most effective. The purpose of this work is to demonstrate data-driven approaches to perform the feature selection and extraction steps in …
Context-Aware Debugging For Concurrent Programs, Justin Chu
Theses and Dissertations--Computer Science
Concurrency faults are difficult to reproduce and localize because they usually occur under specific inputs and thread interleavings. Most existing fault localization techniques focus on sequential programs but fail to identify faulty memory access patterns across threads, which are usually the root causes of concurrency faults. Moreover, existing techniques for sequential programs cannot be adapted to identify faulty paths in concurrent programs. While concurrency fault localization techniques have been proposed to analyze passing and failing executions obtained from running a set of test cases to identify faulty access patterns, they primarily focus on using statistical analysis. We present a novel …
Soil Hydraulic Property Estimation Under Major Land-Uses In The Shawnee Hills, Trinity Joseph Baker
Theses and Dissertations--Plant and Soil Sciences
The ability to map soil moisture is becoming more important with changing climates, and modeling these effects depends on reliable estimations of hydrologic soil properties under different land managements. This study: 1) tests the application of existing soil hydraulic property estimation methods against in-situ values of six catenas under two covers (forest and grass); 2) validates Random Forest Algorithm (RF) estimates informed from the six catenas on two separate catenas; 3) identifies Rapid Carbon Assessment (RaCA) sites within the Shawnee Hills Region that represent different land-uses (Crop, Conservation Reserve Program (CRP), Forest, and Pasture); 4) applies RF learning tree informed …