Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 5 of 5

Full-Text Articles in Physical Sciences and Mathematics

Location Optimization Of A Coal Power Plant To Balance Coal Supply And Electric Transmission Costs Against Plant’s Emission Exposure, Najam Khan Jan 2018

Electronic Theses and Dissertations

This research develops a location analysis methodology that minimizes public exposure to pollutants while also minimizing the combined costs of electric transmission losses and coal logistics. Coal power plants will make a critical contribution towards meeting electricity demand in various nations for the foreseeable future. Site selection for a new coal power plant is extremely important from an investment point of view. The operational costs of running a coal power plant can be minimized by placing the plant near both coal mines and customers. …


Statistical Algorithms And Bioinformatics Tools Development For Computational Analysis Of High-Throughput Transcriptomic Data, Adam McDermaid Jan 2018

Electronic Theses and Dissertations

Next-Generation Sequencing technologies allow for a substantial increase in the amount of data available for various biological studies. In order to effectively and efficiently analyze this data, computational approaches combining mathematics, statistics, computer science, and biology are implemented. Even with the substantial efforts devoted to development of these approaches, numerous issues and pitfalls remain. One of these issues is mapping uncertainty, in which read alignment results are biased due to the inherent difficulties associated with accurately aligning RNA-Sequencing reads. GeneQC is an alignment quality control tool that provides insight into the severity of mapping uncertainty in each annotated gene from …


Variable Selection Techniques For Clustering On The Unit Hypersphere, Damon Bayer Jan 2018

Electronic Theses and Dissertations

Mixtures of von Mises-Fisher distributions have been shown to be an effective model for clustering data on a unit hypersphere, but variable selection for these models remains an important and challenging problem. In this paper, we derive two variants of the expectation-maximization framework, which are each used to identify a specific type of irrelevant variables for these models. The first type are noise variables, which are not useful for separating any pairs of clusters. The second type are redundant variables, which may be useful for separating pairs of clusters, but do not enable any additional separation beyond the separability provided …


The Impact Of Data Sovereignty On American Indian Self-Determination: A Framework Proof Of Concept Using Data Science, Joseph Carver Robertson Jan 2018

Electronic Theses and Dissertations

The Data Sovereignty Initiative is a collection of ideas that was designed to create SMART solutions for tribal communities. This concept was to develop a horizontal governance framework to create a strategic act of sovereignty using data science. The core concept of this idea was to present data sovereignty as a way for tribal communities to take ownership of data in order to affect policy and strategic decisions that are data driven in nature. The case studies in this manuscript were developed around statistical theories of spatial statistics, exploratory data analysis, and machine learning. And although these case studies are …


Development Of Biclustering Techniques For Gene Expression Data Modeling And Mining, Juan Xie Jan 2018

Electronic Theses and Dissertations

The next-generation sequencing technologies can generate large-scale biological data with higher resolution, better accuracy, and lower technical variation than the array-based counterparts. RNA sequencing (RNA-Seq) can generate genome-scale gene expression data in biological samples at a given moment, facilitating a better understanding of cell functions at genetic and cellular levels. The abundance of gene expression datasets provides an opportunity to identify genes with similar expression patterns across multiple conditions, i.e., co-expression gene modules (CEMs). Genome-scale identification of CEMs can be modeled and solved by biclustering, a two-dimensional data mining technique that allows clustering of rows and columns in a gene …