Physical Sciences and Mathematics | Open Access Articles

Grammar And Variation: Understanding How Cis-Regulatory Information Is Encoded In Mammalian Genomes, Dana Michele King Dec 2018

Arts & Sciences Electronic Theses and Dissertations

Understanding how genotype leads to phenotype is key to understanding both the development and dysfunction of complex organisms. In the context of regulating the gene expression patterns that contribute to cell identity and function, the goal of my thesis research is to understand how changes in genome sequence may impact gene expression by determining how sequence features contribute to regulatory potential. To accomplish this goal, I first leveraged the key regulatory role of pluripotency transcription factors (TFs) in mouse embryonic stem cells (mESCs) and tested synthetically generated and genomically identified combinations of binding sites for four TFs, OCT4, SOX2, KLF4, …

Different Estimation Methods For The Basic Independent Component Analysis Model, Zhenyi An Dec 2018

Arts & Sciences Electronic Theses and Dissertations

Inspired by the classic cocktail-party problem, the basic Independent Component Analysis (ICA) model was created. What distinguishes ICA from other kinds of analysis is its intrinsic assumption that the data are non-Gaussian. Several approaches have been proposed based on maximizing the non-Gaussianity of the data, which is measured by kurtosis, mutual information, and other criteria. Each estimation method requires optimizing functions of expectations of non-quadratic functions, since these give access to the higher-order statistics of the non-Gaussian part of the data. In this thesis, our goal is to review one of the most efficient estimation methods, …
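
As a small illustration of kurtosis as a non-Gaussianity measure, the following sketch computes the sample excess kurtosis, which is zero in expectation for Gaussian data and nonzero for many non-Gaussian distributions. The function name `excess_kurtosis` and the toy data are assumptions made for this example; it is not the estimation method reviewed in the thesis.

```python
def excess_kurtosis(xs):
    """Sample excess kurtosis: E[(x - mean)^4] / var^2 - 3 (population moments)."""
    n = len(xs)
    mean = sum(xs) / n
    centered = [x - mean for x in xs]
    var = sum(c * c for c in centered) / n
    if var == 0:
        raise ValueError("kurtosis is undefined for constant data")
    fourth = sum(c ** 4 for c in centered) / n
    return fourth / var ** 2 - 3.0


# A symmetric two-point signal is strongly sub-Gaussian (negative excess kurtosis).
two_point = [-1.0, 1.0, -1.0, 1.0]
print(excess_kurtosis(two_point))
```

Kurtosis-based ICA methods typically search for a projection of the whitened data whose absolute excess kurtosis is as large as possible.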

Generalized Non-Inferential Approach To Modeling Restricted Discrete Choice For The Case Of The Spatial Random Utility, Elena Labzina Aug 2018

Arts & Sciences Electronic Theses and Dissertations

The multinomial logistic regression model (MNL) is a powerful and easily tractable way of measuring the probabilistic impact of input variables on individual categorical choices. Crucially, the standard MNL assumes that all subjects of the study have the same choice sets. However, especially in political science and economics, this condition is frequently violated. Probably the most vivid example of varying choice sets (VCS) is partially contested elections. Furthermore, the MNL implicitly imposes the Independence of Irrelevant Alternatives (IIA) assumption by requiring i.i.d. errors, which distinguishes the MNL from the multinomial probit (MNP) and mixed logit (MXL) models. In …
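
To make the role of choice sets and IIA concrete, here is a minimal sketch of the textbook MNL choice probability restricted to the alternatives available to one individual. The function `mnl_probabilities`, the utilities, and the alternative labels are hypothetical; this is the standard MNL formula, not the generalized approach developed in the thesis.

```python
import math


def mnl_probabilities(utilities, available):
    """MNL choice probabilities over the alternatives in `available`.

    utilities: dict mapping alternative -> systematic utility (hypothetical values)
    available: set of alternatives this individual can actually choose
    """
    if not available:
        raise ValueError("choice set must not be empty")
    # Subtract the largest utility before exponentiating for numerical stability.
    m = max(utilities[a] for a in available)
    weights = {a: math.exp(utilities[a] - m) for a in available}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


utilities = {"A": 1.0, "B": 0.5, "C": -0.2}
full = mnl_probabilities(utilities, {"A", "B", "C"})
restricted = mnl_probabilities(utilities, {"A", "B"})  # C is not on the ballot

# IIA: the odds of A versus B do not depend on whether C is available.
print(full["A"] / full["B"], restricted["A"] / restricted["B"])
```

Both printed ratios equal exp(1.0 - 0.5), which is exactly the IIA property: removing an alternative rescales the remaining probabilities proportionally.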

Algorithmic Trading With Prior Information, Xinyi Cai May 2018

Arts & Sciences Electronic Theses and Dissertations

Traders generate profits with strategies that mix market and limit orders. There are different types of traders in the market: some have prior information and can learn from changes in prices to adjust their trading strategy continuously (Informed Traders), some have no prior information but can learn (Uninformed Learners), and some have no prior information and cannot learn (Uninformed Traders). In 2014, Alvaro C, Sebastian J and Damir K \cite{AL} proposed a model for algorithmic traders to assess the impact of dynamic learning on profit and loss. The traders can employ the model to decide which …

Deep Learning Analysis Of Limit Order Book, Xin Xu May 2018

Arts & Sciences Electronic Theses and Dissertations

In this paper, we build a deep neural network for modeling spatial structure in the limit order book and predict the future best ask or best bid price, based on the ideas of (Sirignano 2016). We propose an intuitive data processing method to approximate data that are not available to us, using only level I data, which is more widely available. The model is based on the idea that there is local dependence between the best ask or best bid price and the sizes of related orders. First, we use logistic regression to show that this approach is reasonable. To show the advantages …

Variable Selection Via Lasso With High-Dimensional Proteomic Data, Hongxuan Zhai May 2018

Arts & Sciences Electronic Theses and Dissertations

Multiclass classification with high-dimensional data is an applied topic in both statistics and machine learning. The classification procedure can be carried out in various ways. In this thesis, we review the theory of the Lasso procedure, which provides a parameter estimator while simultaneously achieving dimension reduction due to a property of the L1 norm. Lasso with elastic net penalty and sparse group lasso are also reviewed. Our data are high-dimensional proteomic data (iTRAQ ratios) of breast cancer patients with four subtypes of breast cancer. We use multinomial logistic regression to train our classifier and use the misclassification rates obtained …
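
The dimension-reducing property of the L1 norm can be seen in one dimension: the minimizer of 0.5 * (b - z)^2 + lam * |b| is the soft-thresholding of z, which is exactly zero whenever |z| <= lam. The sketch below illustrates this; the name `soft_threshold` and the sample values are assumptions for illustration and are not taken from the thesis.

```python
def soft_threshold(z, lam):
    """Minimizer of 0.5 * (b - z)**2 + lam * abs(b) over b, for lam >= 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # small coefficients are set exactly to zero


# Coefficients within lam of zero are dropped; larger ones are shrunk by lam.
estimates = [3.0, -0.5, 0.2, -3.0]
print([soft_threshold(z, 1.0) for z in estimates])
```

Coordinate-descent solvers for the Lasso apply this operator one coefficient at a time, which is how variables end up excluded from the fitted model.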

Distributed Quantile Regression Analysis And A Group Variable Selection Method, Liqun Yu May 2018

Arts & Sciences Electronic Theses and Dissertations

This dissertation develops novel methodologies for distributed quantile regression analysis for big data by utilizing a distributed optimization algorithm called the alternating direction method of multipliers (ADMM). Specifically, we first write the penalized quantile regression into a specific form that can be solved by the ADMM and propose numerical algorithms for solving the ADMM subproblems. This results in the distributed QR-ADMM algorithm. Then, to further reduce the computational time, we formulate the penalized quantile regression into another equivalent ADMM form in which all the subproblems have exact closed-form solutions and hence avoid iterative numerical methods. This results in the single-loop …
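
For readers unfamiliar with the objective behind quantile regression, the sketch below shows the standard check (pinball) loss and verifies on toy data that a sample median minimizes it at tau = 0.5. The names `check_loss` and `total_loss` and the data are hypothetical; this illustrates only the loss function, not the QR-ADMM algorithms of the dissertation.

```python
def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def total_loss(q, data, tau):
    """Sum of check losses of the residuals data[i] - q."""
    return sum(check_loss(y - q, tau) for y in data)


data = [1.0, 2.0, 3.0, 4.0, 5.0]
# Search over the observed values for the minimizer of the median (tau = 0.5) loss.
best = min(data, key=lambda q: total_loss(q, data, 0.5))
print(best)
```

Penalized quantile regression adds a penalty on the coefficients to this loss, with a linear predictor in place of the constant q.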

Nonparametric Estimation Of Time Series Volatility Model Estimation, Teng Tu May 2018

Arts & Sciences Electronic Theses and Dissertations

In this article we consider two estimation methods for a non-parametric volatility model with autoregressive errors of order two. The first estimation method is based on the two-lag difference. To get a better result, we consider a second approach based on general quadratic forms. For illustration, we provide several data sets from different simulation models to demonstrate both methods, and show that the second approach gives better estimates.
