Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2015

Open Access Dissertations

Statistics and Probability

Full-Text Articles in Physical Sciences and Mathematics

Overcoming Uncertainty For Within-Network Relational Machine Learning, Joseph J. Pfeiffer Apr 2015

Open Access Dissertations

People increasingly communicate through email and social networks to maintain friendships and conduct business, as well as to share online content such as pictures, videos and products. Relational machine learning (RML) uses a set of observed attributes and network structure to predict corresponding labels for items; for example, to identify individuals engaged in securities fraud, we can use phone calls and workplace information to make joint predictions over the individuals. However, in large-scale and partially observed network domains, missing labels and edges can significantly impact standard relational machine learning methods by introducing bias into the learning and inference processes. In …


Stability Of Machine Learning Algorithms, Wei Sun Apr 2015

Open Access Dissertations

In the literature, predictive accuracy is often the primary criterion for evaluating a learning algorithm. In this thesis, I will introduce novel concepts of stability into the machine learning community. A learning algorithm is said to be stable if it produces consistent predictions under small perturbations of the training samples. Stability is an important aspect of a learning procedure because unstable predictions can reduce users' trust in the system and also harm the reproducibility of scientific conclusions. As a prototypical example, the stability of the classification procedure will be discussed extensively. In particular, I will present two new …
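The idea of "consistent predictions under small perturbations" can be illustrated with a rough sketch. The code below is not the thesis's stability measure, which the excerpt does not state; it assumes a toy one-dimensional nearest-centroid classifier and measures how often two models trained on independently drawn samples disagree on the same test points. All function names (`nearest_centroid_fit`, `disagreement`, `draw_sample`) and the Gaussian data model are hypothetical choices made for illustration.

```python
import random


def nearest_centroid_fit(samples):
    """Fit a toy classifier: the mean feature value of each class.

    `samples` is a list of (x, label) pairs with a 1-D feature x.
    """
    sums, counts = {}, {}
    for x, y in samples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def nearest_centroid_predict(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def disagreement(train_a, train_b, test_points):
    """Fraction of test points on which the two fitted models differ.

    A value near 0 suggests the procedure is insensitive to which
    training sample it saw (an illustrative notion of stability).
    """
    model_a = nearest_centroid_fit(train_a)
    model_b = nearest_centroid_fit(train_b)
    differing = sum(
        nearest_centroid_predict(model_a, x) != nearest_centroid_predict(model_b, x)
        for x in test_points
    )
    return differing / len(test_points)


def draw_sample(rng, n):
    """Assumed data model: class 0 ~ N(0, 1), class 1 ~ N(2, 1)."""
    return [(rng.gauss(2.0 * (i % 2), 1.0), i % 2) for i in range(n)]


rng = random.Random(0)
test_points = [rng.uniform(-3.0, 5.0) for _ in range(1000)]
rate = disagreement(draw_sample(rng, 200), draw_sample(rng, 200), test_points)
print(f"disagreement rate: {rate:.3f}")
```

Two classifiers with the same accuracy can still differ in this disagreement rate, which is why stability can be treated as a criterion separate from accuracy.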


Divide And Recombine For Large Complex Data: The Subset Likelihood Modeling Approach To Recombination, Philip Gautier Apr 2015

Open Access Dissertations

Divide and recombine (D&R) is a statistical framework for the analysis of large complex data. The data are divided into subsets. Numeric and visualization methods, which collectively are analytic methods, are applied to each subset. For each analytic method, the outputs of applying the method to the subsets are recombined. Each analytic method therefore has an associated division method and recombination method. Here we study D&R methods for likelihood-based model fitting. We introduce a notion of likelihood analysis and modeling. We divide the data and fit a likelihood model on each subset. The fitted model …
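The divide, fit, and recombine steps described above can be sketched as follows. This is a minimal illustration and not the subset likelihood modeling approach itself: the excerpt is cut off before the recombination method is described, so a size-weighted average of subset estimates is used here as a stand-in. The Gaussian model, the round-robin division, and the function names are all assumptions.

```python
import random


def divide(data, n_subsets):
    """Division method (assumed): round-robin split into n_subsets subsets."""
    return [data[i::n_subsets] for i in range(n_subsets)]


def fit_gaussian_mle(subset):
    """Maximum-likelihood fit of a Gaussian on one subset.

    Returns (mean, variance, subset size); the variance uses the
    MLE divisor n rather than n - 1.
    """
    n = len(subset)
    mean = sum(subset) / n
    var = sum((x - mean) ** 2 for x in subset) / n
    return mean, var, n


def recombine_means(fits):
    """Recombination method (stand-in): size-weighted average of subset means."""
    total = sum(n for _, _, n in fits)
    return sum(mean * n for mean, _, n in fits) / total


rng = random.Random(0)
data = [rng.gauss(5.0, 2.0) for _ in range(10_000)]

fits = [fit_gaussian_mle(subset) for subset in divide(data, 10)]
combined_mean = recombine_means(fits)
full_mean = sum(data) / len(data)
print(f"recombined: {combined_mean:.6f}  all-data: {full_mean:.6f}")
```

For the Gaussian mean, the weighted average reproduces the all-data estimate up to floating-point error. For most other models a simple average of subset estimates only approximates the all-data fit, which is one reason the choice of recombination method matters.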


A Pure-Jump Market-Making Model For High-Frequency Trading, Chi Wai Law Apr 2015

Open Access Dissertations

We propose a new market-making model which incorporates a number of realistic features relevant for high-frequency trading. In particular, we model the dependency structure of prices and order arrivals with novel self- and cross-exciting point processes. Furthermore, instead of assuming the bid and ask prices can be adjusted continuously by the market maker, we formulate the market maker's decisions as an optimal switching problem. Moreover, the risk of overtrading is taken into consideration by allowing each order to have a different size, and the market maker can make use of market orders, which are treated as impulse control, to get …
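For readers unfamiliar with self-exciting point processes, the sketch below evaluates the conditional intensity of a standard univariate process with an exponential kernel, where each past event temporarily raises the arrival rate. It is not the dissertation's model, which the excerpt describes as novel and does not specify; the exponential kernel and the parameter names `mu`, `alpha`, and `beta` are assumptions, and cross-excitation between different event types is omitted.

```python
import math


def self_exciting_intensity(t, event_times, mu, alpha, beta):
    """Intensity at time t: mu + sum over past events of alpha * exp(-beta * (t - s)).

    mu    : baseline arrival rate (assumed constant)
    alpha : jump in intensity caused by each event
    beta  : exponential decay rate of that jump
    Only events strictly before t contribute.
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - s)) for s in event_times if s < t
    )
    return mu + excitation


# Hypothetical order-arrival times (seconds) used purely for illustration.
arrivals = [0.0, 0.4, 0.5]
for t in (0.6, 2.0, 10.0):
    rate = self_exciting_intensity(t, arrivals, mu=1.0, alpha=0.5, beta=1.0)
    print(f"t={t:5.1f}  intensity={rate:.4f}")
```

The intensity is highest shortly after a cluster of arrivals and decays back toward the baseline `mu`, which is the clustering behaviour such processes are typically used to capture.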


Spatial Analysis Of Passenger Vehicle Use And Ownership And Its Impact On The Sustainability Of Highway Infrastructure Funding, Matthew Volovski Apr 2015

Open Access Dissertations

Across the United States, the sustainability of highway funding is at risk due to increasing need and uncertainty in the factors that drive revenue. Past studies on highway funding sustainability have identified shifts in social demographics and economic characteristics as the root cause of changing highway revenue. Unfortunately, from the revenue perspective (the focus of this dissertation), the ability of previous research to account for these factors has been rather limited in two ways: first, the inability to accurately assess current regional vehicle use (a typical prerequisite for statistical modeling of highway revenues) due to difficulties associated with …