Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

Full-Text Articles in Statistical Models

Extensions Of Classification Method Based On Quantiles, Yuanhao Lai, Jun 2020

Electronic Thesis and Dissertation Repository

This thesis deals with the problem of classification in general, with a particular focus on heavy-tailed or skewed data. The classification problem is first formalized using statistical learning theory, and several important classification methods are reviewed. Among these, distance-based classifiers, including the median-based classifier and the quantile-based classifier (QC), are especially useful for heavy-tailed or skewed inputs. However, QC is limited by its model capacity and by the accumulation of errors in high dimensions. The objective of this study is to investigate more general methods while retaining the merits of QC.

We present four extensions of QC, which appear in chronological …


Classification With Large Sparse Datasets: Convergence Analysis And Scalable Algorithms, Xiang Li, Jul 2017

Electronic Thesis and Dissertation Repository

Large and sparse datasets, such as user ratings over a large collection of items, are common in the big data era. Many applications need to classify users or items based on high-dimensional, sparse data vectors, e.g., to predict the profitability of a product or the age group of a user. Linear classifiers are popular choices for such datasets because of their efficiency. To classify large sparse data more effectively, the following important questions need to be answered.

1. Sparse data and convergence behavior. How different properties of a dataset, such as …