Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2012

Brigham Young University

Imputation

Full-Text Articles in Physical Sciences and Mathematics

Advancing The Effectiveness Of Non-Linear Dimensionality Reduction Techniques, Michael S. Gashler May 2012

Theses and Dissertations

High-dimensional data presents a computational complexity challenge for many existing algorithms. Limiting dimensionality by discarding attributes is sometimes a poor solution to this problem because significant high-level concepts may be encoded in the data across many or all of the attributes. Non-linear dimensionality reduction (NLDR) techniques have succeeded on many problems at reducing dimensionality while preserving intrinsic high-level concepts that are encoded in varying combinations of attributes. Unfortunately, many challenges remain with existing NLDR techniques, including excessive computational requirements, an inability to benefit from prior knowledge, and an inability to handle certain difficult conditions …


Support Vector Machines For Classification And Imputation, Spencer David Rogers May 2012

Theses and Dissertations

Support vector machines (SVMs) are a powerful tool for classification problems. SVMs were developed only within the last 20 years, as cheap and abundant computing power became available. SVMs are a non-statistical approach and make no assumptions about the distribution of the data. Here, SVMs are applied to a classic data set from the machine learning literature, and their out-of-sample misclassification rates are compared with those of other classification methods. Finally, an algorithm that uses SVMs to address the difficulty of imputing missing categorical data is proposed, and its performance is demonstrated under three different scenarios …