Open Access. Powered by Scholars. Published by Universities.®

Education Commons

Selected Works

Educational Assessment, Evaluation, and Research

Dr Ling Tan

Articles 1 - 2 of 2

Full-Text Articles in Education

Maximum-Entropy Estimated Distribution Model For Classification Problems, L Tan, D Taniar, Dec 2005

Classification is a fundamental problem in machine learning and data mining. This paper applies a stochastic optimization model to classification problems. The proposed maximum entropy estimated distribution model uses a probabilistic distribution to represent the solution space, and a sampling technique to explore the search space. This paper demonstrates how the proposed model can be applied to improve linear discriminant functions and rule induction methods. In addition, this paper compares the proposed classification model with decision trees. It shows that the proposed model is preferable to the C4.5 decision tree in the following cases: i) when prior distribution of classification …


Parametric Optimization In Data Mining Incorporated With Ga-Based Search, L Tan, D Taniar, K Smith, Dec 2001

A number of parameters must be specified for a data-mining algorithm. Default values of these parameters are given and generally accepted as ‘good’ estimates for any data set. However, data-mining models are known to be data dependent, and so are their parameters. Default values may be good estimates, but they are often not the best parameter values for a particular data set. A tuned set of parameter values can produce a data-mining model with better classification and higher prediction accuracy. However, parameter search is known to be expensive. This paper investigates GA-based heuristic techniques in a …