Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

On Machine Learning Methods For Chinese Document Classification, Ji He, Ah-Hwee Tan, Chew-Lim Tan May 2003

Research Collection School Of Computing and Information Systems

This paper reports our comparative evaluation of three machine learning methods for Chinese document categorization: k Nearest Neighbor (kNN), Support Vector Machines (SVM), and Adaptive Resonance Associative Map (ARAM). Using two Chinese corpora, a series of controlled experiments evaluated their learning capabilities and efficiency in mining text classification knowledge. Benchmark experiments showed that their predictive performance was roughly comparable, especially on clean and well-organized data sets. While kNN and ARAM yielded better performance than SVM on small and clean data sets, SVM and ARAM significantly outperformed kNN on noisy data. Comparing efficiency, kNN was notably more costly …


Machine Learning Approaches For Determining Effective Seeds For K-Means Algorithm, Kaveephong Lertwachara Apr 2003

Doctoral Dissertations

In this study, I investigate and experiment with two-stage clustering procedures (hybrid models), both in simulated environments, where conditions such as collinearity and cluster structure are controlled, and in real-life problems, where these conditions are not controlled. The first hybrid model (NK) integrates a neural network (NN) with the k-means algorithm (KM): the NN screens seeds and passes them to KM. The second hybrid (GK) uses a genetic algorithm (GA) instead of the neural network. Both the NN and the GA used in this study are in their simplest possible forms.

In the simulated data sets, I investigate two …
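
The two-stage idea in the abstract above (a screening stage proposes seeds, then k-means refines them) can be illustrated with a minimal sketch. This is not the dissertation's NK or GK model: the screening stage here is a plain random search that keeps the candidate seed set with the lowest sum of squared errors, standing in for the neural network or genetic algorithm. The function names `screen_seeds` and `k_means`, the parameter `n_candidates`, and the toy data are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def screen_seeds(points, k, n_candidates=20, rng=None):
    """Stage 1 (stand-in screener, NOT the NN or GA from the abstract):
    draw random candidate seed sets and keep the one with the lowest SSE."""
    rng = rng or random.Random(0)
    candidates = [rng.sample(points, k) for _ in range(n_candidates)]
    return min(candidates, key=lambda seeds: sse(points, seeds))


def k_means(points, seeds, max_iter=100):
    """Stage 2: standard k-means (Lloyd's iterations) started from the seeds."""
    centers = [tuple(s) for s in seeds]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # Keep the old center if a cluster ends up empty.
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


# Toy data (hypothetical): two well-separated groups of four points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
seeds = screen_seeds(points, k=2, rng=random.Random(0))
centers, labels = k_means(points, seeds)
print(sorted(centers), labels)
```

Separating the screener from the refinement step means a different screening strategy can be swapped in without touching `k_means`, which mirrors the way the abstract describes replacing the NN with a GA.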


Machine Learning Techniques For Efficient Query Processing In Knowledge Base Systems, Kevin Paul Grant Jan 2003

LSU Doctoral Dissertations

In this dissertation we propose a new technique for efficient query processing in knowledge base systems. Query processing in knowledge base systems poses strong computational challenges because of combinatorial explosion: at any point during query processing, there may be too many subqueries available for further exploration. Overcoming this difficulty requires effective mechanisms for choosing good subqueries from among them for further processing. Inspired by existing work on stochastic logic programs, compositional modeling, and probabilistic heuristic estimates, we create a new, nondeterministic method for subquery selection in query processing. Specifically, …