Other Computer Sciences | Open Access Articles

Constrained K-Means Clustering Validation Study, Nicholas Mcdaniel, Stephen Burgess, Jeremy Evert

Student Research

Machine Learning (ML) is a growing topic within Computer Science with applications in many fields. One open problem in ML is data separation, or data clustering. Our project is a validation study of “Constrained K-means Clustering with Background Knowledge” by Wagstaff et al. Our data validate the finding of Wagstaff et al. that a modified k-means clustering approach can outperform more general unsupervised learning algorithms when some domain information about the problem is available. Our data also suggest that k-means clustering augmented with domain information can be a time-efficient means of segmenting data sets. Our validation study focused …

Unsupervised Machine Learning In Agent-Based Modeling, Luke D. Robinson

Celebration of Learning

Agent-based models (ABMs) are used by researchers in a variety of fields to model natural phenomena. In an ABM, a wide range of behaviors and outcomes can be observed based on the parameters of the model. In many cases, these behaviors can be categorized into discrete outcomes identifiable by human observers. Our goal was to use clustering algorithms to identify those outcomes from model output data. For this project, we used data from the NetLogo Wolf Sheep Predation model to explore and evaluate three clustering algorithms from Python's scikit-learn package. If this task can be completed reliably by a computer, …

Automatically Discovering The Number Of Clusters In Web Page Datasets, Zhongmei Yao

Zhongmei Yao

Clustering is well suited to Web mining because it automatically organizes Web pages into categories, each of which contains Web pages with similar content. However, one problem in clustering is the lack of general methods for automatically determining the number of categories or clusters. For the Web domain in particular, there is currently no such method suitable for Web page clustering. To address this problem, we discover a constant factor that characterizes the Web domain, and based on it we propose a new method for automatically determining the number of clusters in Web page data sets. We discover that the …

Automatically Discovering The Number Of Clusters In Web Page Datasets, Zhongmei Yao

Computer Science Faculty Publications

Clustering is well suited to Web mining because it automatically organizes Web pages into categories, each of which contains Web pages with similar content. However, one problem in clustering is the lack of general methods for automatically determining the number of categories or clusters. For the Web domain in particular, there is currently no such method suitable for Web page clustering. To address this problem, we discover a constant factor that characterizes the Web domain, and based on it we propose a new method for automatically determining the number of clusters in Web page data sets. We discover that the …
