Open Access. Powered by Scholars. Published by Universities.®

Other Computer Sciences Commons

Clustering

Articles 1 - 12 of 12

Full-Text Articles in Other Computer Sciences

Applications Of Unsupervised Machine Learning In Autism Spectrum Disorder Research: A Review, Chelsea Parlett-Pelleriti, Elizabeth Stevens, Dennis R. Dixon, Erik J. Linstead Jan 2022

Engineering Faculty Articles and Research

Large amounts of autism spectrum disorder (ASD) data are created through hospitals, therapy centers, and mobile applications; however, much of this rich data does not have pre-existing classes or labels. Large amounts of data, both genetic and behavioral, that are collected as part of scientific studies or as part of treatment can provide a deeper, more nuanced insight into both diagnosis and treatment of ASD. This paper reviews 43 papers using unsupervised machine learning in ASD, including k-means clustering, hierarchical clustering, model-based clustering, and self-organizing maps. The aim of this review is to provide a survey of the current uses of …


Automated Parsing Of Flexible Molecular Systems Using Principal Component Analysis And K-Means Clustering Techniques, Matthew J. Nwerem Aug 2021

Computational and Data Sciences (MS) Theses

Computational investigation of molecular structures and reactions of biological and pharmaceutical interest remains a grand scientific challenge due to the size and conformational flexibility of these systems. The work requires parsing and analyzing thousands of conformations in each molecular state for meaningful chemical information and subjecting the ensemble to costly quantum chemical calculations. The status quo typically involves a manual process in which the investigator must look at each conformation and sort it into a structural family. This process is time-intensive and tedious, which makes it infeasible in some cases and limits the ability of theoreticians to study these systems. However, the …


A Quantitative Validation Of Multi-Modal Image Fusion And Segmentation For Object Detection And Tracking, Nicholas Lahaye, Michael J. Garay, Brian D. Bue, Hesham El-Askary, Erik Linstead Jun 2021

Mathematics, Physics, and Computer Science Faculty Articles and Research

In previous works, we have shown the efficacy of using Deep Belief Networks, paired with clustering, to identify distinct classes of objects within remotely sensed data via cluster analysis and qualitative analysis of the output data in comparison with reference data. In this paper, we quantitatively validate the methodology against datasets currently being generated and used within the remote sensing community, and we show the capabilities and benefits of the data fusion methodologies used. The experiments take the output of our unsupervised fusion and segmentation methodology and map it to various labeled datasets at different levels of global …


Constrained K-Means Clustering Validation Study, Nicholas Mcdaniel, Stephen Burgess, Jeremy Evert Nov 2018

Student Research

Machine Learning (ML) is a growing topic within Computer Science with applications in many fields. One open problem in ML is data separation, or data clustering. Our project is a validation study of “Constrained K-means Clustering with Background Knowledge” by Wagstaff et al. Our data validate the finding by Wagstaff et al. that a modified k-means clustering approach can outperform more general unsupervised learning algorithms when some domain information about the problem is available. Our data suggest that k-means clustering augmented with domain information can be a time-efficient means of segmenting data sets. Our validation study focused …
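As a rough illustration of the idea this abstract refers to, the sketch below shows k-means with must-link and cannot-link constraints, in the spirit of the approach by Wagstaff et al. It is not the code used in the study. The function names, the explicit starting centers, and the toy points are assumptions made for the example.

```python
from math import dist


def violates(i, c, assign, must_link, cannot_link):
    """Return True if placing point i in cluster c conflicts with an already-assigned partner."""
    for a, b in must_link:
        if i in (a, b):
            other = b if i == a else a
            if assign[other] is not None and assign[other] != c:
                return True
    for a, b in cannot_link:
        if i in (a, b):
            other = b if i == a else a
            if assign[other] == c:
                return True
    return False


def cop_kmeans(points, centers, must_link=(), cannot_link=(), max_iter=100):
    """Constrained k-means sketch; `centers` are caller-supplied starting centers (an assumption)."""
    centers = [tuple(c) for c in centers]
    prev = None
    for _ in range(max_iter):
        assign = [None] * len(points)
        for i, p in enumerate(points):
            # Try clusters from nearest to farthest and take the first feasible one.
            order = sorted(range(len(centers)), key=lambda c: dist(p, centers[c]))
            for c in order:
                if not violates(i, c, assign, must_link, cannot_link):
                    assign[i] = c
                    break
            else:
                raise ValueError(f"no feasible cluster for point {i}")
        if assign == prev:  # assignments stopped changing
            break
        prev = assign
        for c in range(len(centers)):
            members = [points[i] for i, a in enumerate(assign) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign, centers


# Hypothetical toy data: two tight groups plus one point lying between them.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (2.4, 2.4)]
start = [(0, 0), (5, 5)]
free, _ = cop_kmeans(points, start)
constrained, _ = cop_kmeans(points, start, cannot_link=[(0, 4)])
```

In this toy case the in-between point joins the first group when unconstrained, and a single cannot-link constraint against point 0 moves it to the second group. That is the kind of domain information the abstract describes.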


Image Segmentation Using De-Textured Images, Yaswanth Kodavali May 2017

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Image segmentation is one of the fundamental problems in computer vision. The outputs of segmentation are used to extract regions of interest and carry out identification or classification tasks. For these tasks to be reliable, segmentation has to be made more reliable. Although there are exceptionally well-built algorithms available today, they perform poorly in many instances by producing over-merged (many unrelated objects combined into one segment) or under-merged (one object split into many segments) results. This leads to far fewer or far more segments than expected. Such problems primarily arise due to varying textures within a single object and/or common textures near borders of adjacent …


Analyzing And Organizing The Sonic Space Of Vocal Imitation, Davide Andrea Mauro PhD, D. Rocchesso Nov 2016

Davide Andrea Mauro

The sonic space that can be spanned with the voice is vast and complex and is therefore difficult to organize and explore. In order to devise tools that facilitate sound design by vocal sketching, we attempt to organize a database of short excerpts of vocal imitations. By clustering the sound samples in a space whose dimensionality has been reduced to the two principal components, we experimentally check how meaningful the resulting clusters are for humans. Eventually, a representative of each cluster, chosen to be close to its centroid, may serve as a landmark in the exploration of the …
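The last step mentioned in this abstract, picking a representative close to each cluster's centroid, can be sketched as follows. This is only an illustration under stated assumptions: the samples are taken to be already projected onto two components and already labeled by some clustering step, and the function name, coordinates, and labels are hypothetical.

```python
from collections import defaultdict
from math import dist


def representatives(points, labels):
    """Map each cluster label to the index of the member nearest that cluster's centroid."""
    groups = defaultdict(list)
    for i, lab in enumerate(labels):
        groups[lab].append(i)
    reps = {}
    for lab, idxs in groups.items():
        dims = range(len(points[0]))
        # Centroid = per-dimension mean of the cluster's members.
        centroid = tuple(sum(points[i][d] for i in idxs) / len(idxs) for d in dims)
        reps[lab] = min(idxs, key=lambda i: dist(points[i], centroid))
    return reps


# Hypothetical 2-D coordinates (e.g. after a projection) and cluster labels.
points = [(0, 0), (1, 0), (0.4, 0.1), (10, 10), (12, 10), (10.9, 10.2)]
labels = [0, 0, 0, 1, 1, 1]
reps = representatives(points, labels)
```

Returning an index rather than the centroid itself keeps the landmark an actual sample from the database, which matches the idea of a representative excerpt.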


Analyzing And Organizing The Sonic Space Of Vocal Imitation, Davide Andrea Mauro PhD, D. Rocchesso Oct 2015

Computer Sciences and Electrical Engineering Faculty Research

The sonic space that can be spanned with the voice is vast and complex and is therefore difficult to organize and explore. In order to devise tools that facilitate sound design by vocal sketching, we attempt to organize a database of short excerpts of vocal imitations. By clustering the sound samples in a space whose dimensionality has been reduced to the two principal components, we experimentally check how meaningful the resulting clusters are for humans. Eventually, a representative of each cluster, chosen to be close to its centroid, may serve as a landmark in the exploration of the …


Automatically Discovering The Number Of Clusters In Web Page Datasets, Zhongmei Yao Jan 2015

Zhongmei Yao

Clustering is well suited to Web mining because it automatically organizes Web pages into categories, each of which contains Web pages with similar content. However, one problem in clustering is the lack of general methods to automatically determine the number of categories or clusters. For the Web domain in particular, there is currently no such method suitable for Web page clustering. In an attempt to address this problem, we discover a constant factor that characterizes the Web domain, based on which we propose a new method for automatically determining the number of clusters in Web page data sets. We discover that the …


Context Aware Privacy Preserving Clustering And Classification, Nirmal Thapa Jan 2013

Theses and Dissertations--Computer Science

Data are valuable assets to any organization or individual. Data are sources of useful information, which is a big part of decision making. All sectors have the potential to benefit from having information. Commerce, health, and research are some of the fields that have benefited from data. On the other hand, the availability of data makes it easy for anyone to exploit it, and in many cases the data are private and confidential. It is necessary to preserve the confidentiality of the data. We study two categories of privacy: Data Value Hiding and Data Pattern Hiding. Privacy is a huge concern …


A Clustering Comparison Measure Using Density Profiles And Its Application To The Discovery Of Alternate Clusterings, Eric Bae, James Bailey, Guozhu Dong Nov 2010

Kno.e.sis Publications

Data clustering is a fundamental and very popular method of data analysis. Its subjective nature, however, means that different clustering algorithms or different parameter settings can produce widely varying and sometimes conflicting results. This has led to the use of clustering comparison measures to quantify the degree of similarity between alternative clusterings. Existing measures, though, can be limited in their ability to assess similarity and sometimes generate unintuitive results. They also cannot be applied to compare clusterings which contain different data points, an activity which is important for scenarios such as data stream analysis. In this paper, we introduce a …


Automatically Discovering The Number Of Clusters In Web Page Datasets, Zhongmei Yao Jun 2005

Computer Science Faculty Publications

Clustering is well suited to Web mining because it automatically organizes Web pages into categories, each of which contains Web pages with similar content. However, one problem in clustering is the lack of general methods to automatically determine the number of categories or clusters. For the Web domain in particular, there is currently no such method suitable for Web page clustering. In an attempt to address this problem, we discover a constant factor that characterizes the Web domain, based on which we propose a new method for automatically determining the number of clusters in Web page data sets. We discover that the …