Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Physical Sciences and Mathematics

Clustering

Articles 1 - 30 of 77

Full-Text Articles in Engineering

Exploring Human Aging Proteins Based On Deep Autoencoders And K-Means Clustering, Sondos M. Hammad, Mohamed Talaat Saidahmed, Elsayed A. Sallam, Reda Elbasiony Mar 2024

Journal of Engineering Research

Aging significantly affects human health and the overall economy, yet understanding of the underlying molecular mechanisms remains limited. Among all human genes, almost three hundred and five have been linked to human aging. While certain subsets of these genes or specific aging-related genes have been extensively studied, there has been a lack of comprehensive examination encompassing the entire set of aging-related genes. Here, the main objective is to overcome this limited understanding with an innovative approach that combines the capabilities of deep learning, particularly a One-Dimensional Deep AutoEncoder (1D-DAE), followed by the K-means clustering technique as a means of unsupervised learning. …


Meta-iCVI: Ensemble Validity Metrics For Concise Labeling Of Correct, Under- Or Over-Partitioning In Streaming Clustering, Niklas M. Melton, Sasha A. Petrenko, Donald C. Wunsch Jan 2024

Electrical and Computer Engineering Faculty Research & Creative Works

Understanding the performance and validity of clustering algorithms is both challenging and crucial, particularly when clustering must be done online. Until recently, most validation methods have relied on batch calculation and have required considerable human expertise in their interpretation. Improving real-time performance and interpretability of cluster validation, therefore, continues to be an important theme in unsupervised learning. Building upon previous work on incremental cluster validity indices (iCVIs), this paper introduces the Meta-iCVI as a tool for explainable and concise labeling of partition quality in online clustering. Leveraging a time-series classifier and data-fusion techniques, the Meta-iCVI combines the outputs …


Multiple Imputation For Robust Cluster Analysis To Address Missingness In Medical Data, Arnold Harder, Gayla R. Olbricht, Godwin Ekuma, Daniel B. Hier, Tayo Obafemi-Ajayi Jan 2024

Mathematics and Statistics Faculty Research & Creative Works

Cluster analysis has been applied to a wide range of problems as an exploratory tool to enhance knowledge discovery. Clustering aids disease subtyping, i.e., identifying homogeneous patient subgroups, in medical data. Missing data is a common problem in medical research and could bias clustering results if not properly handled. Yet multiple imputation has been under-utilized to address missingness when clustering medical data. Its limited integration in clustering of medical data, despite the known advantages and benefits of multiple imputation, could be attributed to many factors. These include methodological complexity, difficulties in pooling results to obtain a consensus clustering, uncertainty regarding …


Static Malware Family Clustering Via Structural And Functional Characteristics, David George, Andre Mauldin, Josh Mitchell, Sufiyan Mohammed, Robert Slater Aug 2023

SMU Data Science Review

Static and dynamic analyses are the two primary approaches to analyzing malicious applications. The primary distinction between the two is that the application is analyzed without execution in static analysis, whereas the dynamic approach executes the malware and records the behavior exhibited during execution. Although each approach has advantages and disadvantages, dynamic analysis has been more widely accepted and utilized by the research community whereas static analysis has not seen the same attention. This study aims to apply advancements in static analysis techniques to demonstrate the identification of fine-grained functionality, and show, through clustering, how malicious applications may be grouped …


Analyzing Ground Motion Records With CVI Fuzzy ART, Dustin Tanksley, Xinzhe Yuan, Genda Chen, Donald C. Wunsch Jan 2023

Civil, Architectural and Environmental Engineering Faculty Research & Creative Works

This paper explores using Cluster Validity Indices Fuzzy Adaptive Resonance Theory (CVI Fuzzy ART) to cluster ground motion records (GMRs). Clustering the features extracted from a supervised network trained for predicting the structure damage results in less overfitting from the trained network. Using Cluster Validity Indices (CVIs) to evaluate the clustering gives feedback on how well the data is being classified, allowing further separation of the data. By using CVI Fuzzy ART in combination with features extracted from a trained Convolutional Neural Network (CNN), we were able to form additional clusters in the data. Within the primary clusters, accuracy was …


K-Means Clustering Using Gravity Distance, Ajinkya Vishwas Indulkar Apr 2022

Masters Theses & Specialist Projects

Clustering is an important topic in data modeling. K-means clustering is a well-known partitional clustering algorithm, where a dataset is separated into groups sharing similar properties. Clustering an unbalanced dataset, where some groups have a much larger number of data points than others, is a challenging problem in data modeling. When a K-means clustering algorithm with Euclidean distance is applied to such data, the algorithm fails to form good clusters. Standard K-means tends to split the data evenly into smaller clusters during the clustering process.

We propose a new K-means clustering algorithm to overcome the disadvantage by introducing a different …
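
For reference, the baseline this abstract describes (standard K-means with Euclidean distance) can be sketched as below. This is a minimal illustration, not the gravity-distance variant proposed in the thesis, whose definition is not given in the excerpt. The function name `kmeans`, the `max_iters` parameter, and the choice to initialize centroids from the first `k` points are assumptions made for reproducibility.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def kmeans(points: Sequence[Point], k: int, max_iters: int = 100):
    """Standard (Lloyd-style) K-means with Euclidean distance.

    Assumption: centroids start at the first k points, so runs are deterministic.
    Returns (labels, centroids).
    """
    centroids: List[Point] = [tuple(p) for p in points[:k]]
    labels: List[int] = []
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Two well-separated groups of three points each.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = kmeans(data, k=2)
```

Because every point is assigned purely by Euclidean distance to the nearest centroid, a very large group can be split across several centroids, which is the behavior on unbalanced data that the abstract points to.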


Applications Of Unsupervised Machine Learning In Autism Spectrum Disorder Research: A Review, Chelsea Parlett-Pelleriti, Elizabeth Stevens, Dennis R. Dixon, Erik J. Linstead Jan 2022

Engineering Faculty Articles and Research

Large amounts of autism spectrum disorder (ASD) data are created through hospitals, therapy centers, and mobile applications; however, much of this rich data does not have pre-existing classes or labels. Large amounts of data—both genetic and behavioral—that are collected as part of scientific studies or as part of treatment can provide a deeper, more nuanced insight into both diagnosis and treatment of ASD. This paper reviews 43 papers using unsupervised machine learning in ASD, including k-means clustering, hierarchical clustering, model-based clustering, and self-organizing maps. The aim of this review is to provide a survey of the current uses of …


Topological Hierarchies And Decomposition: From Clustering To Persistence, Kyle A. Brown Jan 2022

Browse all Theses and Dissertations

Hierarchical clustering is a class of algorithms commonly used in exploratory data analysis (EDA) and supervised learning. However, they suffer from some drawbacks, including the difficulty of interpreting the resulting dendrogram, arbitrariness in the choice of cut to obtain a flat clustering, and the lack of an obvious way of comparing individual clusters. In this dissertation, we develop the notion of a topological hierarchy on recursively-defined subsets of a metric space. We look to the field of topological data analysis (TDA) for the mathematical background to associate topological structures such as simplicial complexes and maps of covers to clusters in …


Constructing Frameworks For Task-Optimized Visualizations, Ghulam Jilani Abdul Rahim Quadri Oct 2021

USF Tampa Graduate Theses and Dissertations

Visualization is crucial in today’s data-driven world to augment and enhance human understanding and decision-making. Effective visualizations must support accuracy in visual task performance and expressive data communication. Effective visualization design depends on the visual channels used, chart types, or visual tasks. However, design choices and visual judgment are co-related, and effectiveness is not one-dimensional, leading to a significant need to understand the intersection of these factors to create optimized visualizations. Hence, constructing frameworks that consider both design decisions and the task being performed enables optimizing visualization design to maximize efficacy. This dissertation describes experiments, techniques, and user studies to …


A Quantitative Validation Of Multi-Modal Image Fusion And Segmentation For Object Detection And Tracking, Nicholas Lahaye, Michael J. Garay, Brian D. Bue, Hesham El-Askary, Erik Linstead Jun 2021

Mathematics, Physics, and Computer Science Faculty Articles and Research

In previous works, we have shown the efficacy of using Deep Belief Networks, paired with clustering, to identify distinct classes of objects within remotely sensed data via cluster analysis and qualitative analysis of the output data in comparison with reference data. In this paper, we quantitatively validate the methodology against datasets currently being generated and used within the remote sensing community, as well as show the capabilities and benefits of the data fusion methodologies used. The experiments run take the output of our unsupervised fusion and segmentation methodology and map them to various labeled datasets at different levels of global …


Can Generative Adversarial Networks Help Us Fight Financial Fraud?, Sean Mciver Jan 2021

Dissertations

Transactional fraud datasets exhibit extreme class imbalance. Learners cannot make accurate generalizations without sufficient data. Researchers can account for imbalance at the data level, algorithmic level or both. This paper focuses on techniques at the data level. We evaluate the evidence of the optimal technique and potential enhancements. Global fraud losses totalled more than 80 % of the UK’s GDP in 2019. The improvement of preprocessing is inherently valuable in fighting these losses. Synthetic minority oversampling technique (SMOTE) and extensions of SMOTE are currently the most common preprocessing strategies. SMOTE oversamples the minority classes by randomly generating a point between …


Texture-Driven Image Clustering In Laser Powder Bed Fusion, Alexander H. Groeger Jan 2021

Browse all Theses and Dissertations

The additive manufacturing (AM) field is striving to identify anomalies in laser powder bed fusion (LPBF) using multi-sensor in-process monitoring paired with machine learning (ML). In-process monitoring can reveal the presence of anomalies but creating a ML classifier requires labeled data. The present work approaches this problem by printing hundreds of Inconel-718 coupons with different processing parameters to capture a wide range of process monitoring imagery with multiple sensor types. Afterwards, the process monitoring images are encoded into feature vectors and clustered to isolate groups in each sensor modality. Four texture representations were learned by training two convolutional neural network …


Clustered Mobile Data Collection In WSNs: An Energy-Delay Trade-Off, İzzet Fatih Şentürk Jan 2021

Turkish Journal of Electrical Engineering and Computer Sciences

Wireless sensor networks enable monitoring remote areas with limited human intervention. However, network connectivity between sensor nodes and the base station (BS) may not always be possible due to the limited transmission range of the nodes. In such a case, one or more mobile data collectors (MDCs) can be employed to visit nodes for data collection. If multiple MDCs are available, it is desirable to minimize the energy cost of mobility while distributing the cost among the MDCs in a fair manner. Despite the availability of various clustering algorithms, there is no one-size-fits-all clustering solution when different requirements …


An Explainable And Statistically Validated Ensemble Clustering Model Applied To The Identification Of Traumatic Brain Injury Subgroups, Dacosta Yeboah, Louis Steinmeister, Daniel B. Hier, Bassam Hadi, Donald C. Wunsch, Gayla R. Olbricht, Tayo Obafemi-Ajayi Sep 2020

Electrical and Computer Engineering Faculty Research & Creative Works

We present a framework for an explainable and statistically validated ensemble clustering model applied to Traumatic Brain Injury (TBI). The objective of our analysis is to identify patient injury severity subgroups and key phenotypes that delineate these subgroups using varied clinical and computed tomography data. Explainable and statistically-validated models are essential because a data-driven identification of subgroups is an inherently multidisciplinary undertaking. In our case, this procedure yielded six distinct patient subgroups with respect to mechanism of injury, severity of presentation, anatomy, psychometric, and functional outcome. This framework for ensemble cluster analysis fully integrates statistical methods at several stages of …


Slashing Quality Index Modeling And Simulation Based On Data Dispersion Clustering, Yuxian Zhang, Xiaoyi Qian, Dong Xiao, Jianhui Wang Aug 2020

Journal of System Simulation

Abstract: To address the sensitivity of typical partitioning clustering algorithms to noise and outliers, a clustering algorithm based on data dispersion was proposed. The data dispersion was defined and introduced to a non-Euclidean distance. The similarity metric was established, and the data clustering was realized. The optimal clustering number was obtained by the validity function based on improved partition coefficient. Then the proposed clustering algorithm was applied to quality index model in slashing process. A size add-on quality index model was built by radial basis function neural networks. The node number of hidden layer was determined and the center …


Key Technologies Of Precaution And Prediction Of Abnormal Spatial-Temporal Trajectory: A Review Of Recent Advances, Gongda Qiu, He Ming, Yang Jie, Yuting Cao, Jihong Sun Jun 2020

Journal of System Simulation

Abstract: The ex-post disposition of a major incident, which is expected to transform into prediction and precaution of abnormal behavior, is increasingly unable to meet the urgent needs of the society. The rapid development and popularization of sensor network and positioning technology lay the foundation for mining spatial-temporal trajectory data. With the key objective of prediction and precaution of abnormal trajectory based on big data mining, the future research directions and prospects on trajectory clustering and recognition are analyzed, discussed and elaborated in this paper. Temporal trajectory prediction applied in prediction and precaution of abnormal spatial-temporal trajectory is also presented, providing a reference for further research on …


Optimizing Cluster Sets For The Scan Statistic Using Local Search, James Shulgan Jan 2020

Graduate Research Theses & Dissertations

In recent years, scattering sensors to produce wireless sensor networks (WSN) has been proposed for detecting localized events in large areas. Because sensor measurements are noisy, the WSN needs to use statistical methods such as the scan statistic. The scan statistic groups measurements into various clusters, computes a cluster statistic for each cluster, and decides that an event has happened if any of the statistics exceeds a threshold. Previous researchers have investigated the performance of the scan statistic to detect events; however, little attention was given to the optimization of which clusters the scan statistic should use. Using the scan …
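
The decision rule described in this abstract (group measurements into clusters, compute one statistic per cluster, and declare an event if any statistic exceeds a threshold) can be sketched as follows. The excerpt does not say which cluster statistic or threshold the thesis uses, so the mean used here, the function name `scan_statistic`, the cluster names, and all sample values are illustrative assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def scan_statistic(
    measurements: Sequence[float],
    clusters: Dict[str, List[int]],
    threshold: float,
) -> Tuple[bool, Dict[str, float]]:
    """Return (event_detected, per-cluster statistics).

    `clusters` maps a cluster name to the indices of the sensors it contains.
    Assumption: the cluster statistic is the mean of the member measurements.
    """
    stats: Dict[str, float] = {}
    for name, indices in clusters.items():
        values = [measurements[i] for i in indices]
        stats[name] = sum(values) / len(values)
    # An event is declared if any cluster statistic exceeds the threshold.
    detected = any(s > threshold for s in stats.values())
    return detected, stats


# Hypothetical noisy readings from six sensors, grouped into two clusters.
readings = [0.1, -0.2, 0.0, 2.1, 1.9, 0.3]
clusters = {"west": [0, 1, 2], "east": [3, 4, 5]}
detected, stats = scan_statistic(readings, clusters, threshold=1.0)
```

In this sketch the choice of `clusters` is fixed by hand; which clusters to scan is the design question the abstract says has received little attention.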


Development Of A Modeling Algorithm To Predict Lean Implementation Success, Richard Charles Barclay Jan 2020

Doctoral Dissertations

Lean has become a common term and goal in organizations throughout the world. The approach of eliminating waste and continuous improvement may seem simple on the surface but can be more complex when it comes to implementation. Some firms implement lean with great success, getting complete organizational buy-in and realizing the efficiencies foundational to lean. Other organizations struggle to implement lean, never able to get the buy-in or traction needed to really institute the sort of cultural change that is often needed to implement change. It would be beneficial to have a tool that organizations could use to assess their …


BIBSQLQC: Brown Infomax Boosted SQL Query Clustering Algorithm To Detect Anti-Patterns In The Query Log, Vinothsaravanan Ramakrishnan, Palanisamy Chenniappan Jan 2020

Turkish Journal of Electrical Engineering and Computer Sciences

Discovery of antipatterns from an arbitrary SQL query log depends on the static code analysis used to enhance the quality and performance of software applications. The existence of antipatterns reduces the quality and leads to redundant SQL statements. SQL log includes a large load on the database and it is difficult for an analyst to extract large patterns in a minimal time. Existing techniques that discover antipatterns in SQL queries face numerous challenges in discovering the normal sequences of queries within the log. In order to discover the antipatterns in the log, an efficient technique called Brown infomax …


An Efficient Storage-Optimizing Tick Data Clustering Model, Haleh Amintoosi, Masood Niazi Torshiz, Yahya Forghani, Sara Alinejad Jan 2020

Turkish Journal of Electrical Engineering and Computer Sciences

Tick data is a large volume of data, related to a phenomenon such as stock market or weather change, with data values changing rapidly over time. An important issue is to store tick data table in a way that it occupies minimum storage space while at the same time it can provide fast execution of queries. In this paper, a mathematical model is proposed to partition tick data tables into clusters with the aim of minimizing the required storage space. The genetic algorithm is then used to solve the mathematical model which is indeed a clustering model. The proposed method …


Spatiotemporal Mode Analysis Of Urban Dockless Shared Bikes Based On Point Of Interests Clustering, Zhang Fang, Bin Chen, Yanghua Tang, Dong Jian, Chuan Ai, Xiaogang Qiu Dec 2019

Journal of System Simulation

Abstract: Dockless shared bikes in cities have developed rapidly, and their convenience, economy, and efficiency have been widely welcomed. The digital footprint they generate reveals the movement of people in time and space within the city, which makes it possible to quantify the activities of people in the city using shared bikes. In this paper, based on the collected shared bikes data of Beijing, a clustering method based on the point of interests is proposed to divide the urban space, so as to construct a mobile network of urban shared bikes, and analyze the spatiotemporal mode of bike …


Additional Arguments In Favor Of True Quaternary Fission Of Low Excited Actinides, D. V. Kamanin, A. A. Alexandrov, I. A. Alexandrova, Z. I. Goryainova, E. A. Kuznetsova, A. O. Strekalovsky, O. V. Strekalovsky, V. E. Zhuchko, Yu. V. Pyatkov, A. V. Tomas, V. Malaza Jun 2019

Eurasian Journal of Physics and Functional Materials

Specific linear structures in the region of a large missing mass in the fission fragments mass correlation distributions were revealed due to effective cleaning of this region from the background linked with scattered fragments. One of the most pronounced structures looks like a rectangle bounded by the magic nuclei. The fission events aggregated in the rectangle show a very low total kinetic energy. We propose a possible scenario for the formation and decay of the multi-cluster pre-scission configuration decisive for the experimental findings.


CURE: Flexible Categorical Data Representation By Hierarchical Coupling Learning, Songlei Jian, Guansong Pang, Longbing Cao, Kai Lu, Hang Gao May 2019

Research Collection School Of Computing and Information Systems

The representation of categorical data with hierarchical value coupling relationships (i.e., various value-to-value cluster interactions) is very critical yet challenging for capturing complex data characteristics in learning tasks. This paper proposes a novel and flexible coupled unsupervised categorical data representation (CURE) framework, which not only captures the hierarchical couplings but is also flexible enough to be instantiated for contrastive learning tasks. CURE first learns the value clusters of different granularities based on multiple value coupling functions and then learns the value representation from the couplings between the obtained value clusters. With two complementary value coupling functions, CURE is instantiated into …


A New Model To Determine The Hierarchical Structure Of The Wireless Sensor Networks, Resmiye Nasiboğlu, Zülküf Tekin Erten Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

Wireless sensor networks are one of the rising areas of scientific research. The common purpose of these investigations is usually to construct an optimal network structure that prolongs the network's lifetime. In this study, a new model has been proposed to construct a hierarchical structure of wireless sensor networks. Methods used in the model to determine clusters and appropriate cluster heads are k-means clustering and fuzzy inference system (FIS), respectively. The weighted averaging based on levels (WABL) defuzzification method is used to calculate crisp outputs of the FIS. A new theorem for calculation of WABL values has been proved in order to …


Evaluating The Attributes Of Remote Sensing Image Pixels For Fast K-Means Clustering, Ali Sağlam, Nurdan Baykan Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

The clustering process is an important stage for many data mining applications. In this process, data elements are grouped according to their similarities. One of the best-known clustering algorithms is the k-means algorithm. The algorithm initially requires the number of clusters as a parameter and runs iteratively. Like many image processing applications, remote sensing image processing applications usually need a clustering stage. Remote sensing images provide more information about the environments with the development of the multispectral sensor and laser technologies. In the dataset used in this paper, the infrared (IR) and the digital surface maps (DSM) are also …


Exploring Bigram Character Features For Arabic Text Clustering, Dia Eddin Abuzeina Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

The vector space model (VSM) is an algebraic model that is widely used for data representation in text mining applications. However, the VSM poses a critical challenge, as it requires a high-dimensional feature space. Therefore, many feature selection techniques, such as employing roots or stems (i.e. words without infixes and prefixes, and/or suffixes) instead of using complete word forms, are proposed to tackle this space challenge problem. Recently, the literature shows that one more basic unit feature can be used to handle the textual features, which is the two-neighboring character form that we call microword. To evaluate this feature type, …


Efficient Hierarchical Temporal Segmentation Method For Facial Expression Sequences, Jiali Bian, Xue Mei, Yu Xue, Liang Wu, Yao Ding Jan 2019

Turkish Journal of Electrical Engineering and Computer Sciences

Temporal segmentation of facial expression sequences is important to understand and analyze human facial expressions. It is, however, challenging to deal with the complexity of facial muscle movements by finding a suitable metric to distinguish among different expressions and to deal with the uncontrolled environmental factors in the real world. This paper presents a two-step unsupervised segmentation method composed of rough segmentation and fine segmentation stages to compute the optimal segmentation positions in video sequences to facilitate the segmentation of different facial expressions. The proposed method performs localization of facial expression patches to aid in recognition and extraction of specific …


Scalable Clustering For Immune Repertoire Sequence Analysis, Prem Bhusal Jan 2019

Browse all Theses and Dissertations

The development of the next-generation sequencing technology has enabled systems immunology researchers to conduct detailed immune repertoire analysis at the molecule level. Large sequence datasets (e.g., millions of sequences) are being collected to comprehensively understand how the immune system of a patient evolves over different stages of disease development. A recent study has shown that the hierarchical clustering (HC) algorithm gives the best results for B-cell clones analysis - an important type of immune repertoire sequencing (IR-Seq) analysis. However, due to the inherent complexity, the classical hierarchical clustering algorithm does not scale well to large sequence datasets. Surprisingly, no algorithms …


Clustering Method Based On Graph Data Model And Reliability Detection, Yanyun Cheng, Huisong Bian, Changsheng Bian Jun 2018

Journal of System Simulation

Abstract: For data in feature space, traditional clustering algorithms can perform cluster analysis directly. High-dimensional spatial data cannot achieve intuitive and effective graphical visualization of clustering results in 2D plane. Graph data can clearly reflect the similarity relationship between objects. According to the distance of the data objects, the feature space data are modeled as graph data by iteration. Cluster analysis based on modularity is carried out on the modeling graph data. The two-dimensional visualization of non-spherical-shape distribution data cluster and result is achieved. The concept of credibility of the clustering result is proposed, and a method is proposed, …