Computer Engineering | Open Access Articles | Digital Commons Network™

FuzzyCSampling: A Hybrid Fuzzy C-Means Clustering Sampling Strategy For Imbalanced Datasets, Abdullah Maraş, Çiğdem Erol Nov 2023

Turkish Journal of Electrical Engineering and Computer Sciences

Classification with imbalanced datasets has recently become one of the most researched areas in machine learning applications, since such datasets lead to low-performing machine learning models. Imbalanced datasets occur when the values of the target variable have an uneven number of examples in a dataset. The most prevalent solutions to imbalanced datasets can be categorized as data preprocessing, ensemble techniques, and cost-sensitive learning. In this article, we propose a new hybrid approach for binary classification, named FuzzyCSampling, which aims to increase model performance by ensembling fuzzy c-means clustering and data sampling solutions. This article compares the proposed approaches' results not only …

Customer Churn Prediction, Deepshikha Wadikar Jan 2020

Dissertations

Identifying churned customers plays an essential role in the functioning and growth of any business. It helps the business understand the reasons for churn and plan its market strategies accordingly to enhance growth. This research aims to develop a machine learning model that can precisely predict the churned customers among the total customers of a Credit Union financial institution. Quantitative and deductive research strategies are employed to build a supervised machine learning model that addresses the class imbalance problem, handles feature selection, and efficiently predicts the …

An Examination Of The SMOTE And Other SMOTE-Based Techniques That Use Synthetic Data To Oversample The Minority Class In The Context Of Credit-Card Fraud Classification, Eduardo Parkinson De Castro Jan 2020

Dissertations

This research project investigates several sampling techniques that generate and use synthetic data to oversample the minority class as a means of handling the imbalanced distribution between non-fraudulent (majority class) and fraudulent (minority class) classes in a credit-card fraud dataset. The purpose of the research project is to assess the effectiveness of these techniques in the context of fraud detection, which involves highly imbalanced and cost-sensitive data. Machine learning tasks that require learning from highly unbalanced datasets are difficult, since many of the traditional learning algorithms are not designed to cope …
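As a rough illustration of what "synthetic data to oversample the minority class" can mean, the following is a minimal SMOTE-style sketch, not the dissertation's implementation. The function name `smote_like_oversample`, the parameter `k`, and the toy `fraud` points are all assumptions made for this example. Each synthetic point is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours.

```python
import random
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_like_oversample(
    minority: List[Sequence[float]], n_new: int, k: int = 3, seed: int = 0
) -> List[List[float]]:
    """Create n_new synthetic minority points by interpolation.

    Hypothetical helper for illustration only: pick a minority sample,
    pick one of its k nearest minority neighbours, and interpolate.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the chosen one.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: squared_distance(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class (made-up values, two features per sample).
fraud = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_like_oversample(fraud, n_new=6)
```

Because every synthetic point lies between two existing minority samples, the new points stay inside the region the minority class already occupies; in practice a maintained library would normally be preferred over a hand-written routine like this.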

Iterative Stochastic Resonance Demodulation Algorithm Of Frequency-Hopping Signal, Haixia Li, Yongfeng Ren, Yuhua Yang, Zhang Baili, Zhumei Tian Jan 2019

Journal of System Simulation

Abstract: To improve the demodulation performance of frequency-hopping signals, this paper proposes a digital receiving algorithm based on bistable stochastic resonance. By sifting the digital samples, the algorithm processes signals at multiple frequencies in one stochastic resonance system. The conversion of channel noise into useful signal is improved through multiple iterations of the signal system. Finally, a local signal is designed to eliminate the frequency distortion in the stochastic resonance system through the relevant calculation. Theoretical analysis and simulation results show that the algorithm can demodulate frequency-hopping signals and that its performance improves as the sampling rate increases. …

Hot Zone Identification: Analyzing Effects Of Data Sampling On Spam Clustering, Rasib Khan, Mainul Mizan, Ragib Hasan, Alan Sprague Jan 2014

Journal of Digital Forensics, Security and Law

Email is the most common and comparatively the most efficient means of exchanging information in today's world. However, given the widespread use of emails in all sectors, they have been the target of spammers since the beginning. Filtering spam emails has now led to critical actions such as forensic activities based on mining spam email. The data mine for spam emails at the University of Alabama at Birmingham is considered to be one of the most prominent resources for mining and identifying spam sources. It is a widely researched repository used by researchers from different global organizations. The usual process …

Sampling: Making Electronic Discovery More Cost Effective, Milton Luoma, Vicki Luoma Jan 2011

Journal of Digital Forensics, Security and Law

With the huge volumes of electronic data subject to discovery in virtually every instance of litigation, time and costs of conducting discovery have become exceedingly important when litigants plan their discovery strategies. Rather than incurring the costs of having lawyers review every document produced in response to a discovery request in search of relevant evidence, a cost effective strategy for document review planning is to use statistical sampling of the database of documents to determine the likelihood of finding relevant evidence by reviewing additional documents. This paper reviews and discusses how sampling can be used to make document review more …
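The idea of sampling a document collection to gauge how much relevant evidence remains can be sketched as follows. This is an assumption-laden illustration, not the paper's procedure: the function `estimate_relevance_rate`, the `is_relevant` callback, the normal-approximation interval, and the toy `corpus` are all hypothetical choices made for this example.

```python
import math
import random
from typing import Callable, List, Tuple


def estimate_relevance_rate(
    documents: List[dict],
    is_relevant: Callable[[dict], bool],
    sample_size: int,
    seed: int = 0,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Estimate the share of relevant documents from a simple random sample.

    Returns (point estimate, lower bound, upper bound) using a normal
    approximation to the binomial; z = 1.96 corresponds to roughly 95%.
    """
    rng = random.Random(seed)
    sample = rng.sample(documents, sample_size)  # sampling without replacement
    hits = sum(1 for doc in sample if is_relevant(doc))
    p = hits / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, max(0.0, p - margin), min(1.0, p + margin)


# Toy corpus: every 20th document is marked relevant (made-up data).
corpus = [{"id": i, "relevant": i % 20 == 0} for i in range(10_000)]
rate, low, high = estimate_relevance_rate(
    corpus, lambda doc: doc["relevant"], sample_size=400
)
```

Here `is_relevant` stands in for a human reviewer's judgment on each sampled document. If the estimated rate and its upper bound are low, reviewing further documents is unlikely to surface much additional evidence, which is the kind of cost trade-off the abstract describes.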

On The Sampling Of Web Images For Learning Visual Concept Classifiers, Shiai Zhu, Gang Wang, Chong-Wah Ngo, Yu-Gang Jiang Jul 2010

Research Collection School Of Computing and Information Systems

Visual concept learning often requires a large set of training images. In practice, however, acquiring noise-free training labels with sufficient positive examples is always expensive. A plausible solution for training data collection is to sample the widely available user-tagged images from social media websites. Given the general belief that the probability of correct tagging is higher than that of incorrect tagging, such a solution often sounds feasible, though it is not without challenges. First, user tags can be subjective and, to a certain extent, ambiguous. For instance, an image tagged with “whales” may simply be a picture of an ocean museum. Learning concept …
