Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

2020

Classification

Articles 1 - 30 of 31

Full-Text Articles in Physical Sciences and Mathematics

Signature Identification And Verification Systems: A Comparative Study On The Online And Offline Techniques, Nehal Hamdy Al-Banhawy, Heba Mohsen, Neveen I. Ghali Prof. Dec 2020

Future Computing and Informatics Journal

Handwritten signature identification and verification has become an active area of research in recent years. Handwritten signature identification systems identify a user among all users enrolled in the system, while handwritten signature verification systems authenticate a user by comparing a specific signature with that user's signature stored in the system. This paper presents a review of commonly used preprocessing, feature extraction, and classification techniques in signature identification and verification systems, in addition to a comparison between the systems implemented in the literature for identification techniques and verification techniques in online and …


Metric Learning Via Linear Embeddings For Human Motion Recognition, Byoungdoo Kong Dec 2020

Masters Theses

We consider the application of Few-Shot Learning (FSL) and dimensionality reduction to the problem of human motion recognition (HMR). The structure of human motion has unique characteristics such as its dynamic and high-dimensional nature. Recent research on human motion recognition uses deep neural networks with multiple layers. Most importantly, large datasets will need to be collected to use such networks to analyze human motion. This process is both time-consuming and expensive since a large motion capture database must be collected and labeled. Despite significant progress having been made in human motion recognition, state-of-the-art algorithms still misclassify actions because of characteristics …


Automatically Classifying Non-Functional Requirements With Feature Extraction And Supervised Machine Learning Techniques, Mahtab Ezzatikarami Dec 2020

Electronic Thesis and Dissertation Repository

Abstract. Context and Motivation: Non-functional requirements (NFRs) of a system need to be classified into different types such as usability, performance, etc. This would enable stakeholders to ensure the completeness of their work by extracting specific NFRs related to their expertise. Question/Problem: Because of the size and complexity of requirement specification documents, the manual classification of NFRs is time-consuming, labour-intensive, and error-prone. We thus need an automated solution that can provide a highly accurate and efficient categorization of NFRs. Principal ideas/results: In this investigation, using natural language processing and supervised machine learning (SML) techniques, we investigate with feature extraction techniques …


A Systematic Mapping Study On The Risk Factors Leading To Type II Diabetes Mellitus, Karar N. J Musafer, Fahrul Zaman Huyop, Mufeed J Ewadh, Eko Supriyanto, Mohammad Rava Oct 2020

Karbala International Journal of Modern Science

Diabetes is one of the most common diseases that has had devastating effects on the general population. It is also among the most popular research trends in modern medicine. Thus, due to the complexity and desirability of this particular affliction, there is a lot of demand towards understanding this disease better, so that it can pave the way towards better solutions in combating diabetes. The aim of this review is to provide a categorization of the risk factors leading to Type II Diabetes. In order to provide a justification for the type of diabetes, an explanation is provided which covers …


Wait For It: Identifying 'On-Hold' Self-Admitted Technical Debt, Rungroj Maipradit, Christoph Treude, Hideaki Hata, Kenichi Matsumoto Sep 2020

Research Collection School Of Computing and Information Systems

Self-admitted technical debt refers to situations where a software developer knows that their current implementation is not optimal and indicates this using a source code comment. In this work, we hypothesize that it is possible to develop automated techniques to understand a subset of these comments in more detail, and to propose tool support that can help developers manage self-admitted technical debt more effectively. Based on a qualitative study of 333 comments indicating self-admitted technical debt, we first identify one particular class of debt amenable to automated management: on-hold self-admitted technical debt (on-hold SATD), i.e., debt which contains a condition …


A Unified Framework For Sparse Online Learning, Peilin Zhao, Dayong Wong, Pengcheng Wu, Steven C. H. Hoi Aug 2020

Research Collection School Of Computing and Information Systems

The amount of data in our society has been exploding in the era of big data. This article aims to address several open challenges in big data stream classification. Many existing studies in data mining literature follow the batch learning setting, which suffers from low efficiency and poor scalability. To tackle these challenges, we investigate a unified online learning framework for the big data stream classification task. Different from the existing online data stream classification techniques, we propose a unified Sparse Online Classification (SOC) framework. Based on SOC, we derive a second-order online learning algorithm and a cost-sensitive sparse online …


Development And Identification Of Metrics To Predict The Impact Of Dimension Reduction Techniques On Classical Machine Learning Algorithms For Still Highway Images, Wasim Akram Khan Aug 2020

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

We are witnessing an influx of data - images, texts, video, etc. Their high dimensionality and large volume make it challenging to apply machine learning to obtain actionable insight. This thesis explores several aspects pertaining to dimensional reduction: dimension reduction methods, metrics to measure distortion, image preprocessing, etc. Faster training and inference time on reduced data and smaller models which can be deployed on commodity hardware are a critical advantage of dimension reduction. For this study, classical machine learning methods were explored owing to their solid mathematical foundation and interpretability.

The dataset used is a time series of images from …


Computational Astronomy: Classification Of Celestial Spectra Using Machine Learning Techniques, Gayatri Milind Hungund May 2020

Master's Projects

Light-years beyond Earth there exist many unknown and unexplored stars and galaxies that need to be studied in order to support the Big Bang Theory and to make important astronomical discoveries in the quest to know the unknown. Sophisticated devices and high-power computational resources are now deployed for data gathering and analysis. These devices produce massive amounts of data from astronomical surveys, usually in terabytes or petabytes. It is exhausting to process this data and determine the findings in a short period of time. Many details can be missed …


Randomized And Evolutionary Approaches To Dataset Characterization, Feature Weighting, And Sampling In K-Nearest Neighbors, Suryoday Basak May 2020

Computer Science and Engineering Theses

K-Nearest Neighbors (KNN) has remained one of the most popular methods for supervised machine learning tasks. However, its performance often depends on the characteristics of the dataset and on appropriate feature scaling. In this thesis, characteristics of a dataset that make it suitable for being used within KNN are explored. As part of this, two new measures for dataset dispersion, called mean neighborhood target variance (MNTV), and mean neighborhood target entropy (MNTE) are developed to help determine the performance we expect while using KNN regressors and classifiers, respectively. It is empirically demonstrated that these measures of dispersion can be indicative …


An Exploration Of Methods For Classifying Air-Written Letters From The Spanish Alphabet, Manuel Serna-Aguilera May 2020

Computer Science and Computer Engineering Undergraduate Honors Theses

The ability to recognize human activity, especially air-writing, is an interesting challenge as one could identify any letter from many languages. I intend to investigate this problem of air-writing, but with the added twist of including the following letters from the Spanish alphabet: Á, É, Í, Ó, Ú, Ü, and Ñ. With this new alphabet, I set out to see what kinds of classifiers work best and on what kinds of data, since letters can be represented in multiple ways.

My tracking system will consist of a regular camera and a subject who will draw with a brightly colored marker …


Towards Multi-Modal Data Classification, Henry Ng May 2020

UNLV Theses, Dissertations, Professional Papers, and Capstones

A feature fusion multi-modal neural network (MMN) is a network that combines different modalities at the feature level to perform a specific task. In this paper, we study the problem of training the fusion procedure for MMNs. A recent study has found that training a multi-modal network that incorporates late fusion produces a network that has not learned the proper parameters for feature extraction. These late fusion models perform very well during training but fall short of their single-modality counterparts when testing. We hypothesize that jointly trained MMNs have a weight space that is too large for effective training. To …


Novel Inference Methods For Generalized Linear Models Using Shrinkage Priors And Data Augmentation., Arinjita Bhattacharyya May 2020

Electronic Theses and Dissertations

Generalized linear models have broad applications in biostatistics and sociology. In a regression setup, the main target is to find a relevant set of predictors out of a large collection of covariates. Sparsity is the assumption that only a few of these covariates in a regression setup have a meaningful correlation with an outcome variate of interest. Sparsity is incorporated by regularizing the irrelevant slopes towards zero without changing the relevant predictors and keeping the resulting inferences intact. Frequentist variable selection and sparsity are addressed by popular techniques such as Lasso and Elastic Net. Bayesian penalized regression can tackle the curse of …


A Survey Of Feature Extraction And Fusion Of Deep Learning For Detection Of Abnormalities In Video Endoscopy Of Gastrointestinal-Tract, Hussam Ali, Muhammad Sharif, Mussarat Yasmin, Mubashir Husain Rehmani, Farhan Riaz Apr 2020

Publications

A standard screening procedure involves video endoscopy of the Gastrointestinal tract. It is a less invasive method which is practiced for early diagnosis of gastric diseases. Manual inspection of a large number of gastric frames is an exhaustive, time-consuming task, and requires expertise. Conversely, several computer-aided diagnosis systems have been proposed by researchers to cope with the dilemma of manual inspection of the massive volume of frames. This article gives an overview of different available alternatives for automated inspection, detection, and classification of various GI abnormalities. Also, this work elaborates techniques associated with content-based image retrieval and automated systems for …


Brain Disease Detection From EEGs: Comparing Spiking And Recurrent Neural Networks For Non-Stationary Time Series Classification, Hristo Stoev Jan 2020

Dissertations

Modeling non-stationary time series data is a difficult problem area in AI because the statistical properties of the data change as the time series progresses. This complicates the classification of non-stationary time series, which is a method used in the detection of brain diseases from EEGs. Various techniques have been developed in the field of deep learning for tackling this problem, with recurrent neural network (RNN) approaches utilising long short-term memory (LSTM) architectures achieving a high degree of success. This study implements a new, spiking neural network-based approach to time series classification for the purpose of …


A Description Of A Humans Knowledge Using Artificial Intelligence, Dj Price Jan 2020

Mahurin Honors College Capstone Experience/Thesis Projects

There currently does not exist a way to easily view the relationships between a collection of written items (e.g. sports articles, diary entries, research papers). In recent years, novel machine learning methods have been developed which are very good at extracting semantic relationships from large numbers of documents. One of them is the (unsupervised) machine learning model Doc2Vec which constructs vectors for documents. The research project detailed in this paper uses this and other already existing algorithms to analyze the relationship between pieces of text. We set forth a broader ambition for this project before discussing the use and need …


An Analysis Of The Success Of Farmers Markets In Kentucky Using Logistic Regression And Support Vector Machines, Jeron Russell Jan 2020

Mahurin Honors College Capstone Experience/Thesis Projects

The purpose of this research is to look at the relationship that market-specific, economic, and demographic variables have with the success of farmers markets in Kentucky. It additionally seeks to build a tool for predicting farmers market success that could be used by policy makers to aid in decision-making processes concerning farmers markets. Logistic regression and Support Vector Machines (SVMs) are used on data acquired from the Kentucky Department of Agriculture and the American Community Survey in order to analyze the data in a traditional statistical approach as well as a machine learning approach. The results included an SVM model …


A Novel Genome Analysis Method With The Entropy-Based Numerical Technique Using Pretrained Convolutional Neural Networks, Bi̇hter Daş, Suat Toraman, İbrahi̇m Türkoğlu Jan 2020

Turkish Journal of Electrical Engineering and Computer Sciences

The identification of DNA sequences as exon and intron is a common problem in genome analysis. The methods used for feature extraction and the mapping techniques used for the digitization of sequences directly affect the solution of this problem. The existing mapping techniques are not enough to detect coding and noncoding regions in some genomes because the digital representation of each base in a DNA sequence with an integer does not fully reflect the structure of an original DNA sequence. In the entropy-based mapping technique, we could overcome this problem because the technique deepens distinction rates of exon regions, and better reflects …


Classification Of Animal Sound Using Convolutional Neural Network, Neha Singh Jan 2020

Dissertations

Recently, labeling of acoustic events has emerged as an active topic covering a wide range of applications. High-level semantic inference can be conducted based on main audio effects to facilitate various content-based applications for analysis, efficient recovery and content management. This paper proposes a flexible convolutional neural network-based framework for animal audio classification. The work takes inspiration from various deep neural networks developed recently for multimedia classification. The model is driven by the idea of identifying the animal sound in the audio file by forcing the network to pay attention to the core audio effect present in the audio to generate Mel-spectrogram. …


Customer Churn Prediction, Deepshikha Wadikar Jan 2020

Dissertations

Identifying churned customers plays an essential role in the functioning and growth of any business. It can help the business understand the reasons for the churn and plan its market strategies accordingly to enhance growth. This research is aimed at developing a machine learning model that can precisely predict the churned customers from the total customers of a Credit Union financial institution. Quantitative and deductive research strategies are employed to build a supervised machine learning model that addresses the class imbalance problem, handles feature selection, and efficiently predicts the …


An Examination Of The SMOTE And Other SMOTE-Based Techniques That Use Synthetic Data To Oversample The Minority Class In The Context Of Credit-Card Fraud Classification, Eduardo Parkinson De Castro Jan 2020

Dissertations

This research project investigates some of the different sampling techniques that generate and use synthetic data to oversample the minority class as a means of handling the imbalanced distribution between non-fraudulent (majority class) and fraudulent (minority class) classes in a credit-card fraud dataset. The purpose of the research project is to assess the effectiveness of these techniques in the context of fraud detection, where the data is highly imbalanced and cost-sensitive. Machine learning tasks that require learning from highly imbalanced datasets are difficult, since many of the traditional learning algorithms are not designed to cope …
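
The core idea of synthetic oversampling can be sketched in a few lines. The following is a minimal illustration of the basic SMOTE interpolation step, not the implementation evaluated in the dissertation: each synthetic point is placed at a random position on the line segment between a minority sample and one of its nearest minority-class neighbours. The function name `smote` and its parameters are hypothetical, chosen only for this example.

```python
import random

def smote(minority, n_synthetic, k=2, seed=0):
    """Generate n_synthetic points by interpolating between minority samples.

    minority: list of feature vectors (lists of floats) from the minority class.
    k: number of nearest minority neighbours considered for each base point.
    (Names and defaults are assumptions made for this sketch.)
    """
    rng = random.Random(seed)

    def sqdist(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority-class neighbours of the base point.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sqdist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic
```

Calling `smote(fraud_rows, n_synthetic=len(legit_rows) - len(fraud_rows))` on hypothetical `fraud_rows` and `legit_rows` lists would balance the two classes; in practice the oversampling should be applied to the training split only, so that synthetic points never leak into the test data.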


Prediction Of Sudden Cardiac Death Using Ensemble Classifiers, Ayman Momtaz El-Geneidy Jan 2020

CCE Theses and Dissertations

Sudden Cardiac Death (SCD) is a medical problem that is responsible for over 300,000 deaths per year in the United States and millions worldwide. SCD is defined as death occurring within one hour of the onset of acute symptoms, an unwitnessed death in the absence of pre-existing progressive circulatory failure or other causes of death, or death during attempted resuscitation. Sudden death due to cardiac reasons is a leading cause of death among Congestive Heart Failure (CHF) patients. The use of Electronic Medical Records (EMR) systems has made a wealth of medical data available for research and analysis. Supervised …


Development Of Criteria For Mobile Device Cybersecurity Threat Classification And Communication Standards (CTC&CS), Emmanuel Jigo Jan 2020

CCE Theses and Dissertations

The increasing use of mobile devices and unfettered access to cyberspace have introduced new threats to users. Mobile device users are continually being targeted by cybersecurity threats via vectors such as public information sharing on social media, user surveillance (geolocation, camera, etc.), phishing, malware, spyware, trojans, and keyloggers. Users are often uninformed about the cybersecurity threats posed by mobile devices. Users are held responsible for the security of their devices, which includes taking precautions against cybersecurity threats. In recent years, financial institutions have been passing the costs associated with fraud on to users because of the lack of security.

The …


A Computational Method For The Image Segmentation Of Pigmented Skin Lesions, Kaila M. Piscitelli Jan 2020

Senior Projects Spring 2020

Senior Project submitted to The Division of Science, Mathematics and Computing of Bard College.


Multi-Label Classification Models For Heterogeneous Data: An Ensemble-Based Approach., Jose Maria Moyano Murillo Jan 2020

Theses and Dissertations

In recent years, multi-label classification has gained the attention of the scientific community given its ability to solve real-world problems where each instance of the dataset may be associated with several class labels simultaneously, such as multimedia categorization or medical problems.

The first objective of this dissertation is to perform a thorough review of the state-of-the-art ensembles of multi-label classifiers (EMLCs). Its aim is twofold: 1) study state-of-the-art ensembles of multi-label classifiers and categorize them, proposing a novel taxonomy; and 2) perform an experimental study to give some tips and guidelines for selecting the method that performs best according to …


Disaster Damage Categorization Applying Satellite Images And Machine Learning Algorithm, Farinaz Sabz Ali Pour, Adrian Gheorghe Jan 2020

Engineering Management & Systems Engineering Faculty Publications

Special information has a significant role in disaster management. Land cover mapping can detect short- and long-term changes and monitor vulnerable habitats. It is an effective evaluation to include in the disaster management system to protect conservation areas. The critical visual and statistical information presented to decision-makers can help in mitigation or adaptation before a threshold is crossed. This paper aims to contribute to both academia and practice by offering a potential solution to enhance the effectiveness of disaster data sources. The key research question that the authors try to answer in this paper is how …


Detection Of Hand Osteoarthritis From Hand Radiographs Using Convolutional Neural Networks With Transfer Learning, Kemal Üreten, Hasan Erbay, Hadi̇ Hakan Maraş Jan 2020

Turkish Journal of Electrical Engineering and Computer Sciences

Osteoarthritis is the most common type of arthritis. Hand osteoarthritis leads to specific structural changes in the joints, such as asymmetric joint space narrowing and osteophytes (bone spurs). Conventional radiography has traditionally been the primary method of visualizing these structural changes and diagnosing osteoarthritis. We aimed to develop a computerized method that is capable of determining the structural changes seen in radiography of the hand and to assist practitioners in interpreting radiographic changes and diagnosing the disease. In this retrospective study, transfer-learning-based convolutional neural networks were trained on a randomly selected dataset containing 332 radiography images of hands from an …


Dynamically Updated Diversified Ensemble-Based Approach For Handling Concept Drift, Kanu Goel, Shalini Batra Jan 2020

Turkish Journal of Electrical Engineering and Computer Sciences

Concept drift is the phenomenon where underlying data distribution changes over time unexpectedly. Examining such drifts and getting insight into the executing processes at that instance of time is a big challenge. Prediction models should be capable of handling drifts in scenarios where statistical properties show abrupt changes. Various strategies exist in the literature to deal with such challenging scenarios but the majority of them are limited to the identification of a particular kind of drift pattern. The proposed approach uses online drift detection in a diversified adaptive setting with pruning techniques to formulate a concept drift handling approach, named …


Sketic: A Machine Learning-Based Digital Circuit Recognition Platform, Mohamamd Abdel Majeed, Tasneem Almousa, Maysaa Alsalman, Abeer Yosef Jan 2020

Turkish Journal of Electrical Engineering and Computer Sciences

In digital system design, digital logic circuit diagrams are built using interconnects and symbolic representations of the basic logic gates. Constructing such diagrams using free sketches is the first step in the design process. After that the circuit schematic or code has to be generated before being able to simulate the design. While most of the mentioned steps are automated using design automation tools, drafting the schematic circuit and then converting it into a valid format that can be simulated are still done manually due to the lack of robust tools that can recognize the free sketches and incorporate them …


A Random Subspace Based Conic Functions Ensemble Classifier, Emre Çi̇men Jan 2020

Turkish Journal of Electrical Engineering and Computer Sciences

Classifiers overfit when the ratio of data dimensionality to the number of samples in a dataset is high. This problem makes a classification model unreliable. When overfitting occurs, one can achieve high accuracy in training; however, test accuracy is significantly lower than training accuracy. The random subspace method is a practical approach to overcoming the overfitting problem. In random subspace methods, the classification algorithm selects a random subset of the features and trains a classifier function on the selected features. The classification algorithm repeats the process multiple times, and eventually obtains an ensemble of classifier functions. Conic …
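
The random subspace procedure described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's conic-functions classifier: a simple nearest-centroid rule stands in as the base learner, predictions are combined by majority vote, and all function names are hypothetical.

```python
import random
from collections import Counter

def fit_centroids(X, y, features):
    """Base learner (assumed for this sketch): class centroids on selected features."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += row[f]
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_centroids(centroids, features, row):
    """Return the label whose centroid is closest to the row in the subspace."""
    def sqdist(centroid):
        return sum((row[f] - centroid[j]) ** 2 for j, f in enumerate(features))
    return min(centroids, key=lambda label: sqdist(centroids[label]))

def fit_random_subspace(X, y, n_models=15, subspace_size=2, seed=0):
    """Train n_models base learners, each on its own random subset of features."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_models):
        features = rng.sample(range(n_features), subspace_size)
        ensemble.append((features, fit_centroids(X, y, features)))
    return ensemble

def predict_random_subspace(ensemble, row):
    """Combine the base learners' predictions by majority vote."""
    votes = Counter(
        predict_centroids(centroids, features, row)
        for features, centroids in ensemble
    )
    return votes.most_common(1)[0][0]
```

Because each base learner sees only `subspace_size` features, its effective dimensionality-to-sample ratio is lower than that of a model trained on all features, which is the property the abstract relies on to reduce overfitting.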


Revised Polyhedral Conic Functions Algorithm For Supervised Classification, Gürhan Ceylan, Gürkan Öztürk Jan 2020

Turkish Journal of Electrical Engineering and Computer Sciences

In supervised classification, obtaining nonlinear separating functions from an algorithm is crucial for prediction accuracy. This paper analyzes the polyhedral conic functions (PCF) algorithm that generates nonlinear separating functions by only solving simple subproblems. Then, a revised version of the algorithm is developed that achieves better generalization and fast training while maintaining the simplicity and high prediction accuracy of the original PCF algorithm. This is accomplished by making the following modifications to the subproblem: extension of the objective function with a regularization term, relaxation of a hard constraint set and introduction of a new error term. Experimental results show that …