Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2016

Classification

Articles 1 - 28 of 28

Full-Text Articles in Physical Sciences and Mathematics

Seaweed And Seagrass Mapping In Thailand Measured Using Landsat 8 Optical And Textural Image Properties, Satomi Kakuta, Wataru Takeuchi, Anchana Prathep Dec 2016

Journal of Marine Science and Technology

Seaweed and seagrass beds are an important ecosystem in coastal zones. However, they are degrading because of various causes, such as the anthropogenic impacts of coastal development, aquaculture, overharvesting, and climate change. To contribute to the research related to coastal blue carbon and marine biodiversity as well as conservation and sustainable management of natural resources in coastal regions, the spatial distribution of benthic cover derived from satellite images can be the most practical tool for monitoring seaweed and seagrass beds. This study aimed at mapping the latest distribution of seaweed and seagrass in Thailand using Landsat 8 images. Thus, we …


Review Classification, Balraj Aujla Dec 2016

Computer Science and Software Engineering

The goal of this project is to find a way to analyze reviews and determine the sentiment of a review. It uses various machine learning techniques, such as SVMs and Naive Bayes, to achieve its goals. Overall, the purpose is to learn many different machine learning techniques, determine which ones would be useful for the project, and then compare the results. Research is the foremost goal of the project, and it is able to determine the better algorithm for review classification, Naive Bayes or an SVM. In addition, an SVM which actually gave reviews scores rather than just classifying …


A Framework For The Statistical Analysis Of Mass Spectrometry Imaging Experiments, Kyle Bemis Dec 2016

Open Access Dissertations

Mass spectrometry (MS) imaging is a powerful investigation technique for a wide range of biological applications such as molecular histology of tissue, whole body sections, and bacterial films, and biomedical applications such as cancer diagnosis. MS imaging visualizes the spatial distribution of molecular ions in a sample by repeatedly collecting mass spectra across its surface, resulting in complex, high-dimensional imaging datasets. Two of the primary goals of statistical analysis of MS imaging experiments are classification (for supervised experiments), i.e. assigning pixels to pre-defined classes based on their spectral profiles, and segmentation (for unsupervised experiments), i.e. assigning pixels to newly …


Differentially Private Data Publishing For Data Analysis, Dong Su Dec 2016

Open Access Dissertations

In the information age, vast amounts of sensitive personal information are collected by companies, institutions and governments. A key technological challenge is how to design mechanisms for effectively extracting knowledge from data while preserving the privacy of the individuals involved. In this dissertation, we address this challenge from the perspective of differentially private data publishing. Firstly, we propose PrivPfC, a differentially private method for releasing data for classification. The key idea underlying PrivPfC is to privately select, in a single step, a grid, which partitions the data domain into a number of cells. This selection is done using the exponential …


A Reduced Labeled Samples (Rls) Framework For Classification Of Imbalanced Concept-Drifting Streaming Data., Elaheh Arabmakki Dec 2016

Electronic Theses and Dissertations

Stream processing frameworks are designed to process streaming data that arrives over time. An example of such data is the stream of emails that a user receives every day. Most real-world data streams are also imbalanced, as in the stream of emails, which contains few spam emails compared to many legitimate emails. The classification of an imbalanced data stream is challenging for several reasons. First, data streams are huge and cannot be stored in memory for one-time processing. Second, if the data is imbalanced, the accuracy …


Regularized Neural Network To Identify Potential Breast Cancer: A Bayesian Approach, Hansapani S. Rodrigo, Chris P. Tsokos, Taysseer Sharaf Nov 2016

Journal of Modern Applied Statistical Methods

In the current study, we have exemplified the use of Bayesian neural networks for breast cancer classification using the evidence procedure. The optimal Bayesian network has 81% overall accuracy in correctly classifying the true status of breast cancer patients, 59% sensitivity in correctly detecting the malignancy and 83% specificity in correctly detecting the non-malignancy. The area under the receiver operating characteristic curve (0.7940) shows that this is a moderate classification model.
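The accuracy, sensitivity, and specificity quoted above are all ratios taken from a binary confusion matrix. A minimal sketch of that arithmetic follows; the function name and the counts are hypothetical and chosen only for illustration, not taken from the study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn: malignant cases classified correctly / missed
    tn/fp: non-malignant cases classified correctly / flagged in error
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # share of malignant cases detected
    specificity = tn / (tn + fp)  # share of non-malignant cases detected
    return accuracy, sensitivity, specificity


# Hypothetical counts, for illustration only (not the study's data)
acc, sens, spec = binary_metrics(tp=30, fn=20, tn=90, fp=10)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

With imbalanced classes, a model can reach high overall accuracy while its sensitivity stays low, which is why the three figures are reported separately.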


On Profiling Bots In Social Media, Richard J. Oentaryo, Arinto Murdopo, Philips K. Prasetyo, Ee Peng Lim Nov 2016

Research Collection School Of Computing and Information Systems

The popularity of social media platforms such as Twitter has led to the proliferation of automated bots, creating both opportunities and challenges in information dissemination, user engagement, and quality of service. Past work on profiling bots has focused largely on malicious bots, with the assumption that these bots should be removed. In this work, however, we find many bots that are benign, and propose a new, broader categorization of bots based on their behaviors. This includes broadcast, consumption, and spam bots. To facilitate comprehensive analyses of bots and how they compare to human accounts, we develop a systematic profiling …


A Summary Of Classification And Regression Tree With Application, Adem Meta Oct 2016

UBT International Conference

Classification and regression tree (CART) is a non-parametric methodology that was introduced first by Breiman and colleagues in 1984. CART is a technique which divides populations into meaningful subgroups that allows the identification of groups of interest. CART as a classification method constructs decision trees. Depending on information that is available about the dataset, a classification tree or a regression tree can be constructed. The first part of this paper describes the fundamental principles of tree construction, pruning procedure and different splitting algorithms. The second part of the paper answers the questions why or why not the CART method should …
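One common splitting criterion for classification trees is Gini impurity. The sketch below searches a single numeric feature for the threshold that minimizes the weighted impurity of the two child nodes. It is an illustrative assumption about which criterion is meant, since the abstract does not say which splitting algorithms the paper covers; the function names and toy data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best binary split.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, parent_impurity) if no split improves on the parent node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_impurity = None, gini(labels)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


# Toy data, for illustration only
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]
threshold, impurity = best_split(values, labels)
```

A full tree builder would apply this search to every feature, split on the best one, and recurse on each child until a stopping rule or pruning step ends the growth.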


Advanced Data Analysis - Lecture Notes, Erik B. Erhardt, Edward J. Bedrick, Ronald M. Schrader Oct 2016

Open Textbooks

Lecture notes for Advanced Data Analysis (ADA1 Stat 427/527 and ADA2 Stat 428/528), Department of Mathematics and Statistics, University of New Mexico, Fall 2016-Spring 2017. Additional material, including RMarkdown templates for in-class and homework exercises, datasets, R code, and video lectures, is available on the course websites: https://statacumen.com/teaching/ada1 and https://statacumen.com/teaching/ada2 .

Contents

I ADA1: Software

  • 0 Introduction to R, Rstudio, and ggplot

II ADA1: Summaries and displays, and one-, two-, and many-way tests of means

  • 1 Summarizing and Displaying Data
  • 2 Estimation in One-Sample Problems
  • 3 Two-Sample Inferences
  • 4 Checking Assumptions
  • 5 One-Way Analysis of Variance

III ADA1: Nonparametric, categorical, …


Computerized Classification Of Surface Spikes In Three-Dimensional Electron Microscopic Reconstructions Of Viruses, Younes Benkarroum Sep 2016

Dissertations, Theses, and Capstone Projects

The purpose of this research is to develop computer techniques for improved three-dimensional (3D) reconstruction of viruses from electron microscopic images of them and for the subsequent improved classification of the surface spikes in the resulting reconstruction. The broader impact of such work is the following.

Influenza is an infectious disease caused by rapidly-changing viruses that appear seasonally in the human population. New strains of influenza viruses appear every year, with the potential to cause a serious global pandemic. Two kinds of spikes – hemagglutinin (HA) and neuraminidase (NA) – decorate the surface of the virus particles and these proteins …


A Supervised Classification Method For Levee Slide Detection Using Complex Synthetic Aperture Radar Imagery, Ramakalavathi Marapareddy, James V. Aanstoos, Nicolas H. Younan Sep 2016

Faculty Publications

The dynamics of surface and sub-surface water events can lead to slope instability, resulting in anomalies such as slough slides on earthen levees. Early detection of these anomalies by a remote sensing approach could save time versus direct assessment. We have implemented a supervised Mahalanobis distance classification algorithm for the detection of slough slides on levees using complex polarimetric Synthetic Aperture Radar (polSAR) data. The classifier output was followed by a spatial majority filter post-processing step that improved the accuracy. The effectiveness of the algorithm is demonstrated using fully quad-polarimetric L-band Synthetic Aperture Radar (SAR) imagery from the NASA Jet …
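The two stages described above, a Mahalanobis distance classifier followed by a spatial majority filter, can be sketched as follows. This is a simplified illustration rather than the authors' implementation: it assumes two hypothetical features per pixel (so the covariance matrix can be inverted by hand), a 3x3 filter window, and made-up training samples.

```python
from collections import Counter


def mean_and_inv_cov(samples):
    """Mean vector and inverse 2x2 sample covariance of 2-D training samples."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    det = sxx * syy - sxy * sxy  # assumed non-zero for these samples
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inv


def mahalanobis_sq(x, mean, inv):
    """Squared Mahalanobis distance of x from a class distribution."""
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))


def classify(x, models):
    """Assign x to the class whose distribution is nearest in Mahalanobis distance."""
    return min(models, key=lambda label: mahalanobis_sq(x, *models[label]))


def majority_filter(grid):
    """Replace each cell with the most common label in its 3x3 window (clipped at edges).

    Ties go to the label encountered first in the window.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            window = [grid[i][j]
                      for i in range(max(0, r - 1), min(rows, r + 2))
                      for j in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = Counter(window).most_common(1)[0][0]
    return out


# Hypothetical training samples (two made-up features per pixel)
models = {
    "slide": mean_and_inv_cov([(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (1.1, 1.3)]),
    "healthy": mean_and_inv_cov([(4.0, 4.0), (4.2, 3.7), (3.8, 4.1), (4.1, 4.4)]),
}
label = classify((1.1, 0.9), models)

# An isolated "slide" pixel surrounded by "healthy" pixels is smoothed away
noisy = [["healthy", "healthy", "healthy"],
         ["healthy", "slide", "healthy"],
         ["healthy", "healthy", "healthy"]]
smoothed = majority_filter(noisy)
```

The filter removes isolated misclassified pixels at the cost of also erasing genuine features smaller than the window, so the window size is a trade-off.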


Algorithms For Pre-Microrna Classification And A Gpu Program For Whole Genome Comparison, Ling Zhong Aug 2016

Dissertations

MicroRNAs (miRNAs) are non-coding RNAs with approximately 22 nucleotides that are derived from precursor molecules. These precursor molecules or pre-miRNAs often fold into stem-loop hairpin structures. However, a large number of sequences with pre-miRNA-like hairpins can be found in genomes. It is a challenge to distinguish the real pre-miRNAs from other hairpin sequences with similar stem-loops (referred to as pseudo pre-miRNAs). The first part of this dissertation presents a new method, called MirID, for identifying and classifying microRNA precursors. MirID is comprised of three steps. Initially, a combinatorial feature mining algorithm is developed to identify suitable feature sets. Then, the …


Variable Selection For Estimating The Optimal Treatment Regimes In The Presence Of A Large Number Of Covariate, Baqun Zhang, Min Zhang Jul 2016

The University of Michigan Department of Biostatistics Working Paper Series

Most existing methods for optimal treatment regimes, with few exceptions, focus on estimation and are not designed for variable selection with the objective of optimizing treatment decisions. In clinical trials and observational studies, numerous baseline variables are often collected, and variable selection is essential for deriving reliable optimal treatment regimes. Although many variable selection methods exist, they mostly focus on selecting variables that are important for prediction (predictive variables) instead of variables that have a qualitative interaction with treatment (prescriptive variables) and hence are important for making treatment decisions. We propose a variable selection method within a general classification …


Automated Detection Of Deep-Sea Animals, Dallas J. Hollis, Duane Edgington, Danelle Cline Jul 2016

STAR Program Research Presentations

The Monterey Bay Aquarium Research Institute routinely deploys remotely operated underwater vehicles equipped with high definition cameras for use in scientific studies. Utilizing a video collection of over 22,000 hours and the Video Annotation and Reference System, we have set out to automate the detection and classification of deep-sea animals. This paper serves to explore the pitfalls of automation and suggest possible solutions to automated detection in diverse ecosystems with varying field conditions. Detection was tested using a saliency-based neuromorphic selective attention algorithm. The animals that were not detected were then used to tune saliency parameters. Once objects are detected, …


Assessing Metacognitive Skills Using Adaptive Neural Networks, Anderson Justin, Kouider Mokhtari, Arun Kulkarni May 2016

Arun Kulkarni

The assessment of students' levels of metacognitive knowledge and skills is critical in determining their ability to effectively perform complex cognitive tasks such as solving mathematics or reading comprehension problems. In this paper, we use an adaptive multi-layer perceptron model to categorize participants based on their metacognitive awareness and perceived use of reading strategies while reading. Eight hundred and sixty-five middle school students participated in the study. All participants completed a 30-item instrument, the Metacognitive Awareness-of-Reading Strategies Inventory (MARSI). We used adaptive multi-layer perceptron models to classify participants into three groups based on their metacognitive strategy awareness levels using thirteen …


Knowledge Extraction From Survey Data Using Neural Networks, Khan Imran, Arun Kulkarni May 2016

Arun Kulkarni

Surveys are an important tool for researchers. It is increasingly important to develop powerful means for analyzing such data and to extract knowledge that could help in decision-making. Survey attributes are typically discrete data measured on a Likert scale. The process of classification becomes complex if the number of survey attributes is large. Another major issue in Likert-Scale data is the uniqueness of tuples. A large number of unique tuples may result in a large number of patterns. The main focus of this paper is to propose an efficient knowledge extraction method that can extract knowledge in terms of rules. …


Multispectral Image Analysis Using Random Forest, Barrett Lowe, Arun Kulkarni May 2016

Arun Kulkarni

Classical methods for classification of pixels in multispectral images include supervised classifiers such as the maximum-likelihood classifier, neural network classifiers, fuzzy neural networks, support vector machines, and decision trees. Recently, there has been an increase of interest in ensemble learning – a method that generates many classifiers and aggregates their results. Breiman proposed Random Forest in 2001 for classification and clustering. Random Forest grows many decision trees for classification. To classify a new object, the input vector is run through each decision tree in the forest. Each tree gives a classification. The forest chooses the classification having the most votes. Random …
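The voting step described above can be sketched in a few lines. The three "trees" here are hypothetical one-rule stand-ins for trained decision trees, and the band values and class names are made up; only the aggregation logic is the point.

```python
from collections import Counter


def forest_predict(trees, x):
    """Run x through every tree and return the label with the most votes."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for trained trees: each applies one rule to a band value
trees = [
    lambda x: "water" if x[0] < 0.2 else "vegetation",
    lambda x: "water" if x[1] < 0.1 else "vegetation",
    lambda x: "vegetation" if x[2] > 0.5 else "water",
]

pixel = (0.15, 0.30, 0.60)  # made-up reflectance values in three bands
label = forest_predict(trees, pixel)
```

Here the first tree votes "water" and the other two vote "vegetation", so the forest returns "vegetation". In a real forest each tree is grown on a bootstrap sample with a random subset of features, which is what makes the individual votes differ.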


Knowledge Extraction From Metacognitive Reading Strategies Data Using Induction Trees, Christopher Taylor, Arun D. Kulkarni, Kouider Mokhtari Jan 2016

Computer Science Faculty Publications and Presentations

The assessment of students’ metacognitive knowledge and skills about reading is critical in determining their ability to read academic texts and do so with comprehension. In this paper, we used induction trees to extract metacognitive knowledge about reading from a reading strategies dataset obtained from a group of 1636 undergraduate college students. Using a C4.5 algorithm, we constructed decision trees, which helped us classify participants into three groups based on their metacognitive strategy awareness levels consisting of global, problem-solving and support reading strategies. We extracted rules from these decision trees, and in order to evaluate accuracy of the extracted rules, …
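C4.5 chooses the attribute to split on by gain ratio, that is, information gain normalized by the entropy of the split itself. A minimal sketch for one categorical attribute follows; the attribute values and awareness levels are hypothetical and are not drawn from the study's dataset.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(attribute_values, labels):
    """Information gain of splitting on the attribute, divided by split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after the split
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(attribute_values)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical survey answers and awareness levels, for illustration only
answers = ["often", "often", "rarely", "rarely"]
levels = ["high", "high", "low", "low"]
ratio = gain_ratio(answers, levels)
```

Each root-to-leaf path of the resulting tree reads as an if-then rule, which is how rules can be extracted from the tree once it is built.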


Breast Cancer Classification Of Mammographic Masses Using Circularity Max Metric, A New Method, Tae Keun Heo Jan 2016

Electronic Theses and Dissertations

Breast cancer classification can be divided into two categories. The first category is a benign tumor, and the other is a malignant tumor. The main purpose of breast cancer classification is to classify abnormalities into benign or malignant classes and thus help physicians with further analysis by minimizing potential errors that can be made by fatigued or inexperienced physicians. This paper proposes a new shape metric based on the area ratio of a circle to classify mammographic images into benign and malignant classes. A Support Vector Machine is used as a machine learning tool for training and classification purposes. The improved …


An Analysis Of Accuracy Using Logistic Regression And Time Series, Edwin Baidoo, Jennifer L. Priestley Jan 2016

Published and Grey Literature from PhD Candidates

This paper analyzes the accuracy rates for logistic regression and time series models. It also examines a relatively new performance index that takes into consideration the business assumptions of credit markets. Although prior research has focused on evaluation metrics, such as AUC and Gini index, this new measure has a more intuitive interpretation for various managers and decision makers and can be applied to both Logistic and Time Series models.


Assessing The Utility Of Imaging Radar For Identifying White Sand Vegetation Structure, Jessica Rosenqvist Jan 2016

Dissertations and Theses

White sand vegetation communities are widespread across South America; they are found in Peru, Venezuela, the Brazilian Amazon and Guyana. They are distributed in patches ranging from <1 km² to greater than tens of square kilometers, and their origins and locations are still not well understood. The communities are related to a variety of factors (soil type, flooding, nutrient content and fire); hence a precise definition of the ecosystem is still lacking. Nevertheless, these variations create a unique environment for endemic plant and animal species to thrive. Furthermore, analysis of these areas has been very scattered, and identification of local white sand areas (<1 km²) has not been accomplished. In addition, identification of these locations has so far used only optical satellite imagery (Landsat, MODIS). Hence, in this project, we have attempted to use synthetic aperture radar to create a classification system to locate the white sand vegetation systems. The goal is to be able to apply this method to identify white sand vegetation distribution across South America. The region of focus for this thesis has been Aracá, a large white sand area located in Brazil in the State of Amazonas. Due to the lack of ground reference data, a classified map by Capurucho et al. (2013), generated using Landsat data, was used as a comparison and reference. JAXA’s ALOS-1 PALSAR (L-band), ESA’s Sentinel-1A (C-band) and NASA’s SRTM sensors were used for land classification. As microwave signals penetrate clouds and haze, sensors operating at these wavelengths allow unobstructed coverage of the landscape all year round. Different combinations of polarizations and wavelengths were used during the analysis to try to separate the white sand vegetation from water and terra firme forest. The resulting classification images showed a 30% agreement with the classification map by Capurucho et al.
It is important to note that this number is in fact an agreement percentage, as the map used was a classification image and coarse in resolution (due to the lack of reference data). Therefore, this value does not imply a bad classification. Future work will include time-series data, precise ground reference points and data from other sensors such as ALOS-2 PALSAR, to improve the classification accuracy.


Classification Of Natural Phytoplankton Populations With Fluorescence Excitation-Based Imaging Multivariate Optical Computing, Shawna Kathleen Tazik Jan 2016

Theses and Dissertations

Phytoplankton account for the majority of the primary productivity in the ocean and contribute significantly to the global carbon cycle through photosynthesis. A quantitative characterization of phytoplankton cell size and taxonomic composition is essential for understanding marine biogeochemical cycles, quantifying carbon export, and for predicting the ocean’s response to future climate change. Our labs have developed a new instrument for this purpose that combines fluorescence excitation spectroscopy with an all-optical approach to multivariate statistics called multivariate optical computing (MOC). The instrument, known as the Shipboard Streak Imaging Multivariate Optical Computing (SSIMOC) photometer, is a simple filter photometer that images the …


Integration Of Spectral And Spatial Information Via Local Covariance Matrices For Segmentation And Classification Of Hyperspectral Images, Uğur Ergül, Gökhan Bi̇lgi̇n Jan 2016

Turkish Journal of Electrical Engineering and Computer Sciences

In this work, a novel approach is presented for the feature extraction step in hyperspectral image processing to form more discriminative features between different pixel regions. The proposed method combines both spatial and spectral information, which is very important for segmentation and classification of hyperspectral images. For comparison, five different feature sets are formed using eigen decomposition of local covariance matrices of subcubes located around a pixel of interest in the scene. Subcubes of neighbor pixels are obtained by a windowed structure to expose pattern similarities. As a novel approach, local covariance matrices are computed in eigenspace and proposed feature …


A Comprehensive Comparison Of Features And Embedding Methods For Face Recognition, Hasan Serhan Yavuz, Hakan Çevi̇kalp, Ri̇fat Edi̇zkan Jan 2016

Turkish Journal of Electrical Engineering and Computer Sciences

Face recognition is an essential issue in modern-day applications since it can be used in many areas for several purposes. Many methods have been proposed for face recognition. It is a difficult task since variations in lighting, momentary changes in facial expression, pose angles, and scaling differences can drastically change the appearance of the face. To suppress these complications, effective feature extraction and proper alignment of face images gain as much importance as the recognition method choice. In this paper, we provide an extensive comparison of the state-of-the-art face recognition methods with the most well-known techniques used in feature representation. In order …


A New Fuzzy Membership Assignment And Model Selection Approach Based On Dynamic Class Centers For Fuzzy Svm Family Using The Firefly Algorithm, Omid Naghash Almasi, Modjtaba Rouhani Jan 2016

Turkish Journal of Electrical Engineering and Computer Sciences

The support vector machine (SVM) is a powerful tool for classification problems. Unfortunately, the training phase of the SVM is highly sensitive to noise in the training set, and noise is inevitable in real-world applications. To overcome this problem, the SVM was extended to a fuzzy SVM by assigning an appropriate fuzzy membership to each data point. However, suitable choice of fuzzy memberships and an accurate model selection raise fundamental issues. In this paper, we propose a new method based on optimization methods to simultaneously generate appropriate fuzzy membership and solve the model selection problem for the SVM family in linear/nonlinear …


Pro-Fit: Exercise With Friends, Saumil Dharia, Vijesh Jain, Jvalant Patel, Jainikkumar Vora, Rizen Yamauchi, Magdalini Eirinaki, Iraklis Varlamis Jan 2016

Faculty Publications

The advancements in wearable technology, where embedded accelerometers, gyroscopes and other sensors enable the users to actively monitor their activity have made it easier for individuals to pursue a healthy lifestyle. However, most of the existing applications expect continuous commitment from the end users, who need to proactively interact with the application in order to connect with friends and attain their goals. These applications fail to engage and motivate users who have busy schedules, or are not as committed and self-motivated. In this work, we present PRO-Fit, a personalized fitness assistant application that employs machine learning and recommendation algorithms in …


The New Issues In Classification Problems, Md Mahmudul Hasan Jan 2016

Open Access Theses & Dissertations

The data involved in science and engineering is getting bigger every day. Studying and organizing a large amount of data is difficult without classification. In machine learning, classification is the problem of assigning a given data point to one of a set of categories. Several classification techniques are in use for classifying given data. In our work, we present a sparse representation technique to perform classification. The popularity of this technique motivates us to use it on our collected samples. To find a sparse representation, we used an $l_1$-minimization algorithm, a convex relaxation algorithm that researchers have shown to be very efficient. The purpose …


Towards An Infodemiological Algorithm For Classification Of Filipino Health Tweets, Ma. Regina Justina E. Estuar, Kennedy E. Espina, Delfin Jay Sabido Ix, Raymond Josef Edward Lara, Vikki Car De Los Reyes Jan 2016

Department of Information Systems & Computer Science Faculty Publications

Finding innovative ICT solutions to enhance the Philippines’ health sector is part and parcel of the Philippine eHealth Strategic Framework and Plan 2020 program. This study sees the opportunity of using collected Twitter data to create a model that processes tweets to produce a dataset that may be relevant in the field of epidemiology and infodemiology. Through the collection of relevant tweets, future studies may make use of the output of this research for various purposes, such as the improvement of epidemiological systems of the Department of Health in support of the eHealth strategy. In this study, we …