Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Articles 1 - 27 of 27

Full-Text Articles in Entire DC Network

Accuracy Analysis Comparison Of Supervised Classification Methods For Anomaly Detection On Levees Using SAR Imagery, Ramakalavathi Marapareddy, James V. Aanstoos, Nicolas H. Younan Dec 2017

Faculty Publications

This paper analyzes the use of synthetic aperture radar (SAR) imagery to support levee condition assessment by detecting potential slide areas in an efficient and cost-effective manner. Levees are prone to failure in the form of internal erosion within the earthen structure and landslides (also called slough or slump slides). If not repaired, slough slides may lead to levee failures. In this paper, we compare the accuracy of the supervised classification methods minimum distance (MD) using Euclidean and Mahalanobis distance, support vector machine (SVM), and maximum likelihood (ML), using SAR technology to detect slough slides on earthen levees. …
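As a rough illustration of the minimum distance (MD) idea named in this abstract, the sketch below assigns a two-feature sample to the class whose mean is nearest under either Euclidean or Mahalanobis distance. It is not the authors' implementation: the class labels, the two-feature restriction, and the training values are all made up for the example.

```python
from math import sqrt

def mean(samples):
    """Per-feature mean of a list of 2-feature samples."""
    n = len(samples)
    return tuple(sum(col) / n for col in zip(*samples))

def cov2(samples, mu):
    """Sample covariance of 2-feature data, returned as (sxx, sxy, syy)."""
    n = len(samples)
    sxx = sum((x - mu[0]) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - mu[1]) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mu[0]) * (y - mu[1]) for x, y in samples) / (n - 1)
    return sxx, sxy, syy

def euclidean(p, mu):
    return sqrt((p[0] - mu[0]) ** 2 + (p[1] - mu[1]) ** 2)

def mahalanobis(p, mu, cov):
    """sqrt(d^T S^-1 d) using the closed-form inverse of a 2x2 matrix."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - mu[0], p[1] - mu[1]
    return sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)

def classify(p, classes, metric="euclidean"):
    """Return the label of the class whose mean is closest to p."""
    best_label, best_dist = None, None
    for label, samples in classes.items():
        mu = mean(samples)
        if metric == "euclidean":
            d = euclidean(p, mu)
        else:
            d = mahalanobis(p, mu, cov2(samples, mu))
        if best_dist is None or d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Hypothetical training data: two made-up features per pixel.
classes = {
    "healthy": [(-4.0, 0.0), (4.0, 0.0), (0.0, -0.5), (0.0, 0.5)],
    "slide": [(5.0, 3.0), (7.0, 3.0), (6.0, 2.0), (6.0, 4.0)],
}
print(classify((4.0, 1.0), classes, "euclidean"))
print(classify((4.0, 1.0), classes, "mahalanobis"))
```

In this toy data the two metrics disagree on the point (4, 1), because Mahalanobis distance accounts for the wide spread of the "healthy" class along the first feature.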


Reconstructing Yeasts Phylogenies And Ancestors From Whole Genome Data, Bing Feng, Yu Lin, Lingxi Zhou, Yan Guo, Robert Friedman, Roufan Xia, Chao Liu, Jijun Tang Nov 2017

Faculty Publications

Phylogenetic studies aim to discover evolutionary relationships and histories. These studies are based on similarities of morphological characters and molecular sequences. Currently, widely accepted phylogenetic approaches are based on multiple sequence alignments, which analyze shared gene datasets and concatenate/coalesce these results to a final phylogeny with maximum support. However, these approaches still have limitations, and often have conflicting results with each other. Reconstructing ancestral genomes helps us understand mechanisms and corresponding consequences of evolution. Most existing genome level phylogeny and ancestor reconstruction methods can only process simplified real genome datasets or simulated datasets with identical genome content, unique genome markers, …


Reconstructing Yeasts Phylogenies And Ancestors From Whole Genome Data, Bing Feng, Yu Ling, Lingxi Zhou, Roufan Xia, Fei Hu, Chao Liu Nov 2017

Faculty Publications

Phylogenetic studies aim to discover evolutionary relationships and histories. These studies are based on similarities of morphological characters and molecular sequences. Currently, widely accepted phylogenetic approaches are based on multiple sequence alignments, which analyze shared gene datasets and concatenate/coalesce these results to a final phylogeny with maximum support. However, these approaches still have limitations, and often have conflicting results with each other. Reconstructing ancestral genomes helps us understand mechanisms and corresponding consequences of evolution. Most existing genome level phylogeny and ancestor reconstruction methods can only process simplified real genome datasets or simulated datasets with identical genome content, unique genome markers, …


Teaching Stats For Data Science, Daniel Kaplan Nov 2017

Faculty Publications

“Data science” is a useful catchword for methods and concepts original to the field of statistics, but typically being applied to large, multivariate, observational records. Such datasets call for techniques not often part of an introduction to statistics: modeling, consideration of covariates, sophisticated visualization, and causal reasoning. This article re-imagines introductory statistics as an introduction to data science and proposes a sequence of 10 blocks that together compose a suitable course for extracting information from contemporary data. Recent extensions to the mosaic packages for R together with tools from the “tidyverse” provide a concise and readable notation for wrangling, visualization, …


Phylogeny Analysis From Gene-Order Data With Massive Duplications, Lingxi Zhou, Yu Ling, Bing Feng, Jieyi Zhao, Jijun Tang Oct 2017

Faculty Publications

Background: Gene order changes, under rearrangements, insertions, deletions and duplications, have been used as a new type of data source for phylogenetic reconstruction. Because these changes are rare compared to sequence mutations, they allow the inference of phylogeny further back in evolutionary time. There exist many computational methods for the reconstruction of gene-order phylogenies, including widely used maximum parsimony methods and maximum likelihood methods. However, both methods face challenges in handling large genomes with many duplicated genes, especially in the presence of whole genome duplication.

Methods: In this paper, we present three simple yet powerful methods based on maximum-likelihood (ML) …


Formal Performance Guarantees For An Approach To Human In The Loop Robot Missions, Damian Lyons, Ron Arkin, Shu Jiang, Matt O'Brien, Feng Tang, Peng Tang Oct 2017

Faculty Publications

A key challenge in the automatic verification of robot mission software, especially critical mission software, is to be able to effectively model the performance of a human operator and factor that into the formal performance guarantees for the mission. We present a novel approach to modelling the skill level of the operator and integrating it into automatic verification using a linear Gaussians model parameterized by experimental calibration. Our approach allows us to model different skill levels directly in terms of the behavior of the lumped, robot plus operator, system.

Using MissionLab and VIPARS (a behavior-based robot mission verification …


A Framework For Recommendation Of Highly Popular News Lacking Social Feedback, Nuno Moniz, Luís Torgo, Magdalini Eirinaki, Paula Branco Oct 2017

Faculty Publications

Social media is rapidly becoming the main source of news consumption for users, raising significant challenges to news aggregation and recommendation tasks. One of these challenges concerns the recommendation of very recent news. To tackle this problem, approaches to the prediction of news popularity have been proposed. In this paper, we study the task of predicting the popularity of news items upon their publication, when social feedback is unavailable or scarce, and of using such predictions to produce news rankings. Unlike previous work, we focus on accurately predicting highly popular news. Such cases are rare, causing known issues for standard prediction models and …


Improvement Of Phylogenetic Method To Analyze Compositional Heterogeneity, Zehua Zhang, Kecheng Guo, Gaofeng Pan, Jijun Tang, Fei Guo Sep 2017

Faculty Publications

Background: Phylogenetic analysis is a key way to understand current research in the biological processes and detect theory in evolution of natural selection. The evolutionary relationship between species is generally reflected in the form of phylogenetic trees. Many methods for constructing phylogenetic trees are based on optimization criteria. We extract the biological data via modeling features, and then compare these characteristics to study the biological evolution between species.

Results: Here, we use maximum likelihood and Bayesian inference methods to establish phylogenetic trees; a multi-chain Markov chain Monte Carlo sampling method can be used to select the optimal phylogenetic tree, resolving local …


Automated Software Testing In The DoD: Current Practices And Opportunities For Improvement, Darryl K. Ahner, James Wisnowski, James R. Simpson Sep 2017

Faculty Publications

The concept of automating the testing of software-intensive systems has been around for decades, but the practice of automating testing is scarce in many industries, especially in the government defense sector. A one-year project initiated by the Office of the Secretary of Defense (OSD), Scientific Test and Analysis Techniques Center of Excellence (STAT COE) and sponsored by Navy OPNAV N94 set out to:

  • study the degree to which the Department of Defense (DoD) has adopted automated software testing (AST);
  • share the best software practices used by industry; and
  • develop and distribute an AST implementation guide intended for program management and …


Methods For Real-Time Prediction Of The Mode Of Travel Using Smartphone-Based GPS And Accelerometer Data, Bryan D. Martin, Vittorio Addona, Julian Wolfson, Gediminas Adomavicius, Yingling Fan Sep 2017

Faculty Publications

We propose and compare combinations of several methods for classifying transportation activity data from smartphone GPS and accelerometer sensors. We have two main objectives. First, we aim to classify our data as accurately as possible. Second, we aim to reduce the dimensionality of the data as much as possible in order to reduce the computational burden of the classification. We combine dimension reduction and classification algorithms and compare them with a metric that balances accuracy and dimensionality. In doing so, we develop a classification algorithm that accurately classifies five different modes of transportation (i.e., walking, biking, car, bus and rail) …


The Human Touch: How Non-Expert Users Perceive, Interpret, And Fix Topic Models, Tak Yeon Lee, Alison Smith, Kevin Seppi, Niklas Elmqvist, Jordan Boyd-Graber, Leah Findlater Sep 2017

Faculty Publications

Topic modeling is a common tool for understanding large bodies of text, but is typically provided as a “take it or leave it” proposition. Incorporating human knowledge in unsupervised learning is a promising approach to create high-quality topic models. Existing interactive systems and modeling algorithms support a wide range of refinement operations to express feedback. However, these systems’ interactions are primarily driven by algorithmic convenience, ignoring users who may lack expertise in topic modeling. To better understand how non-expert users understand, assess, and refine topics, we conducted two user studies—an in-person interview study and an online crowdsourced study. These studies …


anomalyDetection: Implementation Of Augmented Network Log Anomaly Detection Procedures, Robert J. Gutierrez, Bradley C. Boehmke, Kenneth W. Bauer, Cade M. Saie, Trevor J. Bihl Aug 2017

Faculty Publications

As the number of cyber-attacks continues to grow on a daily basis, so does the delay in threat detection. For instance, in 2015, the Office of Personnel Management discovered that approximately 21.5 million individual records of Federal employees and contractors had been stolen. On average, the time between an attack and its discovery is more than 200 days. In the case of the OPM breach, the attack had been going on for almost a year. Currently, cyber analysts inspect numerous potential incidents on a daily basis, but have neither the time nor the resources available to perform such a task. …


An Ameliorated Prediction Of Drug–Target Interactions Based On Multi-Scale Discrete Wavelet Transform And Network Features, Cong Shen, Yijie Ding, Jijun Tang, Xinying Xu, Fei Guo Aug 2017

Faculty Publications

The prediction of drug–target interactions (DTIs) via computational technology plays a crucial role in reducing the experimental cost. A variety of state-of-the-art methods have been proposed to improve the accuracy of DTI predictions. In this paper, we propose a drug–target interaction predictor that adopts multi-scale discrete wavelet transform and network features (named DAWN) in order to solve the DTI prediction problem. We encode the drug molecule by a substructure fingerprint with a dictionary of substructure patterns. Simultaneously, we apply the discrete wavelet transform (DWT) to extract features from target sequences. Then, we concatenate and normalize the target, drug, …
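To make the multi-scale DWT step more concrete, here is a minimal sketch of a multi-level Haar wavelet decomposition of a numeric signal. The choice of the Haar wavelet, the function names, and the input values are assumptions for illustration only; the abstract does not say which wavelet or which numeric encoding of target sequences the authors use.

```python
from math import sqrt

def haar_step(signal):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_multiscale(signal, levels):
    """Apply haar_step repeatedly to the approximation coefficients.

    Returns the coarsest approximation and a list of detail coefficients,
    finest scale first.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

# Hypothetical numeric encoding of a short sequence.
approx, details = haar_multiscale([4.0, 6.0, 10.0, 12.0], levels=2)
print(approx, details)
```

Summary statistics of the coefficients at each scale (for example mean or maximum) could then serve as fixed-length features, though whether the paper does this is not stated in the visible abstract.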


Exact And Heuristic Algorithms For Risk-Aware Stochastic Physical Search, Daniel S. Brown, Jeffrey Hudack, Nathaniel Gemelli, Bikramjit Banerjee Aug 2017

Faculty Publications

We consider an intelligent agent seeking to obtain an item from one of several physical locations, where the cost to obtain the item at each location is stochastic. We study risk-aware stochastic physical search (RA-SPS), where both the cost to travel and the cost to obtain the item are taken from the same budget and where the objective is to maximize the probability of success while minimizing the required budget. This type of problem models many task-planning scenarios, such as space exploration, shopping, or surveillance. In these types of scenarios, the actual cost of completing an objective at a location …


An Ensemble Deep Convolutional Neural Network Model With Improved D-S Evidence Fusion For Bearing Fault Diagnosis, Shaobo Li, Guoka Liu, Xianghong Tang, Jianguang Lu, Jianjun Hu Jul 2017

Faculty Publications

Intelligent machine health monitoring and fault diagnosis are becoming increasingly important for modern manufacturing industries. Current fault diagnosis approaches mostly depend on expert-designed features for building prediction models. In this paper, we propose IDSCNN, a novel bearing fault diagnosis algorithm based on ensemble deep convolutional neural networks and an improved Dempster–Shafer theory based evidence fusion. The convolutional neural networks take the root mean square (RMS) maps from the fast Fourier transform (FFT) features of the vibration signals from two sensors as inputs. The improved D-S evidence theory is implemented via a distance matrix of the evidence and a modified Gini index. Extensive evaluations …
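For readers unfamiliar with D-S evidence fusion, the sketch below implements the classic Dempster rule of combination for two mass functions. It is the textbook rule, not the improved variant (distance matrix plus modified Gini index) that this abstract describes, and the fault labels and mass values are invented for the example.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass.
    Mass assigned to empty intersections is treated as conflict and
    removed by normalization.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    return {s: mass / (1.0 - conflict) for s, mass in combined.items()}

# Hypothetical outputs of two classifiers, one per sensor.
INNER = frozenset({"inner_race"})
OUTER = frozenset({"outer_race"})
EITHER = INNER | OUTER  # mass left on "undecided"

sensor_1 = {INNER: 0.6, OUTER: 0.3, EITHER: 0.1}
sensor_2 = {INNER: 0.7, OUTER: 0.2, EITHER: 0.1}
fused = dempster_combine(sensor_1, sensor_2)
print(fused)
```

With these made-up masses the fused belief in the inner-race hypothesis is higher than either sensor reports alone, which is the usual effect of combining two sources that mostly agree.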


An Ensemble Multilabel Classification For Disease Risk Prediction, Runzhi Li, Wei Liu, Yusong Lin, Hongling Zhao, Chaoyang Zhang Jun 2017

Faculty Publications

It is important to identify and prevent disease risk as early as possible through regular physical examinations. We formulate disease risk prediction as a multilabel classification problem. A novel Ensemble Label Power-set Pruned datasets Joint Decomposition (ELPPJD) method is proposed in this work. First, we transform the multilabel classification into a multiclass classification. Then, we propose the pruned datasets and joint decomposition methods to deal with the imbalance learning problem. Two strategies, size balanced (SB) and label similarity (LS), are designed to decompose the training dataset. In the experiments, the dataset comes from real physical examination records. We …
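The first step mentioned here, turning a multilabel problem into a multiclass one, is commonly done with a label power-set transformation: every distinct combination of labels becomes one class. The sketch below shows only that generic transformation; the function names and disease labels are hypothetical, and the pruning and joint decomposition steps of ELPPJD are not reproduced.

```python
def label_powerset_fit(label_sets):
    """Map each distinct label combination to an integer class id.

    Returns the class id of every sample and the combination-to-id table.
    """
    table = {}
    class_ids = []
    for labels in label_sets:
        key = frozenset(labels)
        if key not in table:
            table[key] = len(table)  # ids are assigned in order of first appearance
        class_ids.append(table[key])
    return class_ids, table

def label_powerset_inverse(class_id, table):
    """Recover the label combination behind a predicted class id."""
    for key, cid in table.items():
        if cid == class_id:
            return set(key)
    raise KeyError(class_id)

# Hypothetical examination records, each with a set of risk labels.
records = [
    {"hypertension"},
    {"hypertension", "diabetes"},
    set(),
    {"diabetes", "hypertension"},
]
class_ids, table = label_powerset_fit(records)
print(class_ids)
```

Any multiclass classifier can then be trained on `class_ids`. Rare label combinations produce very small classes, which is the imbalance problem the abstract says the pruning and decomposition steps address.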


Digital Hegemonies: The Localness Of Search Engine Results, Andrea Ballatore, Mark Graham, Shilad Sen May 2017

Faculty Publications

Every day, billions of Internet users rely on search engines to find information about places to make decisions about tourism, shopping, and countless other economic activities. In an opaque process, search engines assemble digital content produced in a variety of locations around the world and make it available to large cohorts of consumers. Although these representations of place are increasingly important and consequential, little is known about their characteristics and possible biases. Analyzing a corpus of Google search results generated for 188 capital cities, this article investigates the geographic dimension of search results, focusing on searches such as “Lagos” and …


An Approach To Robust Homing With Stereovision, Fuqiang Fu, Damian Lyons Apr 2017

Faculty Publications

Visual homing is a bioinspired approach to robot navigation which can be fast and uses few assumptions. However, visual homing in a cluttered and unstructured outdoor environment offers several challenges to homing methods that have been developed primarily for indoor environments. One issue is that any current image during homing may be tilted with respect to the home image. The second is that moving through a cluttered scene during homing may cause obstacles to interfere between the home scene and location and the current scene and location. In this paper, we introduce a robust method to improve a previously developed …


Multi-Valued Sequences Generated By Power Residue Symbols Over Odd Characteristic Fields, Begum Nasima, Yasuyuki Nogami, Satoshi Uehara, Robert Morelos-Zaragoza Apr 2017

Faculty Publications

This paper proposes a new approach for generating pseudo random multi-valued (including binary-valued) sequences. The approach uses a primitive polynomial over an odd characteristic prime field $\mathbb{F}_p$, where $p$ is an odd prime number. Then, for the maximum length sequence of vectors generated by the primitive polynomial, the trace function is used for mapping these vectors to scalars as elements in the prime field. Power residue symbol (Legendre symbol in binary case) is applied to translate the scalars to k-value scalars, where k is a prime factor of p-1. Finally, a pseudo random k-value sequence is obtained. Some important properties …
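The last mapping step, from a scalar in the prime field to one of k values, can be sketched with the k-th power residue symbol a^((p-1)/k) mod p, which reduces to the Legendre symbol when k = 2. The code below is an illustrative sketch only: it omits the primitive polynomial and trace function stages, the function names are invented, and how the paper treats the zero element is not visible in the abstract, so zero is returned as `None` here by assumption.

```python
def prime_factors(n):
    """Distinct prime factors of n by trial division (small n only)."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors

def primitive_root(p):
    """Smallest generator of the multiplicative group mod an odd prime p."""
    factors = prime_factors(p - 1)
    for g in range(2, p):
        if all(pow(g, (p - 1) // q, p) != 1 for q in factors):
            return g
    raise ValueError("no primitive root found")

def power_residue_symbol(a, p, k):
    """a^((p-1)/k) mod p: a k-th root of unity in F_p, or 0 if a == 0."""
    if (p - 1) % k:
        raise ValueError("k must divide p - 1")
    return pow(a, (p - 1) // k, p)

def k_value(a, p, k):
    """Index i in 0..k-1 such that the symbol of a equals zeta**i.

    zeta is a primitive k-th root of unity built from the smallest
    primitive root; zero has no index and is returned as None (assumption).
    """
    symbol = power_residue_symbol(a, p, k)
    if symbol == 0:
        return None
    zeta = pow(primitive_root(p), (p - 1) // k, p)
    power = 1
    for i in range(k):
        if power == symbol:
            return i
        power = power * zeta % p
    raise ArithmeticError("symbol is not a k-th root of unity")

print([k_value(a, 7, 3) for a in range(1, 7)])
```

For p = 7 and k = 3 this maps the nonzero field elements 1 through 6 to the three values 0, 1 and 2, two elements per value.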


Whitelisting System State In Windows Forensic Memory Visualizations, Joshua A. Lapso, Gilbert L. Peterson, James S. Okolica Mar 2017

Faculty Publications

Examiners in the field of digital forensics regularly encounter enormous amounts of data and must identify the few artifacts of evidentiary value. One challenge these examiners face is manual reconstruction of complex datasets with both hierarchical and associative relationships. The complexity of this data requires significant knowledge, training, and experience to correctly and efficiently examine. Current methods provide text-based representations or low-level visualizations, but leave the task of maintaining global context of system state to the examiner. This research presents a visualization tool that improves analysis methods through simultaneous representation of the hierarchical and associative relationships and local detailed data …


Modeling, Simulation, And Performance Analysis Of Decoy State Enabled Quantum Key Distribution Systems, Logan O. Mailloux, Michael R. Grimaila, Douglas D. Hodson, Ryan D. Engle, Colin V. McLaughlin, Gerald B. Baumgartner Feb 2017

Faculty Publications

Quantum Key Distribution (QKD) systems exploit the laws of quantum mechanics to generate secure keying material for cryptographic purposes. To date, several commercially viable decoy state enabled QKD systems have been successfully demonstrated and show promise for high-security applications such as banking, government, and military environments. In this work, a detailed performance analysis of decoy state enabled QKD systems is conducted through modeling and simulation of several common decoy state configurations. The results of this study uniquely demonstrate that the decoy state protocol can ensure Photon Number Splitting (PNS) attacks are detected with high confidence, while maximizing the system’s quantum …


Performance Verification For Robot Missions In Uncertain Environments, Damian Lyons, Ron Arkin, Shu Jiang, Matt O'Brien, Feng Tang, Peng Tang Jan 2017

Faculty Publications

Certain robot missions need to perform predictably in a physical environment that may have significant uncertainty. One approach is to leverage automatic software verification techniques to establish a performance guarantee. The addition of an environment model and uncertainty in both program and environment, however, means the state-space of a model-checking solution to the problem can be prohibitively large. An approach based on behavior-based controllers in a process-algebra framework that avoids state-space combinatorics is presented here. In this approach, verification of the robot program in the uncertain environment is reduced to a filtering problem for a Bayesian Network. Validation results …


Robust And Agile System Against Fault And Anomaly Traffic In Software Defined Networks, Mihui Kim, Younghee Park, Rohit Kotalwar Jan 2017

Faculty Publications

The main advantage of software defined networking (SDN) is that it allows intelligent control and management of networking through programmability in real time. It enables efficient utilization of network resources through traffic engineering, and offers potential attack defense methods when abnormalities arise. However, previous studies have only identified individual solutions for respective problems, instead of finding a more global solution in real time that is capable of addressing multiple situations in network status. To cover diverse network conditions, this paper presents a comprehensive reactive system for simultaneously monitoring failures, anomalies, and attacks for high availability and reliability. We design three …


Impact Of Reviewer Social Interaction On Online Consumer Review Fraud Detection, Kunal Goswami, Younghee Park, Chungsik Song Jan 2017

Faculty Publications

Background: Online consumer reviews have become a baseline for new consumers to try out a business or a new product. The reviews provide a quick look into the application and experience of the business/product and market it to new customers. However, some businesses or reviewers use these reviews to spread fake information about the business/product. The fake information can be used to promote a relatively average product/business or can be used to malign their competition. This activity is known as reviewer fraud or opinion spam. The paper proposes a feature set, capturing the user social interaction behavior to identify fraud. …


Human-Centered Authentication Guidelines, Jeremiah Still, Ashley Cain, David Schuster Jan 2017

Faculty Publications

Purpose: Despite the widespread use of authentication schemes and the rapid emergence of novel authentication schemes, a general set of domain-specific guidelines has not yet been developed. This paper aims to present and explain a list of human-centered guidelines for developing usable authentication schemes. Design/methodology/approach: The guidelines stem from research findings within the fields of psychology, human–computer interaction and information/computer science. Findings: Instead of viewing users as the inevitable weak point in the authentication process, this study proposes that authentication interfaces be designed to take advantage of users’ natural abilities. This approach requires that one understands how interactions with authentication interfaces can be improved and …


Establishing A-Priori Performance Guarantees For Robot Missions That Include Localization Software, Damian Lyons, Ron Arkin, Shu Jiang, Matt O'Brien, Feng Tang, Peng Tang Jan 2017

Faculty Publications

One approach to determining whether an automated system is performing correctly is to monitor its performance, signaling when the performance is not acceptable; another approach is to automatically analyze the possible behaviors of the system a-priori and determine performance guarantees. The authors have applied this second approach to automatically derive performance guarantees for behavior-based, multi-robot critical mission software using an innovative approach to formal verification for robotic software. Localization and mapping algorithms can allow a robot to navigate well in an unknown environment. However, whether such algorithms enhance any specific robot mission is currently a matter for empirical validation. Several …


Clustering-Based Online Player Modeling, Jason M. Bindewald, Gilbert L. Peterson, Michael E. Miller Jan 2017

Faculty Publications

Being able to imitate individual players in a game can benefit game development by providing a means to create a variety of autonomous agents and aid understanding of which aspects of game states influence game-play. This paper presents a clustering and locally weighted regression method for modeling and imitating individual players. The algorithm first learns a generic player cluster model that is updated online to capture an individual’s game-play tendencies. The models can then be used to play the game or for analysis to identify how different players react to separate aspects of game states. The method is demonstrated on …