Open Access. Powered by Scholars. Published by Universities.®

Theory and Algorithms Commons


Faculty Publications


Articles 1 - 13 of 13

Full-Text Articles in Theory and Algorithms

The Impact Of Data Preparation And Model Complexity On The Natural Language Classification Of Chinese News Headlines, Torrey J. Wagner, Dennis Guhl, Brent T. Langhals Mar 2024


Given the emergence of China as a political and economic power in the 21st century, there is increased interest in analyzing Chinese news articles to better understand developing trends in China. Because of the volume of the material, automating the categorization of Chinese-language news articles by headline text or titles can be an effective way to sort the articles into categories for efficient review. A 383,000-headline dataset labeled with 15 categories from the Toutiao website was evaluated via natural language processing to predict topic categories. The influence of six data preparation variations on the predictive accuracy of four algorithms was …


Toward A Simulation Model Complexity Measure, J. Scott Thompson, Douglas D. Hodson, Michael R. Grimaila, Nicholas Hanlon, Richard Dill Mar 2023


Is it possible to develop a meaningful measure for the complexity of a simulation model? Algorithmic information theory provides concepts that have been applied in other areas of research for the practical measurement of object complexity. This article offers an overview of complexity from a variety of perspectives and provides a body of knowledge with respect to the complexity of simulation models. The key terms model detail, resolution, and scope are defined. An important concept from algorithmic information theory, Kolmogorov complexity, and an application of this concept, normalized compression distance, are used to indicate the possibility of measuring changes …
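
As a rough illustration of the normalized compression distance named in this abstract, the sketch below approximates Kolmogorov complexity by compressed length. The use of zlib, the function names, and the toy "model description" strings are all assumptions made for illustration; they are not taken from the article.

```python
import zlib


def compressed_len(data: bytes) -> int:
    """Length of the zlib-compressed data, a stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = compressed_len(x), compressed_len(y)
    cxy = compressed_len(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical textual descriptions of simulation models (illustrative only).
model_a = b"state=idle; on(event) -> state=busy; " * 50
model_b = model_a + b"on(timeout) -> state=idle; " * 5  # model_a with added detail
unrelated = bytes(range(256)) * 8                       # shares no structure with model_a

print("NCD(model_a, model_b)   =", round(ncd(model_a, model_b), 3))
print("NCD(model_a, unrelated) =", round(ncd(model_a, unrelated), 3))
```

With a real compressor the distance is only approximate, so small differences between values should not be over-interpreted; the point is that a modest change to a model description yields a smaller distance than an unrelated input does.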


Robust Error Estimation Based On Factor-Graph Models For Non-Line-Of-Sight Localization, O. Arda Vanli, Clark N. Taylor Jan 2022


This paper presents a method to estimate the covariances of the inputs in a factor-graph formulation for localization under non-line-of-sight conditions. A general solution, based on covariance estimation and M-estimators in linear regression problems, is presented and shown to give estimators of multiple variances that are unbiased and robust against outliers. An iteratively re-weighted least squares algorithm is proposed to jointly compute the proposed variance estimators and the state estimates for the nonlinear factor graph optimization. The efficacy of the method is illustrated in a simulation study using a robot localization problem under various process and measurement models and measurement …
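
The general idea of iteratively re-weighted least squares with an M-estimator can be sketched on a much simpler problem than the one in the paper. The example below fits a straight line with Huber weights and a median-based scale estimate. The Huber weight function, the tuning constant 1.345, the scale estimate, and the toy data are assumptions chosen for illustration; this is not the paper's factor-graph algorithm or its variance estimator.

```python
def weighted_line_fit(xs, ys, ws):
    """Weighted least-squares fit of y = slope * x + intercept."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def median(values):
    s = sorted(values)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else 0.5 * (s[mid - 1] + s[mid])


def huber_irls(xs, ys, k=1.345, iterations=50):
    """Alternate between a weighted fit and a Huber re-weighting of the residuals."""
    ws = [1.0] * len(xs)
    slope = intercept = scale = 0.0
    for _ in range(iterations):
        slope, intercept = weighted_line_fit(xs, ys, ws)
        residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
        # Robust spread estimate from the median absolute residual.
        scale = max(median(abs(r) for r in residuals) / 0.6745, 1e-12)
        cutoff = k * scale
        # Huber weights: full weight for small residuals, reduced weight for large ones.
        ws = [1.0 if abs(r) <= cutoff else cutoff / abs(r) for r in residuals]
    return slope, intercept, scale


# Toy data (assumed): y = 2x + 1 with small alternating noise and one gross outlier.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + 0.1 * ((-1) ** i) for i, x in enumerate(xs)]
ys[10] += 50.0

ols_slope, ols_intercept = weighted_line_fit(xs, ys, [1.0] * len(xs))
rob_slope, rob_intercept, rob_scale = huber_irls(xs, ys)
print("ordinary LS:", ols_slope, ols_intercept)
print("Huber IRLS: ", rob_slope, rob_intercept, "scale", rob_scale)
```

In this toy case the single outlier pulls the ordinary least-squares intercept well away from 1, while the re-weighted fit stays close to the underlying line.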


Effect Of Trigonometric Transformations On The Machine Learning Prediction And Quality Control Of Air Temperature, Andrea Fenoglio [*], Torrey J. Wagner, Paul Auclair, Brent T. Langhals Jan 2022


Conducting effective quality control of weather observations in real time is vital to the 14th Weather Squadron’s mission of providing authoritative climate data. This study explored automated quality control of weather observations by applying multiple machine learning techniques to 43,487 surface weather observations from 5 years of data at a single location. Temperature predictors were evaluated using recursive feature elimination on linear regression and XGBoost algorithms, as well as using a neural network hyperparameter sweep. Modeling was repeated after calculating trigonometric transforms of temporal variables to give the models insight into the diurnal heating cycle of the Earth. All models …
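
A common way to apply trigonometric transforms to temporal variables is to map each value onto a circle with a sine and cosine pair, so that the end of a cycle sits next to its start. The sketch below shows that encoding for hour of day. The function names and the choice of a 24-hour period are assumptions for illustration and may differ from the transforms used in the study.

```python
import math


def cyclical_encode(value, period):
    """Map a periodic value (e.g. hour of day) to a (sin, cos) pair on the unit circle."""
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def encoded_distance(a, b, period):
    """Euclidean distance between two encoded values."""
    sa, ca = cyclical_encode(a, period)
    sb, cb = cyclical_encode(b, period)
    return math.hypot(sa - sb, ca - cb)


# Raw hours 23 and 0 differ by 23, but their encodings are neighbours.
for hour in (0, 6, 12, 18, 23):
    s, c = cyclical_encode(hour, 24)
    print(f"hour {hour:2d}: sin={s:+.3f} cos={c:+.3f}")
print("distance 23h to 0h:", round(encoded_distance(23, 0, 24), 3))
print("distance 12h to 0h:", round(encoded_distance(12, 0, 24), 3))
```

The same function could be applied to day of year with a period of roughly 365 to represent the seasonal cycle.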


Implications Of The Quantum DNA Model For Information Sciences, F. Matthew Mihelic Apr 2021


The DNA molecule can be modeled as a quantum logic processor, and this model has been supported by pilot research that experimentally demonstrated non-local communication between cells in separated cell cultures. This modeling and pilot research have important implications for information sciences, providing a potential architecture for quantum computing that operates at room temperature and is scalable to millions of qubits, and including the potential for an entanglement communication system based upon the quantum DNA architecture. Such a system could be used to provide non-local quantum key distribution that could not be blocked by any shielding or water depth, would …


An Underground Radio Wave Propagation Prediction Model For Digital Agriculture, Abdul Salam Apr 2019


Underground sensing and propagation of Signals in the Soil (SitS) medium is an electromagnetic issue. Path loss prediction with higher accuracy is an open research subject in digital agriculture monitoring applications for sensing and communications. The statistical data are predominantly derived from site-specific empirical measurements, which is considered an impediment to universal application. Nevertheless, in the existing literature, statistical approaches have been applied to SitS channel modeling, where impulse response analysis and the Friis open space transmission formula are employed as the channel modeling tools in different soil types under varying soil moisture conditions at diverse communication distances …
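
For reference, the Friis free-space transmission formula mentioned above relates received power to transmitted power, antenna gains, wavelength, and distance. The sketch below evaluates it and the corresponding free-space path loss in decibels. It covers free space only: soil attenuation and moisture effects, which are the subject of the paper, are not modeled, and the frequency and distance values are arbitrary examples.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def friis_received_power_w(pt_w, gt, gr, freq_hz, dist_m):
    """Friis formula: Pr = Pt * Gt * Gr * (wavelength / (4 * pi * d))^2 (linear units)."""
    wavelength = SPEED_OF_LIGHT / freq_hz
    return pt_w * gt * gr * (wavelength / (4.0 * math.pi * dist_m)) ** 2


def free_space_path_loss_db(freq_hz, dist_m):
    """Free-space path loss in dB for unit-gain antennas."""
    return 20.0 * math.log10(4.0 * math.pi * dist_m * freq_hz / SPEED_OF_LIGHT)


# Example values (assumed): 2.4 GHz link over 1 km with unit-gain antennas.
loss_db = free_space_path_loss_db(2.4e9, 1000.0)
pr_w = friis_received_power_w(1.0, 1.0, 1.0, 2.4e9, 1000.0)
print("path loss (dB):", round(loss_db, 2))
print("received power (W) for 1 W transmitted:", pr_w)
```

Because loss grows with the square of distance, doubling the distance adds about 6 dB in free space; any soil-specific model would add further attenuation on top of this baseline.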


A Theoretical Model Of Underground Dipole Antennas For Communications In Internet Of Underground Things, Abdul Salam, Mehmet C. Vuran, Xin Dong, Christos Argyropoulos, Suat Irmak Feb 2019


The realization of Internet of Underground Things (IOUT) relies on the establishment of reliable communication links, where the antenna becomes a major design component due to the significant impacts of soil. In this paper, a theoretical model is developed to capture the impacts of change of soil moisture on the return loss, resonant frequency, and bandwidth of a buried dipole antenna. Experiments are conducted in silty clay loam, sandy, and silt loam soil, to characterize the effects of soil, in an indoor testbed and field testbeds. It is shown that at subsurface burial depths (0.1-0.4 m), change in soil moisture impacts …


Sequence Pattern Mining With Variables, James S. Okolica, Gilbert L. Peterson, Robert F. Mills, Michael R. Grimaila Nov 2018


Sequence pattern mining (SPM) seeks to find multiple items that commonly occur together in a specific order. One common assumption is that all of the relevant differences between items are captured through creating distinct items, e.g., if color matters then the same item in two different colors would have two items created, one for each color. In some domains, that is unrealistic. This paper makes two contributions. The first extends SPM algorithms to allow item differentiation through attribute variables for domains with large numbers of items, e.g., by having one item with a variable with a color attribute rather than …


The Effectiveness Of Using Diversity To Select Multiple Classifier Systems With Varying Classification Thresholds, Harris K. Butler IV, Mark A. Friend, Kenneth W. Bauer, Trevor J. Bihl Sep 2018


In classification applications, the goal of fusion techniques is to exploit complementary approaches and merge the information provided by these methods to provide a solution superior to any single method. Associated with choosing a methodology to fuse pattern recognition algorithms is the choice of algorithm or algorithms to fuse. Historically, classifier ensemble accuracy has been used to select which pattern recognition algorithms are included in a multiple classifier system. More recently, research has focused on creating and evaluating diversity metrics to more effectively select ensemble members. Using a wide range of classification data sets, methodologies, and fusion techniques, current diversity …


Cyber Anomaly Detection: Using Tabulated Vectors And Embedded Analytics For Efficient Data Mining, Robert J. Gutierrez, Kenneth W. Bauer, Bradley C. Boehmke, Cade M. Saie, Trevor J. Bihl Aug 2018


Firewalls, especially at large organizations, process high-velocity internet traffic and flag suspicious events and activities. Flagged events can be benign, such as misconfigured routers, or malignant, such as a hacker trying to gain access to a specific computer. Compounding the problem, the danger of a flagged event is not always obvious, and the traffic arrives at high velocity. Current work in firewall log analysis is manually intensive and requires many man-hours to find events to investigate. This is predominantly achieved by manually sorting firewall and intrusion detection/prevention system log data. This work aims to improve the ability of …


Methods For Real-Time Prediction Of The Mode Of Travel Using Smartphone-Based GPS And Accelerometer Data, Bryan D. Martin, Vittorio Addona, Julian Wolfson, Gediminas Adomavicius, Yingling Fan Sep 2017


We propose and compare combinations of several methods for classifying transportation activity data from smartphone GPS and accelerometer sensors. We have two main objectives. First, we aim to classify our data as accurately as possible. Second, we aim to reduce the dimensionality of the data as much as possible in order to reduce the computational burden of the classification. We combine dimension reduction and classification algorithms and compare them with a metric that balances accuracy and dimensionality. In doing so, we develop a classification algorithm that accurately classifies five different modes of transportation (i.e., walking, biking, car, bus and rail) …


Impact Of Reviewer Social Interaction On Online Consumer Review Fraud Detection, Kunal Goswami, Younghee Park, Chungsik Song Jan 2017


Background: Online consumer reviews have become a baseline for new consumers to try out a business or a new product. The reviews provide a quick look into the application and experience of the business/product and market it to new customers. However, some businesses or reviewers use these reviews to spread fake information about the business/product. The fake information can be used to promote a relatively average product/business or can be used to malign their competition. This activity is known as reviewer fraud or opinion spam. The paper proposes a feature set, capturing the user social interaction behavior to identify fraud. …


An Evolutionary Algorithm To Generate Hyper-Ellipsoid Detectors For Negative Selection, Joseph M. Shapiro, Gary B. Lamont, Gilbert L. Peterson Jun 2005


This paper introduces hyper-ellipsoids as an improvement to hyper-spheres as intrusion detectors in a negative selection problem within an artificial immune system. Since hyper-spheres are a specialization of hyper-ellipsoids, hyper-ellipsoids retain the benefits of hyper-spheres. However, hyper-ellipsoids are much more flexible, mostly in that they can be stretched and reoriented. The viability of using hyper-ellipsoids is established using several pedagogical problems. We conjecture that fewer hyper-ellipsoids than hyper-spheres are needed to achieve similar coverage of nonself space in a negative selection problem. Experimentation validates this conjecture. In pedagogical benchmark problems, the number of hyper-ellipsoids to achieve good results is significantly …
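
The geometric point in this abstract, that a hyper-sphere is a special case of a hyper-ellipsoid which can additionally be stretched and reoriented, can be shown with a simple membership test. In the sketch below an ellipsoid is described by a center, a set of orthonormal axis directions, and one radius per axis. This representation, the function name, and the example points are assumptions for illustration; the paper's detector encoding and its evolutionary algorithm are not reproduced here.

```python
import math


def inside_hyper_ellipsoid(point, center, axes, radii):
    """True if the point lies inside or on the ellipsoid.

    axes is a list of orthonormal direction vectors and radii gives the
    semi-axis length along each one. Equal radii yield a hyper-sphere.
    """
    offset = [p - c for p, c in zip(point, center)]
    total = 0.0
    for axis, radius in zip(axes, radii):
        projection = sum(o * a for o, a in zip(offset, axis))
        total += (projection / radius) ** 2
    return total <= 1.0


r = 1.0 / math.sqrt(2.0)
diagonal_axes = [(r, r), (-r, r)]           # axes rotated by 45 degrees
identity_axes = [(1.0, 0.0), (0.0, 1.0)]
origin = (0.0, 0.0)

# A unit sphere (circle in 2D) versus an ellipse stretched along the diagonal.
sphere_hit = inside_hyper_ellipsoid((2.0, 2.0), origin, identity_axes, (1.0, 1.0))
ellipse_hit = inside_hyper_ellipsoid((2.0, 2.0), origin, diagonal_axes, (3.0, 0.5))
ellipse_miss = inside_hyper_ellipsoid((2.0, -2.0), origin, diagonal_axes, (3.0, 0.5))
print(sphere_hit, ellipse_hit, ellipse_miss)
```

A single stretched, rotated ellipse covers the diagonal point that the unit circle misses, which gives some intuition for why fewer ellipsoids than spheres might cover an elongated region.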