Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 30 of 72
Full-Text Articles in Physical Sciences and Mathematics
Crosshair Optimizer, Jason Torrence
All Master's Theses
Metaheuristic optimization algorithms are heuristics that are capable of creating a "good enough" solution to a computationally complex problem. Algorithms in this area of study are focused on the process of exploration and exploitation: exploration of the solution space and exploitation of the results that have been found during that exploration, with most resources going toward the former half of the process. The novel Crosshair optimizer developed in this thesis seeks to take advantage of the latter, exploiting the best possible result as much as possible by directly searching the area around that best result with a stochastic approach. This …
Weighted Incremental–Decremental Support Vector Machines For Concept Drift With Shifting Window, Honorius Gâlmeanu, Răzvan Andonie
Computer Science Faculty Scholarship
We study the problem of learning the data samples’ distribution as it changes in time. This change, known as concept drift, complicates the task of training a model, as the predictions become less and less accurate. It is known that Support Vector Machines (SVMs) can learn weighted input instances and that they can also be trained online (incremental–decremental learning). Combining these two SVM properties, the open problem is to define an online SVM concept drift model with a shifting weighted window. The classic SVM model should be retrained from scratch after each window shift. We introduce the Weighted Incremental–Decremental SVM (WIDSVM), …
Information Bottleneck In Deep Learning - A Semiotic Approach, Bogdan Musat, Razvan Andonie
Computer Science Faculty Scholarship
The information bottleneck principle was recently proposed as a theory meant to explain some of the training dynamics of deep neural architectures. Via information plane analysis, patterns start to emerge in this framework, where two phases can be distinguished: fitting and compression. We take a step further and study the behaviour of the spatial entropy characterizing the layers of convolutional neural networks (CNNs), in relation to the information bottleneck theory. We observe pattern formations which resemble the information bottleneck fitting and compression phases. From the perspective of semiotics, also known as the study of signs and sign-using behavior, the saliency …
Interpretable Machine Learning For Self-Service High-Risk Decision Making, Charles Recaido
All Master's Theses
This research contributes to interpretable machine learning via visual knowledge discovery in General Line Coordinates (GLC). The concepts of hyperblocks as interpretable dataset units and GLC are combined to create a visual self-service machine learning model. Two variants of GLC known as Dynamic Scaffold Coordinates (DSC) are proposed. DSC1 and DSC2 can map in a lossless manner multiple dataset attributes to a single two-dimensional (X, Y) Cartesian plane using a dynamic scaffolding graph construction algorithm.
Hyperblock analysis is used to determine visually appealing dataset attribute orders and to reduce line occlusion. It is shown that hyperblocks can generalize decision tree …
Concept Drift Adaptation With Incremental–Decremental SVM, Honorius Gâlmeanu, Răzvan Andonie
Computer Science Faculty Scholarship
Data classification in streams where the underlying distribution changes over time is known to be difficult. This problem—known as concept drift detection—involves two aspects: (i) detecting the concept drift and (ii) adapting the classifier. Online training only considers the most recent samples; they form the so-called shifting window. Dynamic adaptation to concept drift is performed by varying the width of the window. Defining an online Support Vector Machine (SVM) classifier able to cope with concept drift by dynamically changing the window size and avoiding retraining from scratch is currently an open problem. We introduce the Adaptive Incremental–Decremental SVM (AIDSVM), a …
Learning In Convolutional Neural Networks Accelerated By Transfer Entropy, Adrian Moldovan, Angel Caţaron, Răzvan Andonie
Computer Science Faculty Scholarship
Recently, there has been growing interest in applying Transfer Entropy (TE) to quantify the effective connectivity between artificial neurons. In a feedforward network, the TE can be used to quantify the relationships between neuron output pairs located in different layers. Our focus is on how to include the TE in the learning mechanisms of a Convolutional Neural Network (CNN) architecture. We introduce a novel training mechanism for CNN architectures which integrates the TE feedback connections. Adding the TE feedback parameter accelerates the training process, as fewer epochs are needed. On the flip side, it adds computational overhead to each epoch. …
Energy Optimization In Multi-UAV-Assisted Edge Data Collection System, Bin Xu, Lu Zhang, Zipeng Xu, Yichuan Liu, Jinming Chai, Sichong Qin, Yanfei Sun
Student Published Works
In the IoT (Internet of Things) system, the introduction of UAV (Unmanned Aerial Vehicle) as a new data collection platform can solve the problem that IoT devices are unable to transmit data over long distances due to the limitation of their battery energy. However, the unreasonable distribution of UAVs will still lead to the problem of the high total energy consumption of the system. In this work, to deal with the problem, a deployment model of a mobile edge computing (MEC) system based on multi-UAV is proposed. The goal of the model is to minimize the energy consumption of the …
Interactive Visual Self-Service Data Classification Approach To Democratize Machine Learning, Sridevi Narayana Wagle
All Master's Theses
Machine learning algorithms often produce models considered as complex black-box models by both end users and developers. Such algorithms fail to explain the model in terms of the domain they are designed for. The proposed Iterative Visual Logical Classifier (IVLC) is an interpretable machine learning algorithm that allows end users to design a model and classify data with more confidence and without having to compromise on accuracy. Such a technique is especially helpful when dealing with sensitive and crucial data, like cancer data in the medical domain, where the cost of errors is high. With the help of the proposed interactive and …
Visualization For Solving Non-Image Problems And Saliency Mapping, Divya Chandrika Kalla
All Master's Theses
High-dimensional data play an important role in knowledge discovery and data science. Integration of visualization, visual analytics, machine learning (ML), and data mining (DM) is a key aspect of data science research for high-dimensional data. This thesis explores the efficiency of a new algorithm that converts non-image data into raster images by visualizing the data as heatmaps in collocated paired coordinates (CPC). These images are called CPC-R images, and the algorithm that produces them is called the CPC-R algorithm. Powerful deep learning methods open an opportunity to solve non-image ML/DM problems by transforming non-image ML problems into …
Semiotic Aggregation In Deep Learning, Bogdan Muşat, Răzvan Andonie
All Faculty Scholarship for the College of the Sciences
Convolutional neural networks utilize a hierarchy of neural network layers. The statistical aspects of information concentration in successive layers can offer insight into the feature abstraction process. We analyze the saliency maps of these layers from the perspective of semiotics, also known as the study of signs and sign-using behavior. In computational semiotics, this aggregation operation (known as superization) is accompanied by a decrease of spatial entropy: signs are aggregated into supersigns. Using spatial entropy, we compute the information content of the saliency maps and study the superization processes which take place between successive layers of the network. In …
Modeling Multi-Targets Sentiment Classification Via Graph Convolutional Networks And Auxiliary Relation, Ao Feng, Zhengjie Gao, Xinyu Song, Ke Ke, Tianhao Xu, Xuelei Zhang
All Faculty Scholarship for the College of the Sciences
Existing solutions do not work well when multiple targets coexist in a sentence. The reason is that the existing solutions usually separate the targets and process them one at a time. If the original sentence has N targets, the sentence is repeated N times, and only one target is processed each time. To some extent, this approach degenerates the fine-grained sentiment classification task into the sentence-level sentiment classification task, and processing the targets separately ignores the internal relations and interactions between them. Based on the above considerations, we propose to use Graph Convolutional …
Weighted Random Search For CNN Hyperparameter Optimization, Rǎzvan Andonie, Adrian-Cǎtǎlin Florea
All Faculty Scholarship for the College of the Sciences
Nearly all model algorithms used in machine learning use two different sets of parameters: the training parameters and the meta-parameters (hyperparameters). While the training parameters are learned during the training phase, the values of the hyperparameters have to be specified before learning starts. For a given dataset, we would like to find the optimal combination of hyperparameter values, in a reasonable amount of time. This is a challenging task because of its computational complexity. In previous work, we introduced the Weighted Random Search (WRS) method, a combination of Random Search (RS) and probabilistic greedy heuristic. In the current paper, we …
Learning In Feedforward Neural Networks Accelerated By Transfer Entropy, Adrian Moldovan, Angel Caţaron, Rǎzvan Andonie
All Faculty Scholarship for the College of the Sciences
Current neural network architectures are often harder to train because of the increasing size and complexity of the datasets used. Our objective is to design more efficient training algorithms utilizing causal relationships inferred from neural networks. The transfer entropy (TE) was initially introduced as an information transfer measure used to quantify the statistical coherence between events (time series). Later, it was related to causality, even if they are not the same. There are only a few papers reporting applications of causality or TE in neural networks. Our contribution is an information-theoretical method for analyzing information transfer between the nodes of …
Using CUDA To Enhance Data Processing Of Variant Call Format Files For Statistical Genetic Analysis, Heather Mckinnon
All Graduate Projects
Utilizing the power of GPU parallel processing with CUDA can speed up the processing of Variant Call Format (VCF) files and statistical analysis of genomic data. A software package designed toward this purpose would be beneficial to genetic researchers by saving them time which they could spend on other aspects of their research. A data set containing genetics from a study of trichome production in Mimulus guttatus, or yellow monkey flower, was used to develop a package to test the effectiveness of GPU parallel processing versus serial executions. After a serial version of the code was generated and benchmarked, OpenACC …
Optimizing Pollution Routing Problem, Shivika Dewan
All Master's Theses
Pollution is a major environmental issue around the world. Given the growing use and impact of commercial vehicles, recent research has treated minimizing pollution as the primary objective. The objective of this project is to implement different optimization algorithms to solve this problem. A basic model is created using the Vehicle Routing Problem (VRP) which is further extended to the Pollution Routing Problem (PRP). The basic model is updated using a Monte Carlo Algorithm (MCA). The data set contains 180 data files with a combination of 10, 15, 20, 25, 50, 75, 100, 150, and …
Image Features For Tuberculosis Classification In Digital Chest Radiographs, Brian Hooper
All Master's Theses
Tuberculosis (TB) is a respiratory disease which affects millions of people each year, ranking as the tenth leading cause of death worldwide, and is especially prevalent in underdeveloped regions where access to adequate medical care may be limited. Analysis of digital chest radiographs (CXRs) is a common and inexpensive method for the diagnosis of TB; however, a trained radiologist is required to interpret the results, and the interpretation is subject to human error. Computer-Aided Detection (CAD) systems are a promising machine-learning-based solution to automate the diagnosis of TB from CXR images. As the dimensionality of a high-resolution CXR image is very …
Toward Efficient Automation Of Interpretable Machine Learning Boosting, Nathan Neuhaus
All Master's Theses
Developing efficient automated methods for Interpretable Machine Learning (IML) is an important and long-term goal in the field of Artificial Intelligence. Currently the Machine Learning landscape is dominated by Neural Networks (NNs) and Support Vector Machines (SVMs), models which are often highly accurate. Despite high accuracy, such models are essentially “black boxes” and therefore are too risky for situations like healthcare where real lives are at stake. In such situations, so-called “glass-box” models, such as Decision Trees (DTs), Bayesian Networks (BNs), and Logic Relational (LR) models, are often preferred, although they can suffer from limited accuracy. Unfortunately, having to choose …
Automated Morgan Keenan Classification Of Observed Stellar Spectra Collected By The Sloan Digital Sky Survey Using A Single Classifier, Michael J. Brice, Răzvan Andonie
All Faculty Scholarship for the College of the Sciences
The classification of stellar spectra is a fundamental task in stellar astrophysics. Standard classification methods, k-nearest neighbors and random forest, are applied to stellar spectra from the Sloan Digital Sky Survey to classify the spectra automatically. Stellar spectra are high-dimensional data, and the dimensionality is reduced using astronomical knowledge because classifiers work in low-dimensional space. These methods are utilized to classify the stellar spectra into a complete Morgan Keenan classification (spectral and luminosity) using a single classifier. The motion of stars (radial velocity) causes machine-learning complications through the feature matrix when classifying stellar spectra. Due to the nature …
Weighted Random Search For Hyperparameter Optimization, Adrian-Cǎtǎlin Florea, Rǎzvan Andonie
All Faculty Scholarship for the College of the Sciences
We introduce an improved version of Random Search (RS), used here for hyperparameter optimization of machine learning algorithms. Unlike the standard RS, which generates for each trial new values for all hyperparameters, we generate new values for each hyperparameter with a probability of change. The intuition behind our approach is that a value that already triggered a good result is a good candidate for the next step, and should be tested in new combinations of hyperparameter values. Within the same computational budget, our method yields better results than the standard RS. Our theoretical results prove this statement. We test our …
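The idea in this abstract, resampling each hyperparameter only with some probability and otherwise reusing the value from the best trial so far, can be illustrated with a minimal sketch. This is not the authors' WRS algorithm: the abstract does not say how the probability of change is chosen, so the sketch assumes one fixed value, `p_change`, for every hyperparameter. The function name, the `space` format (name mapped to a sampler taking a random generator), and the toy objective are all hypothetical.

```python
import random


def probabilistic_random_search(objective, space, n_trials=50, p_change=0.5, seed=0):
    """Maximize `objective` by resampling each hyperparameter with probability p_change.

    Assumption: a single fixed p_change is shared by all hyperparameters.
    """
    rng = random.Random(seed)
    # First trial: sample every hyperparameter, as standard random search does.
    best = {name: sampler(rng) for name, sampler in space.items()}
    best_score = objective(best)
    for _ in range(n_trials - 1):
        # Each hyperparameter is either resampled or copied from the best trial so far.
        candidate = {
            name: sampler(rng) if rng.random() < p_change else best[name]
            for name, sampler in space.items()
        }
        score = objective(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


# Toy usage: the objective peaks at x = 1, y = -2 (hypothetical example).
toy_space = {
    "x": lambda rng: rng.uniform(-5.0, 5.0),
    "y": lambda rng: rng.uniform(-5.0, 5.0),
}
toy_objective = lambda h: -((h["x"] - 1.0) ** 2) - ((h["y"] + 2.0) ** 2)
best, best_score = probabilistic_random_search(toy_objective, toy_space, n_trials=200)
```

Setting `p_change=1.0` recovers standard random search, since every hyperparameter is then resampled on every trial.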
Classification Of Stars From Redshifted Stellar Spectra Utilizing Machine Learning, Michael J. Brice
All Master's Theses
The classification of stellar spectra is a fundamental task in stellar astrophysics. There have been many explorations into the automated classification of stellar spectra but few that involve the Sloan Digital Sky Survey (SDSS). Standard classification methods such as K-Nearest Neighbors, Random Forest, and Support Vector Machine are applied to stellar spectra from the SDSS to classify the spectra automatically. Stellar spectra are high-dimensional data, and the dimensionality is reduced using standard Feature Selection methods such as Chi-Squared and Fisher score, together with domain-specific astronomical knowledge, because classifiers work in low-dimensional space. These methods are utilized to classify …
Automatic Classification And Shift Detection Of Facial Expressions In Event-Aware Smart Environments, Arne Bernin, Larissa Müller, Sobin Ghose, Christos Grecos, Qi Wang, Ralf Jettke, Kai Von Luck, Florian Vogt
All Faculty Scholarship for the College of the Sciences
Affective application developers often face a challenge in integrating the output of facial expression recognition (FER) software in interactive systems: although many algorithms have been proposed for FER, integrating the results of these algorithms into applications remains difficult. Due to inter- and within-subject variations further post-processing is needed. Our work addresses this problem by introducing and comparing three post-processing classification algorithms for FER output applied to an event-based interaction scheme to pinpoint the affective context within a time window. Our comparison is based on earlier published experiments with an interactive cycling simulation in which participants were provoked with game elements …
Transfer Information Energy: A Quantitative Indicator Of Information Transfer Between Time Series, Angel Caţaron, Rǎzvan Andonie
All Faculty Scholarship for the College of the Sciences
We introduce an information-theoretical approach for analyzing information transfer between time series. Rather than using the Transfer Entropy (TE), we define and apply the Transfer Information Energy (TIE), which is based on Onicescu’s Information Energy. Whereas the TE can be used as a measure of the reduction in uncertainty about one time series given another, the TIE may be viewed as a measure of the increase in certainty about one time series given another. We compare the TIE and the TE in two known time series prediction applications. First, we analyze stock market indexes from the Americas, Asia/Pacific and Europe, …
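The contrast between certainty and uncertainty drawn above can be made concrete on a single discrete variable. The sketch below is not the TIE itself, whose definition is not given in this abstract. It only computes the two underlying quantities for an empirical distribution: Onicescu's information energy, the sum of squared probabilities, and Shannon entropy. The function names are hypothetical.

```python
import math
from collections import Counter


def information_energy(samples):
    """Onicescu's information energy: sum of squared empirical probabilities."""
    n = len(samples)
    return sum((count / n) ** 2 for count in Counter(samples).values())


def shannon_entropy(samples):
    """Shannon entropy in bits of the empirical distribution."""
    n = len(samples)
    return -sum((count / n) * math.log2(count / n) for count in Counter(samples).values())


# A constant sequence has maximal energy (certainty) and zero entropy (uncertainty);
# a uniform sequence over two symbols has lower energy and higher entropy.
constant = "aaaa"
uniform = "aabb"
energies = (information_energy(constant), information_energy(uniform))
entropies = (shannon_entropy(constant), shannon_entropy(uniform))
```

The two measures move in opposite directions as a distribution concentrates, which matches the abstract's description of one as a measure of certainty and the other as a measure of uncertainty.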
Retrospective Analysis And Prediction: Artificial Intelligence And Its Applications In Libraries, Ping Fu
Library Scholarship
The application of Artificial Intelligence (AI) has brought significant innovation to fundamental science and research in recent years. This paper briefly reviews and analyzes the findings of research and development of AI technologies such as expert systems, natural language processing, pattern recognition, robotics and machine learning in library fields such as information retrieval, reference service, cataloging, classification, acquisitions, circulation and automation. By reviewing and analyzing research papers published in respected academic journals, and studying the examples and practical cases of the latest AI applications in industry, this study finds that current AI applications in the library field are …
Deep Learning Of 2-D Images Representing N-D Data In General Line Coordinates, Dmytro Dovhalets, Boris Kovalerchuk, Szilárd Vajda, Răzvan Andonie
Computer Science Faculty Scholarship
While knowledge discovery and n-D data visualization procedures are often efficient, the loss of information, occlusion, and clutter continue to be a challenge. General Line Coordinates (GLC) is a rather new technique to deal with such artifacts. GLC-Linear, which is one of the methods in GLC, allows transforming n-D numerical data to their visual representation as polylines losslessly. The method proposed in this paper uses these 2-D visual representations as input to Convolutional Neural Network (CNN) classifiers. The obtained classification accuracies are close to the ones obtained by other machine learning algorithms. The main benefit of the method is the …
Looking At Faces In The Wild, Eugene Borovikov, Szilárd Vajda, Michael Bonifant, Michael Gill
Computer Science Faculty Scholarship
Recent advances in face detection (FD) and recognition (FR) technology may give an impression that the problem of face matching is essentially solved, e.g. via deep learning models using thousands of samples per face for training and validation on the available benchmark data-sets. The human vision system seems to handle the face localization and matching problem differently from modern FR systems, since humans detect faces instantly even in the most cluttered environments, and often require a single view of a face to reliably distinguish it from all others. This prompted us to take a biologically inspired look at building a cognitive …
Data Visualization And Classification Of Artificially Created Images, Dmytro Dovhalets
All Master's Theses
Visualization of multidimensional data is a long-standing challenge in machine learning and knowledge discovery. A problem arises as soon as four dimensions are introduced, since we live in a 3-dimensional world. Existing methods can visualize multidimensional data, but loss of information and clutter are still a problem. General Line Coordinates (GLC) can losslessly project n-dimensional data into 2 dimensions. A new method based on GLC, called GLC-L, is introduced. This new method can do interactive visualization, dimension reduction, and supervised learning. One of the applications of GLC-L is transformation of vector data into image data. This novel …
Decreasing Occlusion And Increasing Explanation In Interactive Visual Knowledge Discovery, Abdulrahman Ahmed Gharawi
All Master's Theses
Lack of explanation and occlusion are the major problems for interactive visual knowledge discovery, machine learning and data mining in multidimensional data. This thesis proposes a hybrid method that combines visual and analytical means to deal with these problems. This method, denoted as FSP, uses visualization of n-D data in 2-D in a set of Shifted Paired Coordinates (SPC). SPC for n-D data consists of n/2 pairs of Cartesian coordinates that are shifted relative to each other to avoid their overlap. Each n-D point is represented as a directed graph in SPC. It is shown that the FSP method simplifies …
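The SPC construction described above (n/2 coordinate pairs, each drawn in its own shifted Cartesian system, with the n-D point shown as a directed graph) can be sketched as follows. This is an illustrative reading of the abstract, not the thesis's implementation. The function name and the explicit `shifts` argument are assumptions, and the sketch handles even n only.

```python
def shifted_paired_coordinates(point, shifts):
    """Map an n-D point (n even) to a directed path of 2-D nodes.

    Pair k = (point[2k], point[2k+1]) is drawn in a Cartesian system whose
    origin is moved by shifts[k]; consecutive nodes are joined by directed edges.
    """
    if len(point) % 2 != 0:
        raise ValueError("this sketch handles an even number of dimensions only")
    if len(shifts) != len(point) // 2:
        raise ValueError("one (dx, dy) shift is required per coordinate pair")
    nodes = [
        (point[2 * k] + dx, point[2 * k + 1] + dy)
        for k, (dx, dy) in enumerate(shifts)
    ]
    # Directed edges follow the order of the pairs: node 0 -> node 1 -> ...
    edges = list(zip(nodes, nodes[1:]))
    return nodes, edges


# A 4-D point becomes two nodes and one directed edge (hypothetical shifts).
nodes, edges = shifted_paired_coordinates(
    point=(0.5, 0.25, 0.75, 1.0),
    shifts=[(0.0, 0.0), (2.0, 0.0)],
)
```

Because each pair keeps both of its values and the shifts are known, the original n-D point can be read back from the node positions, which is what makes this kind of representation lossless.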
Spike-Based Classification Of UCI Datasets With Multi-Layer Resume-Like Tempotron, Sami Abdul-Wahid
All Master's Theses
Spiking neurons are a class of neuron models that represent information in timed sequences called "spikes." Though predominantly used in neuro-scientific investigations, spiking neural networks (SNN) can be applied to machine learning problems such as classification and regression. SNN are computationally more powerful per neuron than traditional neural networks. Though training time is slow on general purpose computers, spike-based hardware implementations are faster and have shown capability for ultra-low power consumption. Additionally, various SNN training algorithms have achieved comparable performance with the state of the art on the Fisher Iris dataset. Our main contribution is a software implementation of the …
Asymptotically Unbiased Estimation Of A Nonsymmetric Dependence Measure Applied To Sensor Data Analytics And Financial Time Series, Angel Caţaron, Razvan Andonie, Yvonne Chueh
All Faculty Scholarship for the College of the Sciences
A fundamental concept frequently applied to statistical machine learning is the detection of dependencies between unknown random variables found from data samples. In previous work, we have introduced a nonparametric unilateral dependence measure based on Onicescu’s information energy and a kNN method for estimating this measure from an available sample set of discrete or continuous variables. This paper provides the formal proofs which show that the estimator is asymptotically unbiased and has asymptotic zero variance when the sample size increases. It implies that the estimator has good statistical qualities. We investigate the performance of the estimator for data analysis applications …
Constructing Interactive Visual Classification, Clustering And Dimension Reduction Models For N-D Data, Boris Kovalerchuk, Dmytro Dovhalets
Computer Science Faculty Scholarship
The exploration of multidimensional datasets of all possible sizes and dimensions is a long-standing challenge in knowledge discovery, machine learning, and visualization. While multiple efficient visualization methods for n-D data analysis exist, the loss of information, occlusion, and clutter continue to be a challenge. This paper proposes and explores a new interactive method for visual discovery of n-D relations for supervised learning. The method includes automatic, interactive, and combined algorithms for discovering linear relations, dimension reduction, and generalization for non-linear relations. This method is a special category of reversible General Line Coordinates (GLC). It produces graphs in 2-D that represent …