Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Open Access. Powered by Scholars. Published by Universities.®
- Discipline
-
- Computer Sciences (13)
- Business (3)
- Databases and Information Systems (3)
- Management Information Systems (3)
- Data Science (2)
-
- Social and Behavioral Sciences (2)
- Artificial Intelligence and Robotics (1)
- Biogeochemistry (1)
- Computer Engineering (1)
- Earth Sciences (1)
- Engineering (1)
- Environmental Sciences (1)
- Geography (1)
- Mathematics (1)
- Public Affairs, Public Policy and Public Administration (1)
- Remote Sensing (1)
- Statistical Methodology (1)
- Statistics and Probability (1)
- Transportation (1)
- Water Resource Management (1)
Articles 1 - 14 of 14
Full-Text Articles in Physical Sciences and Mathematics
A Remote Sensing And Machine Learning-Based Approach To Forecast The Onset Of Harmful Algal Bloom (Red Tides), Moein Izadi
A Remote Sensing And Machine Learning-Based Approach To Forecast The Onset Of Harmful Algal Bloom (Red Tides), Moein Izadi
Dissertations
In the last few decades, harmful algal blooms (HABs, also known as “red tides”) have become one of the most detrimental natural phenomena all around the world especially in Florida’s coastal areas due to local environmental factors and global warming in a larger scale. Karenia brevis produces toxins that have harmful effects on humans, fisheries, and ecosystems. In this study, I developed and compared the efficiency of state-of-the-art machine learning models (e.g., XGBoost, Random Forest, and Support Vector Machine) in predicting the occurrence of HABs. In the proposed models, the K. brevis abundance is used as the target, and 10 …
Hierarchical Aggregation Of Multidimensional Data For Efficient Data Mining, Safaa Khalil Alwajidi
Hierarchical Aggregation Of Multidimensional Data For Efficient Data Mining, Safaa Khalil Alwajidi
Dissertations
Big data analysis is essential for many smart applications in areas such as connected healthcare, intelligent transportation, human activity recognition, environment, and climate change monitoring. Traditional data mining algorithms do not scale well to big data due to the enormous number of data points and the velocity of their generation. Mining and learning from big data need time and memory efficiency techniques, albeit the cost of possible loss in accuracy. This research focuses on the mining of big data using aggregated data as input. We developed a data structure that is to be used to aggregate data at multiple resolutions. …
Hybrid Deep Neural Networks For Mining Heterogeneous Data, Xiurui Hou
Hybrid Deep Neural Networks For Mining Heterogeneous Data, Xiurui Hou
Dissertations
In the era of big data, the rapidly growing flood of data represents an immense opportunity. New computational methods are desired to fully leverage the potential that exists within massive structured and unstructured data. However, decision-makers are often confronted with multiple diverse heterogeneous data sources. The heterogeneity includes different data types, different granularities, and different dimensions, posing a fundamental challenge in many applications. This dissertation focuses on designing hybrid deep neural networks for modeling various kinds of data heterogeneity.
The first part of this dissertation concerns modeling diverse data types, the first kind of data heterogeneity. Specifically, image data and …
Applied Deep Learning In Intelligent Transportation Systems And Embedding Exploration, Xiaoyuan Liang
Applied Deep Learning In Intelligent Transportation Systems And Embedding Exploration, Xiaoyuan Liang
Dissertations
Deep learning techniques have achieved tremendous success in many real applications in recent years and show their great potential in many areas including transportation. Even though transportation becomes increasingly indispensable in people’s daily life, its related problems, such as traffic congestion and energy waste, have not been completely solved, yet some problems have become even more critical. This dissertation focuses on solving the following fundamental problems: (1) passenger demand prediction, (2) transportation mode detection, (3) traffic light control, in the transportation field using deep learning. The dissertation also extends the application of deep learning to an embedding system for visualization …
Statistical Machine Learning Methods For Mining Spatial And Temporal Data, Fei Tan
Statistical Machine Learning Methods For Mining Spatial And Temporal Data, Fei Tan
Dissertations
Spatial and temporal dependencies are ubiquitous properties of data in numerous domains. The popularity of spatial and temporal data mining has thus grown with the increasing prevalence of massive data. The presence of spatial and temporal attributes not only provides complementary useful perspectives, but also poses new challenges to the representation and integration into the learning procedure. In this dissertation, the involved spatial and temporal dependencies are explored with three genres: sample-wise, feature-wise, and target-wise. A family of novel methodologies is developed accordingly for the dependency representation in respective scenarios.
First, dependencies among discrete, continuous and repeated observations are studied …
Development And Evaluation Of Machine Learning Algorithms For Biomedical Applications, Turki Talal Turki
Development And Evaluation Of Machine Learning Algorithms For Biomedical Applications, Turki Talal Turki
Dissertations
Gene network inference and drug response prediction are two important problems in computational biomedicine. The former helps scientists better understand the functional elements and regulatory circuits of cells. The latter helps a physician gain full understanding of the effective treatment on patients. Both problems have been widely studied, though current solutions are far from perfect. More research is needed to improve the accuracy of existing approaches.
This dissertation develops machine learning and data mining algorithms, and applies these algorithms to solve the two important biomedical problems. Specifically, to tackle the gene network inference problem, the dissertation proposes (i) new techniques …
Big Data Analytics In Computational Biology And Bioinformatics, Kevin Byron
Big Data Analytics In Computational Biology And Bioinformatics, Kevin Byron
Dissertations
Big data analytics in computational biology and bioinformatics refers to an array of operations including biological pattern discovery, classification, prediction, inference, clustering as well as data mining in the cloud, among others. This dissertation addresses big data analytics by investigating two important operations, namely pattern discovery and network inference.
The dissertation starts by focusing on biological pattern discovery at a genomic scale. Research reveals that the secondary structure in non-coding RNA (ncRNA) is more conserved during evolution than its primary nucleotide sequence. Using a covariance model approach, the stems and loops of an ncRNA secondary structure are represented as a …
Statistical Learning Methods For Mining Marketing And Biological Data, Jie Zhang
Statistical Learning Methods For Mining Marketing And Biological Data, Jie Zhang
Dissertations
Nowadays, the value of data has been broadly recognized and emphasized. More and more decisions are made based on data and analysis rather than solely on experience and intuition. With the fast development of networking, data storage, and data collection capacity, data have increased dramatically in industry, science and engineering domains, which brings both great opportunities and challenges. To take advantage of the data flood, new computational methods are in demand to process, analyze and understand these datasets.
This dissertation focuses on the development of statistical learning methods for online advertising and bioinformatics to model real world data with temporal …
Data Mining In Computational Proteomics And Genomics, Yang Song
Data Mining In Computational Proteomics And Genomics, Yang Song
Dissertations
This dissertation addresses data mining in bioinformatics by investigating two important problems, namely peak detection and structure matching. Peak detection is useful for biological pattern discovery while structure matching finds many applications in clustering and classification.
The first part of this dissertation focuses on elastic peak detection in 2D liquid chromatographic mass spectrometry (LC-MS) data used in proteomics research. These data can be modeled as a time series, in which the X-axis represents time points and the Y-axis represents intensity values. A peak occurs in a set of 2D LC-MS data when the sum of the intensity values in a …
Enhancing Web Marketing By Using Ontology, Xuan Zhou
Enhancing Web Marketing By Using Ontology, Xuan Zhou
Dissertations
The existence of the Web has a major impact on people's life styles. Online shopping, online banking, email, instant messenger services, search engines and bulletin boards have gradually become parts of our daily life. All kinds of information can be found on the Web. Web marketing is one of the ways to make use of online information. By extracting demographic information and interest information from the Web, marketing knowledge can be augmented by applying data mining algorithms. Therefore, this knowledge which connects customers to products can be used for marketing purposes and for targeting existing and potential customers. The Web …
Text Mining With Exploitation Of User's Background Knowledge : Discovering Novel Association Rules From Text, Xin Chen
Dissertations
The goal of text mining is to find interesting and non-trivial patterns or knowledge from unstructured documents. Both objective and subjective measures have been proposed in the literature to evaluate the interestingness of discovered patterns. However, objective measures alone are insufficient because such measures do not consider knowledge and interests of the users. Subjective measures require explicit input of user expectations which is difficult or even impossible to obtain in text mining environments.
This study proposes a user-oriented text-mining framework and applies it to the problem of discovering novel association rules from documents. The developed system, uMining, consists of two …
Pattern Discovery In Structural Databases With Applications To Bioinformatics, Sen Zhang
Pattern Discovery In Structural Databases With Applications To Bioinformatics, Sen Zhang
Dissertations
Frequent structure mining (FSM) aims to discover and extract patterns frequently occurring in structural data such as trees and graphs. FSM finds many applications in bioinformatics, XML processing, Web log analysis, and so on. In this thesis, two new FSM techniques are proposed for finding patterns in unordered labeled trees. Such trees can be used to model evolutionary histories of different species, among others.
The first FSM technique finds cousin pairs in the trees. A cousin pair is a pair of nodes sharing the same parent, the same grandparent, or the same great-grandparent, etc. Given a tree T, our …
New Techniques For Improving Biological Data Quality Through Information Integration, Katherine Grace Herbert
New Techniques For Improving Biological Data Quality Through Information Integration, Katherine Grace Herbert
Dissertations
As databases become more pervasive through the biological sciences, various data quality concerns are emerging. Biological databases tend to develop data quality issues regarding data legacy, data uniformity and data duplication. Due to the nature of this data, each of these problems is non-trivial and can cause many problems for the database. For biological data to be corrected and standardized, methods and frameworks must be developed to handle both structural and traditional data.
The BIG-AJAX framework has been developed for solving these problems through both data cleaning and data integration. This framework exploits declarative data cleaning and exploratory data mining …
Knowledge Discovery In Biological Databases : A Neural Network Approach, Qicheng Ma
Knowledge Discovery In Biological Databases : A Neural Network Approach, Qicheng Ma
Dissertations
Knowledge discovery, in databases, also known as data mining, is aimed to find significant information from a set of data. The knowledge to be mined from the dataset may refer to patterns, association rules, classification and clustering rules, and so forth. In this dissertation, we present a neural network approach to finding knowledge in biological databases. Specifically, we propose new methods to process biological sequences in two case studies: the classification of protein sequences and the prediction of E. Coli promoters in DNA sequences. Our proposed methods, based oil neural network architectures combine techniques ranging from Bayesian inference, coding theory, …