Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 25 of 25
Full-Text Articles in Physical Sciences and Mathematics
A Framework For The Statistical Analysis Of Mass Spectrometry Imaging Experiments, Kyle Bemis
Open Access Dissertations
Mass spectrometry (MS) imaging is a powerful investigation technique for a wide range of biological applications such as molecular histology of tissue, whole body sections, and bacterial films, and biomedical applications such as cancer diagnosis. MS imaging visualizes the spatial distribution of molecular ions in a sample by repeatedly collecting mass spectra across its surface, resulting in complex, high-dimensional imaging datasets. Two of the primary goals of statistical analysis of MS imaging experiments are classification (for supervised experiments), i.e., assigning pixels to pre-defined classes based on their spectral profiles, and segmentation (for unsupervised experiments), i.e., assigning pixels to newly …
Low Rank Methods For Optimizing Clustering, Yangyang Hou
Open Access Dissertations
Complex optimization models and problems in machine learning often have the majority of information in a low rank subspace. By careful exploitation of these low rank structures in clustering problems, we find new optimization approaches that reduce the memory and computational cost.
We discuss two cases where this arises. First, we consider the NEO-K-Means (Non-Exhaustive, Overlapping K-Means) objective as a way to address overlapping and outliers in an integrated fashion. Optimizing this discrete objective is NP-hard, and even though there is a convex relaxation of the objective, straightforward convex optimization approaches are too expensive for large datasets. We utilize low …
Differentially Private Data Publishing For Data Analysis, Dong Su
Open Access Dissertations
In the information age, vast amounts of sensitive personal information are collected by companies, institutions and governments. A key technological challenge is how to design mechanisms for effectively extracting knowledge from data while preserving the privacy of the individuals involved. In this dissertation, we address this challenge from the perspective of differentially private data publishing. Firstly, we propose PrivPfC, a differentially private method for releasing data for classification. The key idea underlying PrivPfC is to privately select, in a single step, a grid, which partitions the data domain into a number of cells. This selection is done using the exponential …
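The exponential mechanism mentioned in this abstract can be illustrated with a minimal sketch. It is not the PrivPfC implementation: the candidate grids, their utility scores, and the sensitivity value below are hypothetical, and the sketch only shows the general rule of sampling a candidate with probability proportional to exp(ε · utility / (2 · sensitivity)).

```python
import math
import random


def exponential_mechanism_probs(utilities, epsilon, sensitivity=1.0):
    """Return selection probabilities proportional to exp(eps * u / (2 * sensitivity))."""
    # Subtracting the maximum utility keeps the exponentials numerically stable
    # and does not change the normalized probabilities.
    u_max = max(utilities)
    weights = [math.exp(epsilon * (u - u_max) / (2.0 * sensitivity)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism_select(candidates, utilities, epsilon, sensitivity=1.0, rng=random):
    """Sample one candidate according to the exponential mechanism."""
    probs = exponential_mechanism_probs(utilities, epsilon, sensitivity)
    return rng.choices(candidates, weights=probs, k=1)[0]


# Hypothetical candidate grids and hypothetical utility scores (illustration only).
grids = ["2x2", "4x4", "8x8"]
scores = [0.60, 0.85, 0.70]

probs = exponential_mechanism_probs(scores, epsilon=1.0)
chosen = exponential_mechanism_select(grids, scores, epsilon=1.0, rng=random.Random(0))
```

A larger ε concentrates the probability mass on the highest-utility candidate, while ε = 0 gives a uniform choice that ignores the data entirely.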
Applying Ahp And Clustering Approaches For Public Transportation Decisionmaking: A Case Study Of Isfahan City, Alireza Salavati, Hossein Haghshenas, Bahador Ghadirifaraz, Jamshid Laghaei, Ghodrat Eftekhari
Journal of Public Transportation
The main purpose of this paper is to define appropriate criteria for a systematic approach to simultaneously evaluating and prioritizing multiple candidate corridors for public transport investment to serve travel demand, given the supply of the current public transportation system and the road network conditions of Isfahan, Iran. To optimize resource allocation, policymakers need to identify proper corridors in which to implement a public transportation system. The main question is how to select the best public transportation system for each main corridor of Isfahan. In this regard, 137 questionnaires were completed by experts, directors, and policymakers of Isfahan to identify goals and objectives in …
Semi-Automated Tool For Providing Effective Feedback On Programming Assignments, Min Yan Beh, Swapna Gottipati, David Lo, Venky Shankararaman
Research Collection School Of Computing and Information Systems
Human grading of introductory programming assignments is tedious and error-prone, hence researchers have attempted to develop tools that support automatic assessment of programming code. However, most such efforts focus only on scoring solutions, rather than assessing whether students correctly understand the problems. Effective feedback on programming assignments plays an important role in helping students improve their programming skills, but generating individual feedback is a tedious and painstaking process. We present a tool that not only automatically generates static and dynamic program analysis outcomes, but also clusters similar code submissions to provide scalable and effective feedback to students. We studied …
Analyzing And Organizing The Sonic Space Of Vocal Imitation, Davide Andrea Mauro Phd, D. Rocchesso
Davide Andrea Mauro
The sonic space that can be spanned with the voice is vast and complex and, therefore, difficult to organize and explore. In order to devise tools that facilitate sound design by vocal sketching, we attempt to organize a database of short excerpts of vocal imitations. By clustering the sound samples in a space whose dimensionality has been reduced to the two principal components, we check experimentally how meaningful the resulting clusters are for humans. Eventually, a representative of each cluster, chosen to be close to its centroid, may serve as a landmark in the exploration of the …
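The idea of picking a cluster representative close to the centroid can be sketched as follows. This is an illustrative sketch only: it assumes the samples have already been projected to two dimensions and assigned cluster labels by some earlier step, and the coordinates and labels are made up.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def cluster_landmarks(points, labels):
    """Map each cluster label to the index of the sample nearest its centroid."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    landmarks = {}
    for label, members in clusters.items():
        c = centroid([points[i] for i in members])
        # The landmark is an actual sample, not the (possibly unoccupied) centroid.
        landmarks[label] = min(members, key=lambda i: math.dist(points[i], c))
    return landmarks


# Hypothetical 2-D coordinates (e.g. after a projection) and hypothetical labels.
points = [(0.0, 0.0), (1.0, 0.0), (0.4, 0.1), (5.0, 5.0), (6.0, 5.0), (5.6, 5.2)]
labels = [0, 0, 0, 1, 1, 1]

landmarks = cluster_landmarks(points, labels)
```

Returning a sample index rather than the centroid itself matters here, since a landmark has to be a real item from the database that can be played back.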
Shape Analysis Of Traffic Flow Curves Using A Hybrid Computational Analysis, Wasim Irshad Kayani, Shikhar P. Acharya, Ivan G. Guardiola, Donald C. Wunsch, B. Schumacher, Isaac Wagner-Muns
Engineering Management and Systems Engineering Faculty Research & Creative Works
This paper highlights and validates the use of shape analysis using Mathematical Morphology (MM) tools as a means to develop meaningful clustering of historical data. Furthermore, clustering yields more appropriate groupings, which can result in better parameterization or estimation of models. This results in more effective prediction model development. Hence, in an effort to highlight this within the research herein, a Back-Propagation Neural Network is used to validate the classification achieved through the employment of MM tools. Specifically, the Granulometric Size Distribution (GSD) is used to achieve clustering of daily traffic flow patterns based solely on their …
A Computational Framework For Learning From Complex Data: Formulations, Algorithms, And Applications, Wenlu Zhang
Computer Science Theses & Dissertations
Many real-world processes are dynamically changing over time. As a consequence, the observed complex data generated by these processes also evolve smoothly. For example, in computational biology, the expression data matrices are evolving, since gene expression controls are deployed sequentially during development in many biological processes. Investigations into the spatial and temporal gene expression dynamics are essential for understanding the regulatory biology governing development. In this dissertation, I mainly focus on two types of complex data: genome-wide spatial gene expression patterns in the model organism fruit fly and Allen Brain Atlas mouse brain data. I provide a framework to explore …
Optimizing Main Memory Usage In Modern Computing Systems To Improve Overall System Performance, Daniel Jose Campello
FIU Electronic Theses and Dissertations
Operating systems use fast, CPU-addressable main memory to maintain an application’s temporary data as anonymous data and to cache copies of persistent data stored in slower block-based storage devices. However, the use of this faster memory comes at a high cost. Therefore, several techniques have been proposed in the literature to use main memory more efficiently. In this dissertation we introduce three distinct approaches to improve overall system performance by optimizing main memory usage.
First, DRAM and host-side caching of file system data are used for speeding up virtual machine performance in today’s virtualized data centers. The clustering of VM …
Statistical Modeling Of Carbon Dioxide And Cluster Analysis Of Time Dependent Information: Lag Target Time Series Clustering, Multi-Factor Time Series Clustering, And Multi-Level Time Series Clustering, Doo Young Kim
USF Tampa Graduate Theses and Dissertations
The current study consists of three major parts: statistical modeling, the connection between statistical modeling and cluster analysis, and new methods for clustering time-dependent information.
First, we perform statistical modeling of carbon dioxide (CO2) emissions in South Korea in order to identify the attributable variables, including interaction effects. One of the pressing issues on Earth in the 21st century is global warming, which is caused by the interplay between atmospheric temperature and CO2 in the atmosphere. When we confront this global problem, we first need to verify what causes the problem, then we …
Efficient Algorithms For Clustering Polygonal Obstacles, Sabbir Kumar Manandhar
UNLV Theses, Dissertations, Professional Papers, and Capstones
Clustering a set of points in Euclidean space is a well-known problem with applications in pattern recognition, document image analysis, big-data analytics, and robotics. While there are many research publications on clustering point objects, only a few articles have addressed clustering a given distribution of obstacles. In this thesis we examine the development of efficient algorithms for clustering a given set of convex obstacles in the 2D plane. One of the methods presented in this work uses a Voronoi diagram to extract obstacle clusters. We also consider the implementation issues of point/obstacle clustering algorithms.
Confirm: Clustering Of Noisy Form Images Using Robust Matching, Christopher Alan Tensmeyer
Theses and Dissertations
Identifying the type of a scanned form greatly facilitates processing, including automated field segmentation and field recognition. Contrary to the majority of existing techniques, we focus on unsupervised type identification, where the set of form types is not known a priori, and on noisy collections that contain very similar document types. This work presents a novel algorithm: CONFIRM (Clustering Of Noisy Form Images using Robust Matching), which simultaneously discovers the types in a collection of forms and assigns each form to a type. CONFIRM matches type-set text and rule lines between forms to create domain specific features, which we show outperform …
Variance Of Clusterings On Graphs, Thomas Vlado Mulc
Mathematical Sciences Technical Reports (MSTR)
Graphs that represent data often have structures or characteristics that can represent some relationships in the data. One such structure is the cluster, or community structure. Most clustering algorithms for graphs are deterministic, which means they will output the same clustering each time. We investigate a few stochastic algorithms and examine the consistency of their clusterings.
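One way to quantify how consistently a stochastic algorithm clusters the same vertices is to compare repeated runs pairwise. The sketch below uses the Rand index for that purpose; this is an assumption chosen for illustration, not necessarily the measure used in the report, and the three runs are made-up label vectors.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree (same vs. different cluster)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("clusterings must label the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def mean_pairwise_agreement(runs):
    """Average Rand index over all pairs of runs; 1.0 means every run gave the same partition."""
    scores = [rand_index(a, b) for a, b in combinations(runs, 2)]
    return sum(scores) / len(scores)


# Hypothetical outputs of three runs of a stochastic clustering algorithm on four vertices.
runs = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same partition as the first run, only the label names differ
    [0, 0, 0, 1],
]

consistency = mean_pairwise_agreement(runs)
```

Because the index compares pairs of items rather than label values, relabeling the clusters between runs does not count as a disagreement.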
Unsupervised Learning Framework For Large-Scale Flight Data Analysis Of Cockpit Human Machine Interaction Issues, Abhishek B. Vaidya
Open Access Theses
As the level of automation within an aircraft increases, the interactions between the pilot and autopilot play a crucial role in its proper operation. Issues with human machine interactions (HMI) have been cited as one of the main causes behind many aviation accidents. Due to the complexity of such interactions, it is challenging to identify all possible situations and develop the necessary contingencies. In this thesis, we propose a data-driven analysis tool to identify potential HMI issues in large-scale Flight Operational Quality Assurance (FOQA) datasets. The proposed tool is developed using a multi-level clustering framework, where a set of basic …
Macroconstants Of Development: A New Benchmark For The Strategic Development Of Advanced Countries And Firms, Andrey Bystrov, Vyacheslav Yusim, Tamilla Curtis
Dr. Tamilla Curtis
This research proposed a new indicator of countries’ development called “macroconstants of development”. The literature review indicates that the concept of “macroconstants of development” is currently used in neither the theory nor the practice of industrial policy. Longitudinal data on total GDP, GDP per capita, and their derivatives were studied for most countries of the world. The statistical information was analyzed using econometric methods.
Based on the analysis of the statistical data, which characterizes the development of large, technologically advanced countries in ordinary conditions, it was identified that the average acceleration …
Increment - Interactive Cluster Refinement, Logan Adam Mitchell
Theses and Dissertations
We present INCREMENT, a cluster refinement algorithm which utilizes user feedback to refine clusterings. INCREMENT is capable of improving clusterings produced by arbitrary clustering algorithms. The initial clustering provided is first sub-clustered to improve query efficiency. A small set of select instances from each of these sub-clusters is presented to a user for labelling. Utilizing the user feedback, INCREMENT trains a feature embedder to map the input features to a new feature space. This space is learned such that spatial distance is inversely correlated with semantic similarity, determined from the user feedback. A final clustering is then formed in the …
Semantics And Result Disambiguation For Keyword Search On Tree Data, Cem Aksoy
Dissertations
Keyword search is a popular technique for searching tree-structured data (e.g., XML, JSON) on the web because it frees the user from learning a complex query language and the structure of the data sources. However, the convenience of keyword search comes with drawbacks. The imprecision of the keyword queries usually results in a very large number of results of which only very few are relevant to the query. Multiple previous approaches have tried to address this problem. Some of them exploit structural and semantic properties of the tree data in order to filter out irrelevant results while others use a …
Spatial Analysis Of Forest Crimes In Mark Twain National Forest, Missouri, Karun Pandit, Eddie Bevilacqua, Giorgos Mountrakis, Robert W. Malmsheimer
Journal of Geospatial Applications in Natural Resources
Forest crime mitigation has been identified as a challenging issue in forest management in the United States. Knowledge of the spatial pattern of forest crimes would help in wisely allocating limited enforcement resources to curb forest crimes. This study explores the spatial pattern of three different types of forest crimes: fire crime, illegal timber logging crime, and occupancy use crime in the Salem-Patosi Ranger District of Mark Twain National Forest. Univariate and bivariate Ripley’s K-functions were applied to explore the spatial patterns in crime events, like clustering and attraction among forest crime types. Results reveal significant clustering for each forest …
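For readers unfamiliar with Ripley's K-function, a naive univariate estimate can be sketched as below. It assumes one common normalization, K(r) = A / (n(n-1)) times the number of ordered pairs within distance r, and applies no edge correction, which real analyses normally include. The event coordinates and the 10 × 10 study window are hypothetical and are not data from the study.

```python
import math


def ripley_k(points, r, area):
    """Naive univariate Ripley's K estimate at distance r, without edge correction."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    # Count ordered pairs (i, j) with i != j that lie within distance r of each other.
    close_pairs = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= r
    )
    return area * close_pairs / (n * (n - 1))


def csr_k(r):
    """Theoretical K under complete spatial randomness in the plane: pi * r^2."""
    return math.pi * r * r


# Hypothetical event locations: one tight group plus two isolated events
# inside a hypothetical 10 x 10 study window (area 100).
events = [(1.0, 1.0), (1.2, 1.1), (0.9, 1.3), (1.1, 0.8), (8.0, 8.5), (5.0, 2.0)]

k_hat = ripley_k(events, r=1.0, area=100.0)
```

An estimate well above π r² at a given distance, as with this made-up pattern, suggests clustering at that scale, while values below it suggest regular spacing. Significance is usually judged against simulation envelopes rather than by direct comparison.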
Forecasting Customer Electricity Load Demand In The Power Trading Agent Competition Using Machine Learning, Saiful Abu
Open Access Theses & Dissertations
Accurate electricity load demand forecasting is an important problem in managing the power grid for both economic and environmental reasons. The Power TAC simulation provides a platform to do research on smart grid energy generation and distribution systems. Brokers are the focus of the design task posed to developers by the system. The brokers work as self-interested entities that try to maximize profits by trading electricity across multiple markets. To be successful, a broker has to forecast the electricity demand for customers as accurately as possible so it can use this information to operate efficiently. My proposed forecasting method uses …
Enscat: Clustering Of Categorical Data Via Ensembling, Bertrand S. Clarke, Saeid Amiri, Jennifer L. Clarke
Department of Statistics: Faculty Publications
Background: Clustering is a widely used collection of unsupervised learning techniques for identifying natural classes within a data set. It is often used in bioinformatics to infer population substructure. Genomic data are often categorical and high dimensional, e.g., long sequences of nucleotides. This makes inference challenging: The distance metric is often not well-defined on categorical data; running time for computations using high dimensional data can be considerable; and the Curse of Dimensionality often impedes the interpretation of the results. Up to the present, however, the literature and software addressing clustering for categorical data has not yet led to a standard …
Topological Data Analysis For Systems Of Coupled Oscillators, Alec Dunton
HMC Senior Theses
Coupled oscillators, such as groups of fireflies or clusters of neurons, are found throughout nature and are frequently modeled in the applied mathematics literature. Earlier work by Kuramoto, Strogatz, and others has led to a deep understanding of the emergent behavior of systems of such oscillators using traditional dynamical systems methods. In this project we outline the application of techniques from topological data analysis to understanding the dynamics of systems of coupled oscillators. This includes the examination of partitions, partial synchronization, and attractors. By looking for clustering in a data space consisting of the phase change of oscillators over a …
A Hybrid Approach To Semantic Hashtag Clustering In Social Media, Ali Javed
Graduate College Dissertations and Theses
The uncontrolled usage of hashtags in social media makes them vary a lot in the quality of semantics and the frequency of usage. Such variations pose a challenge to the current approaches which capitalize on either the lexical semantics of a hashtag by using metadata or the contextual semantics of a hashtag by using the texts associated with a hashtag. This thesis presents a hybrid approach to clustering hashtags based on their semantics, designed in two phases. The first phase is a sense-level metadata-based semantic clustering algorithm that has the ability to differentiate among distinct senses of a hashtag as …
Registration And Clustering Of Functional Observations, Zizhen Wu
Theses and Dissertations
Classifying curves of similar shape into groups, which we call clustering of functional data, is an important exploratory analysis. Phase variations or time distortions are often encountered in biological processes, such as growth patterns or gene profiles. As a result of time distortion, curves of similar shape may not be aligned. Regular clustering methods for functional data usually ignore the presence of phase variations, which may result in low clustering accuracy. However, it is difficult to account for phase variation without knowing the cluster structure.
In this dissertation, we first propose a Bayesian method that simultaneously clusters …
Macroconstants Of Development: A New Benchmark For The Strategic Development Of Advanced Countries And Firms, Andrey V. Bystrov, Vyacheslav N. Yusim, Tamilla Curtis
Publications
This research proposed a new indicator of countries’ development called “macroconstants of development”. The literature review indicates that the concept of “macroconstants of development” is currently used in neither the theory nor the practice of industrial policy. Longitudinal data on total GDP, GDP per capita, and their derivatives were studied for most countries of the world. The statistical information was analyzed using econometric methods.
Based on the analysis of the statistical data, which characterizes the development of large, technologically advanced countries in ordinary conditions, it was identified that the average acceleration …