Open Access. Powered by Scholars. Published by Universities.®
- Institution
- China Simulation Federation (7)
- MBZUAI (2)
- College of Saint Benedict and Saint John's University (1)
- Dartmouth College (1)
- James Madison University (1)
- Louisiana Tech University (1)
- Missouri State University (1)
- New Jersey Institute of Technology (1)
- Selected Works (1)
- Singapore Management University (1)
- University of Louisville (1)
- University of Massachusetts Amherst (1)
- University of Missouri, St. Louis (1)
- University of Rhode Island (1)
- Washington University in St. Louis (1)
- West Virginia University (1)
- Western University (1)
- Wilfrid Laurier University (1)
- Publication
- Journal of System Simulation (7)
- Doctoral Dissertations (2)
- All College Thesis Program, 2016-2019 (1)
- Computer Science Senior Theses (1)
- Computer Vision Faculty Publications (1)
- Dissertations (1)
- Electronic Theses and Dissertations (1)
- Electronic Thesis and Dissertation Repository (1)
- Graduate Theses, Dissertations, and Problem Reports (1)
- MSU Graduate Theses (1)
- Machine Learning Faculty Publications (1)
- Martin Masek (1)
- McKelvey School of Engineering Theses & Dissertations (1)
- Research Collection School Of Computing and Information Systems (1)
- Senior Honors Projects (1)
- Senior Honors Projects, 2020-current (1)
- Theses (1)
- Theses and Dissertations (Comprehensive) (1)
Articles 1 - 25 of 25
Full-Text Articles in Computer Sciences
Spatio-Temporal Association Rule Mining Of Traffic Congestion In A Large-Scale Road Network Based On Trajectory Data, Qifan Zhou, Haixu Liu, Zhipeng Dong, Yin Xu
Journal of System Simulation
Abstract: A K neighbor-RElim (KNR) algorithm and a sequential KNbr-RElim (SKNR) algorithm are proposed to mine traffic congestion association rules and congestion propagation spatio-temporal association rules from vehicle trajectory data in a large-scale road network. The KNR algorithm adds a spatial topology constraint to the RElim algorithm. KNR can be used to mine the road links prone to congestion from a large-scale trajectory dataset and to quantify the strength of association between congested road links. The SKNR algorithm expands the time dimension in the form of a sliding window and can be applied for mining …
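As a rough illustration of what "strength of association" between congested links can mean, the sketch below computes plain support and confidence for pairs of items. It is not the KNR or SKNR algorithm and applies no spatial or temporal constraint; the function name `pair_rules`, the link names, and the sample windows are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5):
    """Return {(a, b): (support, confidence)} for pairwise rules a -> b.

    support    = fraction of transactions containing both a and b
    confidence = count(a and b) / count(a)
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # drop infrequent pairs
        rules[(a, b)] = (support, count / item_counts[a])
        rules[(b, a)] = (support, count / item_counts[b])
    return rules


# Hypothetical data: each set lists the road links congested in one time window.
windows = [
    {"link_A", "link_B"},
    {"link_A", "link_B", "link_C"},
    {"link_A"},
    {"link_B", "link_C"},
]
rules = pair_rules(windows, min_support=0.5)
for (a, b), (support, confidence) in sorted(rules.items()):
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

A pattern-growth miner such as RElim finds the same frequent itemsets without enumerating every pair, which matters at the scale of a full road network.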
Cannabidiol Tweet Miner: A Framework For Identifying Misinformation In Cbd Tweets., Jason Turner
Electronic Theses and Dissertations
As regulations surrounding cannabis continue to develop, the demand for cannabis-based products is on the rise. Despite not producing the psychoactive effects commonly associated with THC, products containing cannabidiol (CBD) have gained immense popularity in recent years as a potential treatment option for a range of conditions, particularly those associated with pain or sleep disorders. However, due to current federal policies, these products have yet to undergo comprehensive safety and efficacy testing. Fortunately, utilizing advanced natural language processing (NLP) techniques, data harvested from social networks have been employed to investigate various social trends within healthcare, such as disease tracking and …
Design And Analysis Of Strategic Behavior In Networks, Sixie Yu
McKelvey School of Engineering Theses & Dissertations
Networks permeate every aspect of our social and professional life. A networked system with strategic individuals can represent a variety of real-world scenarios with socioeconomic origins. In such a system, the individuals' utilities are interdependent: one individual's decision influences the decisions of others and vice versa. In order to gain insights into the system, the highly complicated interactions necessitate some level of abstraction. To capture the otherwise complex interactions, I use a game-theoretic model called the Networked Public Goods (NPG) game. I develop a computational framework based on NPGs to understand strategic individuals' behavior in networked systems. The framework consists of three …
Design Demand Trend Acquisition Method Based On Short Text Mining Of User Comments In Shopping Websites, Zhiyong Xiong, Zhaoxiong Yan, Huanan Yao, Shangsong Liang
Machine Learning Faculty Publications
To help designers explore the market demand trend of laptops and to establish a better “network users-market feedback mechanism”, we propose a design and research method for a short text mining tool based on the K-means clustering algorithm and the Kano model. An improved short text clustering algorithm is used to extract the design elements of laptops. Based on the traditional questionnaire, we extract the user’s attention factors, score the emotional tendency, and analyze the user’s needs based on the Kano model. Then, we select 10 laptops, process them by the improved algorithm, cluster the evaluation words and …
Subomiembed: Self-Supervised Representation Learning Of Multi-Omics Data For Cancer Type Classification, Sayed Hashim, Muhammad Ali, Karthik Nandakumar, Mohammad Yaqub
Computer Vision Faculty Publications
For personalized medicines, very crucial intrinsic information is present in high dimensional omics data which is difficult to capture due to the large number of molecular features and small number of available samples. Different types of omics data show various aspects of samples. Integration and analysis of multi-omics data give us a broad view of tumours, which can improve clinical decision making. Omics data, mainly DNA methylation and gene expression profiles are usually high dimensional data with a lot of molecular features. In recent years, variational autoencoders (VAE) [13] have been extensively used in embedding image and text data into …
Research On Association Information Mining Of Space Reconnaissance Equipment System Index, Han Chi, Xiong Wei
Journal of System Simulation
Abstract: The system effectiveness and system contribution rate of the Space Reconnaissance Equipment System (SRES) have a large number of mutually associated indicators. Identifying the association relationships, selecting the key indicators, and clarifying the association between core indicators and system contribution rate are key to evaluating system effectiveness and contribution rate. Through the joint simulation of MATLAB and STK, the underlying index data of SRES is obtained. Based on the Frequent Pattern-Tree (FP-Tree) algorithm, the association information is discovered, the redundancy is removed and the type of indicator association is determined, and an optimization model …
Modeling Of Argon Bombardment And Densification Of Low Temperature Organic Precursors Using Reactive Md Simulations And Machine Learning, Kwabena Asante-Boahen
MSU Graduate Theses
In this study, an important aspect of the synthesis process for a-BxC:Hy was systematically modeled by utilizing Reactive Molecular Dynamics (MD) to model the argon bombardment, with orthocarborane molecules as the precursor. The MD simulations are used to assess the dynamics associated with the free radicals that result from the ion bombardment. By applying Data Mining/Machine Learning analysis to the datasets generated from the large reactive MD simulations, I was able to identify and quantify the kinetics of these radicals. Overall, this approach allows for a better understanding of the overall mechanism at the atomistic level of …
Exploring The Use Of Social Media To Infer Relationships Between Demographics, Psychographics And Vaccine Hesitancy, Abhimanyu Kapur
Computer Science Senior Theses
The growing popularity of social media as a platform to obtain information and share one's opinions on various topics makes it a rich source of information for research. In this study, we aimed to develop a framework to infer relationships between demographic and psychographic characteristics of a user and their opinion on a specific narrative - in this case, their stance on taking the COVID-19 vaccine. Twitter was the chosen platform due to the large USA user base and easily available data. Demographic traits included Race, Age, Gender, and Human-vs-Organization Status. Psychographic traits included the Big Five personality traits (Conscientiousness, …
Binary Black Widow Optimization Algorithm For Feature Selection Problems, Ahmed Al-Saedi
Theses and Dissertations (Comprehensive)
This thesis addresses feature selection (FS) problems; FS is a primary stage in data mining. It is a significant pre-processing stage that enhances the performance of the process with regard to computation cost and accuracy, and it offers a better comprehension of stored data by removing unnecessary and irrelevant features from the basic dataset. However, because of the size of the problem, FS is known to be very challenging and has been classified as an NP-hard problem. Traditional methods can only be used to solve small problems. Therefore, metaheuristic algorithms (MAs) are becoming powerful methods for addressing the FS problems. …
Detecting Credit Card Fraud: An Analysis Of Fraud Detection Techniques, William Lovo
Senior Honors Projects, 2020-current
Advancements in the modern age have brought many conveniences, one of those being credit cards. Giving an individual the ability to hold their entire purchasing power in the form of pocket-sized plastic cards has made credit cards the preferred method to complete financial transactions. However, these systems are not infallible and may provide criminals and other bad actors the opportunity to abuse them. Financial institutions and their customers lose billions of dollars every year to credit card fraud. To combat this issue, fraud detection systems are deployed to discover fraudulent activity after it has occurred. Such systems rely on advanced …
Analysis And Optimization Of Combustion Characteristics Of Cement Kiln Cooperatively Disposing Domestic Refuse, Jingbing Wu, Hanqing Tang, Xu Jun
Journal of System Simulation
Abstract: Because the traditional methods can hardly analyze the complex combustion characteristics of cement kiln mixed with domestic refuse, a data mining technology is introduced. A domestic cement plant is selected as the object, and its operating data and relevant parameters are collected. The influence coefficient of each parameter on coal consumption and NOx emission is analyzed by using Stability Selection algorithm. The mathematical model of coal consumption and NOx emission is established with Random Forest algorithm, and the key optimization parameters and their optimal values are obtained by K-means clustering algorithm. The result shows that this method …
Searching For Needles In The Cosmic Haystack, Thomas Ryan Devine
Graduate Theses, Dissertations, and Problem Reports
Searching for pulsar signals in radio astronomy data sets is a difficult task. The data sets are extremely large, approaching the petabyte scale, and are growing larger as instruments become more advanced. Big Data brings with it big challenges. Processing the data to identify candidate pulsar signals is computationally expensive and must utilize parallelism to be scalable. Labeling benchmarks for supervised classification is costly. To compound the problem, pulsar signals are very rare, e.g., only 0.05% of the instances in one data set represent pulsars. Furthermore, there are many different approaches to candidate classification with no consensus on a best …
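The rarity quoted above (0.05% positives) is worth a quick worked example, because it makes plain accuracy nearly meaningless. The sketch below uses a synthetic label vector at that rate and a trivial "always negative" predictor; the function name and data are illustrative assumptions, not taken from the dissertation.

```python
def accuracy_and_recall(labels, predictions):
    """Return (accuracy, recall) for binary labels where 1 marks the rare class."""
    pairs = list(zip(labels, predictions))
    true_pos = sum(1 for y, p in pairs if y == 1 and p == 1)
    true_neg = sum(1 for y, p in pairs if y == 0 and p == 0)
    positives = sum(labels)
    accuracy = (true_pos + true_neg) / len(labels)
    recall = true_pos / positives if positives else 0.0
    return accuracy, recall


# Synthetic data: 5 positives among 10,000 instances, i.e. 0.05% positives.
labels = [1] * 5 + [0] * 9995
always_negative = [0] * len(labels)

acc, rec = accuracy_and_recall(labels, always_negative)
print(f"accuracy={acc:.4f}, recall={rec:.1f}")
```

A predictor that never flags a pulsar scores very high accuracy while finding nothing, which is why imbalanced problems of this kind are usually judged on recall, precision, or related measures instead.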
Energy Efficiency Data Mining And Scheduling Optimization Of Discrete Workshop, Yugu Lin, Wang Yan
Journal of System Simulation
Abstract: This paper addresses the optimization of energy consumption in discrete workshops and establishes the energy efficiency optimization model of discrete workshops. The relationship between data mining and knowledge discovery is established. Through scheduling data preprocessing and C4.5 decision tree learning algorithm, the discovery of scheduling knowledge is realized. Energy efficiency optimization calculation is achieved in discrete workshops by the combination of scheduling knowledge and improved differential evolution algorithm (IDE). By comparing with TLBO, GA and PSO, the feasibility of IDE algorithm is verified.
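C4.5 chooses split attributes by gain ratio, that is, information gain divided by the entropy of the split itself. The sketch below shows that criterion for a single categorical attribute. It is a minimal illustration, not the paper's scheduling pipeline, and the sample attribute values and labels are made up.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(attribute, labels):
    """Gain ratio of splitting `labels` by a categorical `attribute`."""
    n = len(labels)
    groups = {}
    for value, label in zip(attribute, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(attribute)  # penalises many-valued attributes
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical scheduling records: machine type vs. an energy label.
machine = ["lathe", "lathe", "mill", "mill"]
shift = ["day", "night", "day", "night"]
energy = ["low", "low", "high", "high"]

print(gain_ratio(machine, energy))  # attribute separates the labels perfectly
print(gain_ratio(shift, energy))    # attribute carries no information
```

A full C4.5 learner would apply this criterion recursively, handle continuous attributes by thresholding, and prune the resulting tree.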
Feature Space Modeling For Accurate And Efficient Learning From Non-Stationary Data, Ayesha Akter
Doctoral Dissertations
A non-stationary dataset is one whose statistical properties such as the mean, variance, correlation, probability distribution, etc. change over a specific interval of time. On the contrary, a stationary dataset is one whose statistical properties remain constant over time. Apart from the volatile statistical properties, non-stationary data poses other challenges such as time and memory management due to the limitation of computational resources mostly caused by the recent advancements in data collection technologies which generate a variety of data at an alarming pace and volume. Additionally, when the collected data is complex, managing data complexity, emerging from its dimensionality and …
Statistical Machine Learning Methods For Mining Spatial And Temporal Data, Fei Tan
Dissertations
Spatial and temporal dependencies are ubiquitous properties of data in numerous domains. The popularity of spatial and temporal data mining has thus grown with the increasing prevalence of massive data. The presence of spatial and temporal attributes not only provides complementary useful perspectives, but also poses new challenges to the representation and integration into the learning procedure. In this dissertation, the involved spatial and temporal dependencies are explored with three genres: sample-wise, feature-wise, and target-wise. A family of novel methodologies is developed accordingly for the dependency representation in respective scenarios.
First, dependencies among discrete, continuous and repeated observations are studied …
Optimization Of Material Release For Printed Circuit Board Template Based On Data Mining, Shengping Lü, Qiangsheng Yue, Liu Tao
Journal of System Simulation
Abstract: Data mining was employed for the optimization of material release of PCB (Printed Circuit Board) template. PCB scrap ratio related parameters were specified and prediction model variables were chosen according to hypothesis tests. Multiple linear regression (MLR), chi-squared automatic interaction detector, artificial neural network and support vector machine approaches were employed for the prediction of scrap ratio. Evaluation indicators called superfluous ratio, supplement release ratio and the weighted sum of the two were presented; the material release simulation was conducted, the four approaches were compared, and MLR was taken as the preferred one. Adjust coefficient …
Mining And Validation Of Attacking Behavior In The Robocup 2d Simulation, Chen Bing, Zhang Heng, Zekai Cheng, Dong Peng, Lin Chao
Journal of System Simulation
Abstract: Robocup is an international academic competition which focuses on artificial intelligence and robotics. The 2D simulation is one of the earliest and most influential projects in Robocup. Attacking is the core behaviour of the simulated football game, and attack recognition is considered an important part of team confrontations. This paper selects some activity and contribution indices of attacking, extracts a large amount of attacking behaviour data from the key agents, and proposes two kinds of attacking patterns in 2D simulation, ‘separate attack’ and ‘cooperative attack’, according to human-player actions. The following simulation tests give the accuracy of …
Clustering Method Based On Graph Data Model And Reliability Detection, Yanyun Cheng, Huisong Bian, Changsheng Bian
Journal of System Simulation
Abstract: For data in feature space, traditional clustering algorithms can perform cluster analysis directly. However, clustering results for high-dimensional spatial data cannot be visualized intuitively and effectively in a 2D plane. Graph data can clearly reflect the similarity relationship between objects. According to the distance between data objects, the feature space data are modeled as graph data by iteration. Cluster analysis based on modularity is carried out on the modeled graph data. The two-dimensional visualization of non-spherical-shape distribution data clusters and results is achieved. The concept of credibility of the clustering result is proposed, and a method is proposed, …
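To make "cluster analysis based on modularity" concrete, the sketch below scores a given partition of an unweighted, undirected graph with the standard Newman modularity, Q = Σ_c [ L_c / m − (d_c / 2m)² ], where L_c is the number of edges inside community c, d_c is the sum of degrees in c, and m is the total edge count. The abstract does not state which modularity variant or optimizer the paper uses, so this is an assumption, and the toy graph is invented for illustration.

```python
def modularity(edges, community_of):
    """Newman modularity of a partition of an unweighted, undirected graph.

    edges        : list of (u, v) pairs, each undirected edge listed once
    community_of : dict mapping every node to its community label
    """
    m = len(edges)
    internal = {}    # edges with both endpoints in the same community
    degree_sum = {}  # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

q_split = modularity(edges, two_groups)
q_merged = modularity(edges, one_group)
print(q_split, q_merged)
```

Splitting the toy graph at the bridge scores higher than lumping all nodes together, which is the signal a modularity-maximizing clustering method follows.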
Efficient Reduced Bias Genetic Algorithm For Generic Community Detection Objectives, Aditya Karnam Gururaj Rao
Theses
The problem of community structure identification has been an extensively investigated area for biology, physics, social sciences, and computer science in recent years for studying the properties of networks representing complex relationships. Most traditional methods, such as K-means and hierarchical clustering, are based on the assumption that communities have spherical configurations. Lately, Genetic Algorithms (GA) are being utilized for efficient community detection without imposing sphericity. GAs are machine learning methods which mimic natural selection and scale with the complexity of the network. However, traditional GA approaches employ a representation method that dramatically increases the solution space to be searched by …
The Algorithmic Composition Of Classical Music Through Data Mining, Tom Donald Richmond, Imad Rahal
All College Thesis Program, 2016-2019
The desire to teach a computer how to algorithmically compose music has been a topic in the world of computer science since the 1950s, with roots of computer-less algorithmic composition dating back to Mozart himself. One limitation of algorithmically composing music has been the difficulty of eliminating the human intervention required to achieve a musically homogeneous composition. We attempt to remedy this issue by teaching a computer how the rules of composition differ between the six distinct eras of classical music by having it examine a dataset of musical scores, rather than explicitly telling the computer the formal rules of …
Detecting Anomalously Similar Entities In Unlabeled Data, Lisa D. Friedland
Doctoral Dissertations
In this work, the goal is to detect closely-linked entities within a data set. The entities of interest have a tie causing them to be similar, such as a shared origin or a channel of influence. Given a collection of people or other entities with their attributes or behavior, we identify unusually similar pairs, and we pose the question: Are these two people linked, or can their similarity be explained by chance? Computing similarities is a core operation in many domains, but two constraints differentiate our version of the problem. First, the score assigned to a pair should account for …
Clustering-Based Personalization, Seyed Nima Mirbakhsh
Electronic Thesis and Dissertation Repository
Recommendation systems have been among the fastest-emerging technologies of the last decade and a key part of the e-commerce ecosystem. Businesses offer a wide variety of items and content through different channels such as the Internet, smart TVs, digital screens, etc. For some businesses, the number of these items exceeds millions. Therefore, users can have trouble finding the products that they are looking for. Recommendation systems address this problem by providing powerful methods which enable users to filter through large information and product spaces based on their preferences. Moreover, users have different preferences. Thus, businesses can employ recommendation …
Rough-Fuzzy Hybrid Approach For Identification Of Bio-Markers And Classification On Alzheimer's Disease Data, Changsu Lee, Chiou-Peng Lam, Martin Masek
Martin Masek
A new approach is proposed in this paper for identification of biomarkers and classification on Alzheimer's disease data by employing a rough-fuzzy hybrid approach called ARFIS (a framework for Adaptive TS-type Rough-Fuzzy Inference Systems). In this approach, the entropy-based discretization technique is employed first on the training data to generate clusters for each attribute with respect to the output information. The rough set-based feature reduction method is then utilized to reduce the number of features in a decision table obtained using the cluster information. Another rough set-based approach is employed for the generation of decision rules. After the construction and …
A Continuous Learning Strategy For Self-Organizing Maps Based On Convergence Windows, Gregory T. Breard
Senior Honors Projects
A self-organizing map (SOM) is a type of artificial neural network that has applications in a variety of fields and disciplines. The SOM algorithm uses unsupervised learning to produce a low-dimensional representation of high-dimensional data. This is done by 'fitting' a grid of nodes to a data set over a fixed number of iterations. With each iteration, the nodes of the map are adjusted so that they appear more like the data points. The low dimensionality of the resulting map means that it can be presented graphically and be more intuitively interpreted by humans. However, it is still essential to …
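The fixed-iteration fitting loop described above can be sketched in a few lines. This is a textbook SOM with a Gaussian neighbourhood and linearly decaying learning rate and radius, not the convergence-window strategy the thesis proposes; the function names, default parameters, and sample data are assumptions chosen for illustration.

```python
import math
import random


def train_som(data, rows, cols, iterations, lr0=0.5, sigma0=1.0, seed=0):
    """Fit a rows x cols grid of weight vectors to `data` (classic online SOM)."""
    rng = random.Random(seed)
    dim = len(data[0])
    nodes = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for t in range(iterations):
        frac = t / iterations
        lr = lr0 * (1 - frac)                # learning rate decays to zero
        sigma = sigma0 * (1 - frac) + 1e-3   # neighbourhood radius shrinks
        x = rng.choice(data)
        # Best matching unit: the node whose weights are closest to x.
        bmu = min(
            nodes,
            key=lambda k: sum((w - xi) ** 2 for w, xi in zip(nodes[k], x)),
        )
        for (r, c), weights in nodes.items():
            grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
            influence = math.exp(-grid_d2 / (2 * sigma ** 2))
            for i in range(dim):
                weights[i] += lr * influence * (x[i] - weights[i])
    return nodes


def quantization_error(data, nodes):
    """Mean distance from each data point to its closest node."""
    return sum(min(math.dist(x, w) for w in nodes.values()) for x in data) / len(data)


# Hypothetical 2-D data forming two small clusters.
data = [(0.1, 0.1), (0.15, 0.1), (0.1, 0.2), (0.9, 0.9), (0.85, 0.9), (0.9, 0.8)]
untrained = train_som(data, 2, 2, iterations=0)
trained = train_som(data, 2, 2, iterations=500)
print(quantization_error(data, untrained), quantization_error(data, trained))
```

Quantization error is one common way to judge how well the map has fitted the data; with a fixed iteration count, nothing in this loop checks whether the map has actually converged.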
Disclosing Climate Change Patterns Using An Adaptive Markov Chain Pattern Detection Method, Zhaoxia Wang, Gary Lee, Hoong Maeng Chan, Reuben Li, Xiuju Fu, Rick Goh, Pauline A. W. Poh Kim, Martin L. Hibberd, Hoong Chor Chin
Research Collection School Of Computing and Information Systems
This paper proposes an adaptive Markov chain pattern detection (AMCPD) method for disclosing the climate change patterns of Singapore through meteorological data mining. Meteorological variables, including daily mean temperature, mean dew point temperature, mean visibility, mean wind speed, maximum sustained wind speed, maximum temperature and minimum temperature are simultaneously considered for identifying climate change patterns in this study. The results depict various weather patterns from 1962 to 2011 in Singapore, based on the records of the Changi Meteorological Station. Different scenarios with varied cluster thresholds are employed for testing the sensitivity of the proposed method. The robustness of the proposed …