Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Data mining

2021

Articles 1 - 14 of 14

Full-Text Articles in Computer Sciences

Messiness: Automating IoT Data Streaming Spatial Analysis, Christopher White, Atilio Barreda II Dec 2021

Publications and Research

The spaces we live in go through many transformations over the course of a year, a month, or a day; my room has seen tremendous clutter and pristine order within the span of a few hours. My goal is to discover patterns within my space and formulate an understanding of the changes that occur. This insight will provide actionable direction for maintaining a cleaner environment, as well as some information about the optimal times for productivity and energy preservation.

Using a Raspberry Pi, I will set up automated image capture in a room in my home. These images will …


Data-Driven Operational And Safety Analysis Of Emerging Shared Electric Scooter Systems, Qingyu Ma Dec 2021

Computational Modeling & Simulation Engineering Theses & Dissertations

The rapid rise of shared electric scooter (E-Scooter) systems offers many urban areas a new micro-mobility solution. The portable and flexible characteristics have made E-Scooters a competitive mode for short-distance trips. Compared to other modes such as bikes, E-Scooters allow riders to freely ride on different facilities such as streets, sidewalks, and bike lanes. However, sharing lanes with vehicles and other users tends to cause safety issues for riding E-Scooters. Conventional methods are often not applicable for analyzing such safety issues because well-archived historical crash records are not commonly available for emerging E-Scooters.

Perceiving the growth of such a micro-mobility …


Improving Accurate Candidates For Missing Data Using Benefit Performance Of (ML-SOM), Abeer Abdullah Al-Mohdar, Mohamed Abdullah Bamatraf Nov 2021

Hadhramout University Journal of Natural & Applied Sciences

Missing data is one of the major challenges in extracting and analyzing knowledge from datasets. The quality of training is affected by the presence of missing data in a dataset. For this reason, there is a need for a quick and reliable method to find possible solutions in order to provide an accurate system. Previous studies demonstrated the robust ability of the Self-Organizing Map (SOM) algorithm to deal with missing values [6, 20]. However, it has drawbacks, such as an error rate (ERR) in the missing values that increases with huge datasets. This study is mainly based on …


Research On Association Information Mining Of Space Reconnaissance Equipment System Index, Han Chi, Xiong Wei Oct 2021

Journal of System Simulation

Abstract: The system effectiveness and system contribution rate of the Space Reconnaissance Equipment System (SRES) involve a large number of mutually associated indicators. Identifying the association relationships, selecting the key indicators, and clarifying the association between core indicators and the system contribution rate are key to evaluating system effectiveness and contribution rate. Through joint simulation in MATLAB and STK, the underlying index data of SRES are obtained. Based on the Frequent Pattern-Tree (FP-Tree) algorithm, the association information is discovered, the redundancy is removed and the type of indicator association is determined, and an optimization model …


Tweet-To-Act: Towards Tweet-Mining Framework For Extracting Terrorist Attack-Related Information And Reporting, Farkhund Iqbal, Rabia Batool, Benjamin C. M. Fung, Saiqa Aleem, Ahmed Abbasi, Abdul Rehman Javed Aug 2021

All Works

The widespread popularity of social networking is leading to the adoption of Twitter as an information dissemination tool. Existing research has shown that information dissemination over Twitter has a much broader reach than traditional media and can be used for effective post-incident measures. People use informal language on Twitter, including acronyms, misspelled words, synonyms, transliteration, and ambiguous terms. This makes incident-related information extraction a non-trivial task. However, this information can be valuable for public safety organizations that need to respond in an emergency. This paper proposes an early event-related information extraction and reporting framework that monitors Twitter streams, synthesizes event-specific …


Characterizing Search Activities On Stack Overflow, Jiakun Liu, Sebastian Baltes, Christoph Treude, David Lo, Yun Zhang, Xin Xia Aug 2021

Research Collection School Of Computing and Information Systems

To solve programming issues, developers commonly search on Stack Overflow to seek potential solutions. However, there is a gap between the knowledge developers are interested in and the knowledge they are able to retrieve using search engines. To help developers efficiently retrieve relevant knowledge on Stack Overflow, prior studies proposed several techniques to reformulate queries and generate summarized answers. However, few studies performed a large-scale analysis using real-world search logs. In this paper, we characterize how developers search on Stack Overflow using such logs. By doing so, we identify the challenges developers face when searching on Stack Overflow and seek …


Modeling Of Argon Bombardment And Densification Of Low Temperature Organic Precursors Using Reactive MD Simulations And Machine Learning, Kwabena Asante-Boahen Aug 2021

MSU Graduate Theses

In this study, an important aspect of the synthesis process for a-BxC:Hy was systematically modeled by using Reactive Molecular Dynamics (MD) to model argon bombardment, with orthocarborane molecules as the precursor. The MD simulations are used to assess the dynamics associated with the free radicals that result from the ion bombardment. By applying data mining and machine learning analysis to the datasets generated from the large reactive MD simulations, I was able to identify and quantify the kinetics of these radicals. Overall, this approach allows for a better understanding of the mechanism at the atomistic level of …


Exploring The Use Of Social Media To Infer Relationships Between Demographics, Psychographics And Vaccine Hesitancy, Abhimanyu Kapur Jun 2021

Computer Science Senior Theses

The growing popularity of social media as a platform to obtain information and share one's opinions on various topics makes it a rich source of information for research. In this study, we aimed to develop a framework to infer relationships between demographic and psychographic characteristics of a user and their opinion on a specific narrative - in this case, their stance on taking the COVID-19 vaccine. Twitter was the chosen platform due to the large USA user base and easily available data. Demographic traits included Race, Age, Gender, and Human-vs-Organization Status. Psychographic traits included the Big Five personality traits (Conscientiousness, …


Mining Subgroups From Temporal Data: From The Parts To The Whole, Alexander Gorovits May 2021

Legacy Theses & Dissertations (2009 - 2024)

A variety of dynamic systems can be broken down into potentially overlapping subcomponents with varying temporal behavior, ranging from communities in networks, to clusters of trajectories in spatiotemporal data, to co-evolving subsets within multivariate time series. Using explicit regularization on various temporal behaviors within a tensor factorization framework, I demonstrate means to mine these subgroups along with their temporal activities, as well as how that yields information about the overall systems. Additionally, I adapt this notion of temporal communities to the spatiotemporal setting to develop a reinforcement learning approach for optimizing coordinated communication between independent agents.


Robust Inference Of Kinase Activity Using Functional Networks, Serhan Yılmaz, Marzieh Ayati, Daniela Schlatzer, A. Ercüment Çiçek, Mark A. Chance, Mehmet Koyutürk Feb 2021

Computer Science Faculty Publications and Presentations

Mass spectrometry enables high-throughput screening of phosphoproteins across a broad range of biological contexts. When complemented by computational algorithms, phospho-proteomic data allows the inference of kinase activity, facilitating the identification of dysregulated kinases in various diseases including cancer, Alzheimer’s disease and Parkinson’s disease. To enhance the reliability of kinase activity inference, we present a network-based framework, RoKAI, that integrates various sources of functional information to capture coordinated changes in signaling. Through computational experiments, we show that phosphorylation of sites in the functional neighborhood of a kinase is significantly predictive of its activity. The incorporation of this knowledge in RoKAI consistently …


Occam Manual, Martin Zwick Jan 2021

Systems Science Faculty Publications and Presentations

Occam is a Discrete Multivariate Modeling (DMM) tool based on the methodology of Reconstructability Analysis (RA). Its typical usage is for analysis of problems involving large numbers of discrete variables. Models are developed which consist of one or more components, which are then evaluated for their fit and statistical significance. Occam can search the lattice of all possible models, or can do detailed analysis on a specific model.

In Variable-Based Modeling (VBM), model components are collections of variables. In State-Based Modeling (SBM), components identify one or more specific states or substates.

Occam provides a web-based interface, which …


Binary Black Widow Optimization Algorithm For Feature Selection Problems, Ahmed Al-Saedi Jan 2021

Theses and Dissertations (Comprehensive)

This thesis addresses feature selection (FS) problems. FS is a primary stage in data mining: a significant pre-processing step that improves computation cost and accuracy and offers a better comprehension of stored data by removing unnecessary and irrelevant features from the basic dataset. However, because of the size of the problem, FS is known to be very challenging and has been classified as an NP-hard problem. Traditional methods can only be used to solve small problems. Therefore, metaheuristic algorithms (MAs) are becoming powerful methods for addressing FS problems. …


A Case Study On Player Selection And Team Formation In Football With Machine Learning, Didem Abidin Jan 2021

Turkish Journal of Electrical Engineering and Computer Sciences

Machine learning has been widely used in different domains to extract information from raw data. Sports has recently become a popular domain for researchers. Although score prediction for matches is the most preferred application area for artificial intelligence, player selection and team formation are also application areas worth working on. There are some studies in the literature about player selection and team formation, which are examined in this study. The study makes two important contributions: the first is to apply seven different machine learning algorithms on our dataset to find the best player combination for …


Toward Tweet-Mining Framework For Extracting Terrorist Attack-Related Information And Reporting, Farkhund Iqbal, Rabia Batool, Benjamin C. M. Fung, Saiqa Aleem, Ahmed Abbasi, Abdul Rehman Javed Jan 2021

All Works

The widespread popularity of social networking is leading to the adoption of Twitter as an information dissemination tool. Existing research has shown that information dissemination over Twitter has a much broader reach than traditional media and can be used for effective post-incident measures. People use informal language on Twitter, including acronyms, misspelled words, synonyms, transliteration, and ambiguous terms. This makes incident-related information extraction a non-trivial task. However, this information can be valuable for public safety organizations that need to respond in an emergency. This paper proposes an early event-related information extraction and reporting framework that monitors Twitter streams, synthesizes event-specific …