Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Articles 1 - 15 of 15

Full-Text Articles in Computer Sciences

Data Mining Approach To The Detection Of Suicide In Social Media: A Case Study Of Singapore, Jane H. K. Seah, Kyong Jin Shim Dec 2018

Research Collection School Of Computing and Information Systems

In this research, we focus on the social phenomenon of suicide. Specifically, we perform social sensing on digital traces obtained from Reddit. We analyze the posts and comments that are related to depression and suicide. We perform natural language processing to better understand different aspects of human life that relate to suicide.


Enhancing Value-Based Healthcare With Reconstructability Analysis: Predicting Cost Of Care In Total Hip Replacement, Cecily Corrine Froemke, Martin Zwick Nov 2018

Systems Science Faculty Publications and Presentations

Legislative reforms aimed at slowing growth of US healthcare costs are focused on achieving greater value per dollar. To increase value, healthcare providers must not only provide high quality care, but deliver this care at a sustainable cost. Predicting risks that may lead to poor outcomes and higher costs enables providers to augment decision making for optimizing patient care and informs the risk stratification necessary in emerging reimbursement models. Healthcare delivery systems are looking at their high volume service lines and identifying variation in cost and outcomes in order to determine the patient factors that are driving this variation and …


Malware Analysis On Android Using Supervised Machine Learning Techniques, Md Shohel Rana, Andrew H. Sung Oct 2018

Faculty Publications

In recent years, the growth of malware has prompted widespread research on malware analysis and detection for Android devices. Android, a mobile operating system with more than one billion active users and a high market impact, has inspired cyber criminals to expand their malware. Android implements a distinct architecture and security controls to address the problems caused by malware, such as a unique user ID (UID) for each application, system permissions, and its distribution platform, Google Play. There are numerous ways to violate that fortification, and how the complexity of creating a …


Automating Intention Mining, Qiao Huang, Xin Xia, David Lo, Gail C. Murphy Oct 2018

Research Collection School Of Computing and Information Systems

Developers frequently discuss aspects of the systems they are developing online. The comments they post to discussions form a rich information source about the system. Intention mining, a process introduced by Di Sorbo et al., classifies sentences in developer discussions to enable further analysis. As one example of use, intention mining has been used to help build various recommenders for software developers. The technique introduced by Di Sorbo et al. to categorize sentences is based on linguistic patterns derived from two projects. The limited number of data sources used in this earlier work introduces questions about the comprehensiveness of intention …


Traffic-Cascade: Mining And Visualizing Lifecycles Of Traffic Congestion Events Using Public Bus Trajectories, Agus Trisnajaya Kwee, Meng-Fen Chiang, Philips Kokoh Prasetyo, Ee-Peng Lim Oct 2018

Research Collection School Of Computing and Information Systems

As road transportation supports both economic and social activities in developed cities, it is important to maintain smooth traffic on all highways and local roads. Whenever possible, traffic congestion should be detected early and resolved quickly. While existing traffic monitoring dashboard systems have been put in place in many cities, these systems require high-cost vehicle speed monitoring instruments and detect traffic congestion as independent events. There is a lack of low-cost dashboards to inspect and analyze the lifecycle of traffic congestion, which is critical in assessing the overall impact of congestion, determining the possible source(s) of congestion and its …


Preliminary Results Of Bayesian Networks And Reconstructability Analysis Applied To The Electric Grid, Marcus Harris, Martin Zwick Jul 2018

Systems Science Faculty Publications and Presentations

Reconstructability Analysis (RA) is an analytical approach developed in the systems community that combines graph theory and information theory. Graph theory provides the structure of relations (model of the data) between variables and information theory characterizes the strength and the nature of the relations. RA has three primary approaches to model data: variable based (VB) models without loops (acyclic graphs), VB models with loops (cyclic graphs) and state-based models (nearly always cyclic, individual states specifying model constraints). These models can either be directed or neutral. Directed models focus on a single response variable whereas neutral models focus on all relations …
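As a rough illustration of how information theory can quantify the strength of a relation between two variables, the sketch below computes the transmission T(A:B) = H(A) + H(B) - H(AB) from paired observations. It is a minimal sketch only, not the authors' RA software; the function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(counts):
    """Shannon entropy in bits of a frequency table (a Counter)."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values() if c > 0)


def transmission(pairs):
    """T(A:B) = H(A) + H(B) - H(AB) for a list of (a, b) observations."""
    joint = Counter(pairs)
    a_marginal = Counter(a for a, _ in pairs)
    b_marginal = Counter(b for _, b in pairs)
    return entropy(a_marginal) + entropy(b_marginal) - entropy(joint)


# Hypothetical toy data: B copies A in the first set, and is unrelated in the second.
dependent = [(0, 0), (0, 0), (1, 1), (1, 1)]
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]

t_dependent = transmission(dependent)      # one full bit is shared
t_independent = transmission(independent)  # nothing is shared
```

A transmission near zero suggests the two variables can be modeled separately, while a larger value indicates a relation that a model would need to retain.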


Reconstructability & Dynamics Of Elementary Cellular Automata, Martin Zwick Jul 2018

Systems Science Faculty Publications and Presentations

Reconstructability analysis (RA) is a method to determine whether a multivariate relation, defined set- or information-theoretically, is decomposable with or without loss into lower ordinality relations. Set-theoretic RA (SRA) is used to characterize the mappings of elementary cellular automata. The decomposition possible for each mapping without loss is a better predictor of chaos than the λ parameter (Walker & Ashby, Langton), and non-decomposable mappings tend to produce chaos. SRA yields not only the simplest lossless structure but also a vector of losses for all structures, indexed by parameter τ. These losses are analogous to transmissions in information-theoretic RA (IRA). IRA …
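For readers unfamiliar with elementary cellular automata, the sketch below shows what a "mapping" is and how the λ parameter can be computed for it. It is a minimal sketch under stated assumptions: rules are identified by their Wolfram number (0 to 255), state 0 is treated as the quiescent state when computing λ, and the lattice is periodic. It does not implement the SRA decomposition itself, and all function names are hypothetical.

```python
def rule_table(rule):
    """Map each (left, center, right) neighborhood to its next state
    under the Wolfram rule number `rule` (0-255)."""
    return {
        ((n >> 2) & 1, (n >> 1) & 1, n & 1): (rule >> n) & 1
        for n in range(8)
    }


def langton_lambda(rule):
    """Fraction of the 8 neighborhoods that map to the non-quiescent
    state (assumed here to be 1)."""
    return sum(rule_table(rule).values()) / 8


def step(cells, rule):
    """Apply the rule once to a list of 0/1 cells with periodic boundaries."""
    table = rule_table(rule)
    n = len(cells)
    return [
        table[(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])]
        for i in range(n)
    ]


lambda_30 = langton_lambda(30)     # 4 of 8 neighborhoods map to 1
lambda_110 = langton_lambda(110)   # 5 of 8 neighborhoods map to 1
next_row = step([0, 0, 1, 0, 0], 30)
```

Because λ only counts outputs, two rules with the same λ can behave very differently, which is the kind of limitation a structural predictor is meant to address.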


Efficient Representative Subset Selection Over Sliding Windows, Yanhao Wang, Yuchen Li, Kian-Lee Tan Jul 2018

Research Collection School Of Computing and Information Systems

Representative subset selection (RSS) is an important tool for users to draw insights from massive datasets. Existing literature models RSS as submodular maximization to capture the "diminishing returns" property of representativeness, but often only has a single constraint, which limits its applicability to many real-world problems. To capture the recency issue and support various constraints, we formulate dynamic RSS as maximizing submodular functions subject to general d-knapsack constraints (SMDK) over sliding windows. We propose a KnapWindow framework (KW) for SMDK. KW utilizes KnapStream (KS) for SMDK in append-only streams as a subroutine. It maintains a sequence of checkpoints and …
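The two ingredients named above, a submodular objective with diminishing returns and a d-knapsack constraint, can be illustrated with a small sketch. This is not the KnapWindow or KnapStream algorithm; it only uses set coverage as a standard example of a submodular function, and the item names, topics, costs, and budgets are hypothetical.

```python
def coverage(selected, covers):
    """Number of distinct topics covered by the selected items.
    Set coverage is a standard example of a submodular function."""
    return len(set().union(*(covers[i] for i in selected)))


def marginal_gain(item, selected, covers):
    """Extra coverage obtained by adding `item` to `selected`."""
    return coverage(selected | {item}, covers) - coverage(selected, covers)


def feasible(selected, costs, budgets):
    """d-knapsack check: in each of the d cost dimensions, the total
    cost of the selected items must not exceed that dimension's budget."""
    return all(
        sum(costs[i][j] for i in selected) <= budgets[j]
        for j in range(len(budgets))
    )


# Hypothetical toy instance with d = 2 cost dimensions.
covers = {"a": {1, 2}, "b": {2, 3}, "c": {3}}
costs = {"a": (1, 2), "b": (2, 1), "c": (1, 1)}
budgets = (3, 3)

gain_b_alone = marginal_gain("b", set(), covers)    # adds topics 2 and 3
gain_b_after_a = marginal_gain("b", {"a"}, covers)  # only topic 3 is new
```

The gain of "b" shrinks once "a" is already selected, which is the diminishing-returns property; `feasible` shows why a single budget is not enough when items carry several kinds of cost.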


Introduction To Reconstructability Analysis, Martin Zwick Jul 2018

Systems Science Faculty Publications and Presentations

This talk will introduce Reconstructability Analysis (RA), a data modeling methodology deriving from the 1960s work of Ross Ashby and developed in the systems community in the 1980s and afterwards. RA, based on information theory and graph theory, is a member of the family of methods known as ‘graphical models,’ which also include Bayesian networks and log-linear techniques. It is designed for exploratory modeling, although it can also be used for confirmatory hypothesis testing. RA can discover high ordinality and nonlinear interactions that are not hypothesized in advance. Its conceptual framework illuminates the relationships between wholes and parts, a subject …


The Algorithmic Composition Of Classical Music Through Data Mining, Tom Donald Richmond, Imad Rahal Apr 2018

All College Thesis Program, 2016-2019

The desire to teach a computer how to algorithmically compose music has been a topic in the world of computer science since the 1950s, with roots of computer-less algorithmic composition dating back to Mozart himself. One limitation of algorithmically composing music has been the difficulty of eliminating the human intervention required to achieve a musically homogeneous composition. We attempt to remedy this issue by teaching a computer how the rules of composition differ between the six distinct eras of classical music by having it examine a dataset of musical scores, rather than explicitly telling the computer the formal rules of …


Statistical Analysis Of Network Change, Teresa D. Schmidt, Martin Zwick Feb 2018

Systems Science Faculty Publications and Presentations

Networks are rarely subjected to hypothesis tests for difference, but when they are inferred from datasets of independent observations, statistical testing is feasible. To demonstrate, a healthcare provider network is tested for significant change after an intervention using Medicaid claims data. First, the network is inferred for each time period with (1) partial least squares (PLS) regression and (2) reconstructability analysis (RA). Second, network distance (i.e., change between time periods) is measured as the mean absolute difference in (1) coefficient matrices for PLS and (2) calculated probability distributions for RA. Third, the network distance is compared against a reference distribution …
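The distance measure in the second step can be sketched in a few lines. This is a minimal sketch assuming each network is represented as an equally shaped matrix of numbers (for example, coefficients); the function name and the toy matrices are hypothetical, and the inference and reference-distribution steps are not shown.

```python
def network_distance(before, after):
    """Mean absolute elementwise difference between two equally
    shaped matrices given as lists of rows."""
    if len(before) != len(after) or any(
        len(rb) != len(ra) for rb, ra in zip(before, after)
    ):
        raise ValueError("matrices must have the same shape")
    diffs = [
        abs(x - y)
        for row_before, row_after in zip(before, after)
        for x, y in zip(row_before, row_after)
    ]
    return sum(diffs) / len(diffs)


# Hypothetical coefficient matrices for two time periods.
period_1 = [[0.0, 0.5], [0.5, 0.0]]
period_2 = [[0.0, 0.25], [0.75, 0.0]]

distance = network_distance(period_1, period_2)
```

A distance of zero means the two inferred networks are identical; whether a nonzero value counts as a significant change depends on the reference distribution it is compared against.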


Classification Using Association Rules, Colin Kane Jan 2018

Dissertations

This research investigates the use of an unsupervised learning technique, association rules, to make class predictions. The use of association rules to make class predictions is a growing area of focus within data mining research. The research to date has focused predominantly on balanced datasets or synthesized imbalanced datasets. There have been concerns raised that the algorithms using association rules to make classifications do not perform well on imbalanced datasets. This research comprehensively evaluates the accuracy of a number of association rule classifiers in predicting home loan sales in an Irish retail banking context. The experiments designed test three associative …
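To make the idea of predicting a class with an association rule concrete, the sketch below computes the two standard rule measures, support and confidence, for a rule of the form "antecedent items imply class label". It is a minimal sketch, not one of the classifiers evaluated in the dissertation; the function name, the item names, and the records are hypothetical.

```python
def rule_stats(records, antecedent, label):
    """Support and confidence of the rule `antecedent -> label`.

    records: list of (set_of_items, class_label) pairs.
    support: fraction of all records containing the antecedent and the label.
    confidence: fraction of antecedent-matching records that carry the label.
    """
    matching_labels = [lab for items, lab in records if antecedent <= items]
    hits = sum(1 for lab in matching_labels if lab == label)
    support = hits / len(records)
    confidence = hits / len(matching_labels) if matching_labels else 0.0
    return support, confidence


# Hypothetical customer records: (attributes, outcome).
records = [
    ({"online", "salaried"}, "buys"),
    ({"online"}, "no"),
    ({"online", "salaried"}, "buys"),
    ({"salaried"}, "no"),
]

support_1, confidence_1 = rule_stats(records, {"online", "salaried"}, "buys")
support_2, confidence_2 = rule_stats(records, {"online"}, "buys")
```

On an imbalanced dataset, rules for the rare class tend to have low support, which is one reason minimum-support thresholds can cause such rules to be discarded.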


Continuous Restricted Boltzmann Machines, Robert W. Harrison Jan 2018

EBCS Articles

Restricted Boltzmann machines are a type of generative neural network. They summarize their input data to build a probabilistic model that can then be used to reconstruct missing data or to classify new data. Unlike discrete Boltzmann machines, where the data are mapped to the space of integers or bitstrings, continuous Boltzmann machines directly use floating point numbers and therefore represent the data with higher fidelity. The primary limitation in using Boltzmann machines for big-data problems is the efficiency of the training algorithm. This paper describes an efficient deterministic algorithm for training continuous machines.


Clicking Into Mortgage Arrears: A Study Into Arrears Prediction With Clickstream Data, Gavin O'Brien Jan 2018

Dissertations

This research project investigates the predictive capability of clickstream data when used for the purpose of mortgage arrears prediction. With an ever-growing number of people switching to digital channels to handle their daily banking requirements, there is a wealth of ever-increasing online usage data, otherwise known as clickstream data. If leveraged correctly, this clickstream data can be a powerful data source for organisations as it provides detailed information about how their customers are interacting with their digital channels. Much of the current literature associated with clickstream data relates to organisations employing it within their customer relationship management mechanisms …


Exploratory Reconstructability Analysis Of Accident TBI Data, Martin Zwick, Nancy Ann Carney, Rosemary Nettleton Jan 2018

Systems Science Faculty Publications and Presentations

This paper describes the use of reconstructability analysis to perform a secondary study of traumatic brain injury data from automobile accidents. Neutral searches were done and their results displayed with a hypergraph. Directed searches, using both variable-based and state-based models, were applied to predict performance on two cognitive tests and one neurological test. Very simple state-based models gave large uncertainty reductions for all three dependent variables (DVs) and sizeable improvements in percent correct for the two cognitive test DVs, which were equally sampled. Conditional probability distributions for these models are easily visualized with simple decision trees. Confounding variables and counter-intuitive findings are …