Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Articles 1 - 24 of 24

Full-Text Articles in Computer Sciences

ConceptThread: Visualizing Threaded Concepts In MOOC Videos, Zhiguang Zhou, Li Ye, Lihong Cai, Lei Wang, Yigang Wang, Yongheng Wang, Wei Chen, Yong Wang Jan 2024

Research Collection School Of Computing and Information Systems

Massive Open Online Course (MOOC) platforms have become increasingly popular in recent years. Online learners need to watch the whole course video on MOOC platforms to learn the underlying new knowledge, which is often tedious and time-consuming due to the lack of a quick overview of the covered knowledge and its structure. In this paper, we propose ConceptThread, a visual analytics approach to effectively show the concepts and the relations among them to facilitate effective online learning. Specifically, given that the majority of MOOC videos contain slides, we first leverage video processing and speech analysis techniques, including shot recognition, …


An Empirical Study Of Blockchain System Vulnerabilities: Modules, Types, And Patterns, Xiao Yi, Daoyuan Wu, Lingxiao Jiang, Yuzhou Fang, Kehuan Zhang, Wei Zhang Nov 2022

Research Collection School Of Computing and Information Systems

Blockchain, as a distributed ledger technology, has become increasingly popular, especially for enabling valuable cryptocurrencies and smart contracts. However, blockchain software systems inevitably have many bugs. Although bugs in smart contracts have been extensively investigated, security bugs of the underlying blockchain systems are much less explored. In this paper, we conduct an empirical study on blockchain system vulnerabilities from four representative blockchains: Bitcoin, Ethereum, Monero, and Stellar. Specifically, we first design a systematic filtering process to effectively identify 1,037 vulnerabilities and their 2,317 patches from 34,245 issues/PRs (pull requests) and 85,164 commits on GitHub. We thus build the first blockchain …


Messiness: Automating IoT Data Streaming Spatial Analysis, Christopher White, Atilio Barreda II Dec 2021

Publications and Research

The spaces we live in go through many transformations over the course of a year, a month, or a day; my room has seen tremendous clutter and pristine order within the span of a few hours. My goal is to discover patterns within my space and formulate an understanding of the changes that occur. This insight will provide actionable direction for maintaining a cleaner environment, as well as some information about the optimal times for productivity and energy preservation.

Using a Raspberry Pi, I will set up automated image capture in a room in my home. These images will …


Prevalence, Contents And Automatic Detection Of KL-SATD, Leevi Rantala, Mika Mantyla, David Lo Aug 2020

Research Collection School Of Computing and Information Systems

When developers use different keywords such as TODO and FIXME in source code comments to describe self-admitted technical debt (SATD), we refer to it as Keyword-Labeled SATD (KL-SATD). We study KL-SATD from 33 software repositories with 13,588 KL-SATD comments. We find that the median percentage of KL-SATD comments among all comments is only 1.52%. We find that KL-SATD comment contents include words expressing code changes and uncertainty, such as remove, fix, maybe and probably. This makes them different compared to other comments. KL-SATD comment contents are similar to manually labeled SATD comments of prior work. Our machine learning classifier using logistic …
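
As a rough illustration of what "keyword-labeled" means in practice, the sketch below scans source text for line comments containing TODO or FIXME and reports their share among all line comments found. The function names, the comment-syntax pattern, and the sample input are assumptions made for this example; it is not the extraction pipeline or the classifier used in the study.

```python
import re

# Assumed keyword list: the abstract names TODO and FIXME as examples only.
KEYWORDS = ("TODO", "FIXME")

# Assumed comment syntax: '#' or '//' line comments. Block comments are ignored,
# and a '#' inside a string literal would be counted by mistake.
COMMENT = re.compile(r"(?:#|//)\s*(.*)$")
KEYWORD = re.compile(r"\b(?:%s)\b" % "|".join(KEYWORDS))


def line_comments(source):
    """Return (line_number, comment_text) for every line comment in source."""
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = COMMENT.search(line)
        if match:
            found.append((number, match.group(1).strip()))
    return found


def kl_satd_comments(source):
    """Keep only the comments that contain one of the keywords as a whole word."""
    return [(n, text) for n, text in line_comments(source) if KEYWORD.search(text)]


def kl_satd_share(source):
    """Percentage of line comments that are keyword-labeled (0.0 if none exist)."""
    total = len(line_comments(source))
    if total == 0:
        return 0.0
    return 100.0 * len(kl_satd_comments(source)) / total


# Hypothetical input, invented for this example.
SAMPLE = """x = 1  # TODO: maybe remove this
y = 2  # plain comment
// FIXME probably wrong
z = 3
"""

if __name__ == "__main__":
    for number, text in kl_satd_comments(SAMPLE):
        print(number, text)
    print(round(kl_satd_share(SAMPLE), 2))
```

A per-repository percentage computed this way is the kind of quantity whose median the abstract reports, although the study's own definition of a comment may differ from the one assumed here.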


Automating Intention Mining, Qiao Huang, Xin Xia, David Lo, Gail C. Murphy Oct 2018

Research Collection School Of Computing and Information Systems

Developers frequently discuss aspects of the systems they are developing online. The comments they post to discussions form a rich information source about the system. Intention mining, a process introduced by Di Sorbo et al., classifies sentences in developer discussions to enable further analysis. As one example of use, intention mining has been used to help build various recommenders for software developers. The technique introduced by Di Sorbo et al. to categorize sentences is based on linguistic patterns derived from two projects. The limited number of data sources used in this earlier work introduces questions about the comprehensiveness of intention …


Who Will Leave The Company?: A Large-Scale Industry Study Of Developer Turnover By Mining Monthly Work Report, Lingfeng Bao, Zhenchang Xing, Xin Xia, David Lo, Shanping Li May 2017

Research Collection School Of Computing and Information Systems

Software developer turnover has become a big challenge for information technology (IT) companies. The departure of key software developers might cause a big loss to an IT company, since they also depart with important business knowledge and critical technical skills. Understanding developer turnover is very important for IT companies to retain talented developers and reduce the loss due to developers' departure. Previous studies mainly perform qualitative observations or simple statistical analysis of developers' activity data to understand developer turnover. In this paper, we investigate whether we can predict the turnover of software developers in non-open source companies by automatically analyzing monthly …


Data Mining By Grid Computing In The Search For Extrasolar Planets, Oisin Creaner [Thesis] Jan 2017

Doctoral

A system is presented here to provide improved precision in ensemble differential photometry. This is achieved by using the power of grid computing to analyse astronomical catalogues. This produces new catalogues of optimised pointings for each star, which maximise the number and quality of reference stars available. Astronomical phenomena such as exoplanet transits and small-scale structure within quasars may be observed by means of millimagnitude photometric variability on the timescale of minutes to hours. Because of atmospheric distortion, ground-based observations of these phenomena require the use of differential photometry whereby the target is compared with one or more reference stars. …


The Importance Of Being Isolated: An Empirical Study On Chromium Reviews, Subhajit Datta, Devarshi Bhatt, Manish Jain, Proshanta Sarkar, Santonu Sarkar Oct 2015

Research Collection School Of Computing and Information Systems

As large-scale software development has become more collaborative, and software teams more globally distributed, several studies have explored how developer interaction influences software development outcomes. The emphasis so far has been largely on outcomes like defect count, the time to close modification requests, etc. In this paper, we examine data from the Chromium project to understand how different aspects of developer discussion relate to the closure time of reviews. On the basis of analyzing reviews discussed by 2000+ developers, our results indicate that quicker closure of reviews owned by a developer relates to higher reception of information and insights …


Improving Software Quality And Productivity Leveraging Mining Techniques: [Summary Of The Second Workshop On Software Mining, At ASE 2013], Ming Li, Hongyu Zhang, David Lo, Lucia Lucia Jan 2015

Research Collection School Of Computing and Information Systems

The second International Workshop on Software Mining (Soft-mine) was held on the 11th of November 2013. The workshop was held in conjunction with the 28th IEEE/ACM International Conference on Automated Software Engineering (ASE) in Silicon Valley, California, USA. The workshop has facilitated researchers who are interested in mining various types of software-related data and in applying data mining techniques to support software engineering tasks. During the workshop, seven papers on software mining and behavior models, execution trace mining, and bug localization and fixing were presented. One of the papers received the best paper award. Furthermore, there were two invited talk …


Mining Branching-Time Scenarios, Dirk Fahland, David Lo, Shahar Maoz Jun 2014

David LO

Specification mining extracts candidate specifications from existing systems, to be used for downstream tasks such as testing and verification. Specifically, we are interested in the extraction of behavior models from execution traces. In this paper we introduce mining of branching-time scenarios in the form of existential, conditional Live Sequence Charts, using a statistical data-mining algorithm. We show the power of branching scenarios to reveal alternative scenario-based behaviors, which could not be mined by previous approaches. The work contrasts and complements previous works on mining linear-time scenarios. An implementation and evaluation over execution trace sets recorded from several real-world applications shows …


Automated Library Recommendation, Ferdian Thung, David Lo, Julia Lawall Jun 2014

David LO

Many third party libraries are available to be downloaded and used. Using such libraries can reduce development time and make the developed software more reliable. However, developers are often unaware of suitable libraries to be used for their projects and thus they miss out on these benefits. To help developers better take advantage of the available libraries, we propose a new technique that automatically recommends libraries to developers. Our technique takes as input the set of libraries that an application currently uses, and recommends other libraries that are likely to be relevant. We follow a hybrid approach that combines association …


Ranking-Based Approaches For Localizing Faults, Lucia Lucia Jun 2014

Dissertations and Theses Collection (Open Access)

A fault is the root cause of program failures where a program behaves differently from the intended behavior. Finding or localizing faults is often laborious (especially so for complex programs), yet it is an important task in the software lifecycle. An automated technique that can accurately and quickly identify the faulty code is greatly needed to alleviate the costs of software debugging. Many fault localization techniques assume that faults are localizable, i.e., each fault manifests only in a single or a few lines of code that are close to one another. To verify this assumption, we study how faults spread …


Machine Learning In Wireless Sensor Networks: Algorithms, Strategies, And Applications, Mohammad Abu Alsheikh, Shaowei Lin, Dusit Niyato, Hwee-Pink Tan Apr 2014

Research Collection School Of Computing and Information Systems

Wireless sensor networks (WSNs) monitor dynamic environments that change rapidly over time. This dynamic behavior is either caused by external factors or initiated by the system designers themselves. To adapt to such conditions, sensor networks often adopt machine learning techniques to eliminate the need for unnecessary redesign. Machine learning also inspires many practical solutions that maximize resource utilization and prolong the lifespan of the network. In this paper, we present an extensive literature review over the period 2002-2013 of machine learning methods that were used to address common issues in WSNs. The advantages and disadvantages of each proposed algorithm are …


Towards A Hybrid Framework For Detecting Input Manipulation Vulnerabilities, Sun Ding, Hee Beng Kuan Tan, Lwin Khin Shar, Bindu Madhavi Padmanabhuni Dec 2013

Research Collection School Of Computing and Information Systems

Input manipulation vulnerabilities such as SQL injection, cross-site scripting, and buffer overflow are highly prevalent and pose critical security risks. As a result, many methods have been proposed to apply static analysis, dynamic analysis, or a combination of them to detect such security vulnerabilities. Most of the existing methods classify vulnerabilities into safe and unsafe. They have both false-positive and false-negative cases. In general, a security vulnerability can be classified into three cases: (1) provably safe, (2) provably unsafe, (3) unsure. In this paper, we propose a hybrid framework, Detecting Input Manipulation Vulnerabilities (DIMV), to verify the adequacy of security vulnerability defenses …


Mining Branching-Time Scenarios, Dirk Fahland, David Lo, Shahar Maoz Nov 2013

Research Collection School Of Computing and Information Systems

Specification mining extracts candidate specifications from existing systems, to be used for downstream tasks such as testing and verification. Specifically, we are interested in the extraction of behavior models from execution traces. In this paper we introduce mining of branching-time scenarios in the form of existential, conditional Live Sequence Charts, using a statistical data-mining algorithm. We show the power of branching scenarios to reveal alternative scenario-based behaviors, which could not be mined by previous approaches. The work contrasts and complements previous works on mining linear-time scenarios. An implementation and evaluation over execution trace sets recorded from several real-world applications shows …


Automated Library Recommendation, Ferdian Thung, David Lo, Julia Lawall Oct 2013

Research Collection School Of Computing and Information Systems

Many third party libraries are available to be downloaded and used. Using such libraries can reduce development time and make the developed software more reliable. However, developers are often unaware of suitable libraries to be used for their projects and thus they miss out on these benefits. To help developers better take advantage of the available libraries, we propose a new technique that automatically recommends libraries to developers. Our technique takes as input the set of libraries that an application currently uses, and recommends other libraries that are likely to be relevant. We follow a hybrid approach that combines association …


Localizing State-Dependent Faults Using Associated Sequence Mining, Shaimaa Ali May 2013

Electronic Thesis and Dissertation Repository

In this thesis we developed a new fault localization process to localize faults in object-oriented software. The process is built upon the "Encapsulation" principle and aims to locate state-dependent discrepancies in the software's behavior. We experimented with the proposed process on 50 seeded faults in 8 subject programs, and were able to locate the faulty class in 100% of the cases when objects with constant states were taken into consideration, while we missed 24% of the faults when these objects were not considered. We also developed a customized data mining technique, "Associated sequence mining", to be used in …


Predicting SQL Injection And Cross Site Scripting Vulnerabilities Through Mining Input Sanitization Patterns, Lwin Khin Shar, Hee Beng Kuan Tan Apr 2013

Research Collection School Of Computing and Information Systems

Context: SQL injection (SQLI) and cross site scripting (XSS) have been the two most common and serious web application vulnerabilities for the past decade. To mitigate these two security threats, many vulnerability detection approaches based on static and dynamic taint analysis techniques have been proposed. Alternatively, there are also vulnerability prediction approaches based on machine learning techniques, which showed that static code attributes such as code complexity measures are cheap and useful predictors. However, current prediction approaches target general vulnerabilities, and most of these approaches locate vulnerable code only at software component or file levels. Some approaches also involve process attributes that …


Mining Input Sanitization Patterns For Predicting SQL Injection And Cross Site Scripting Vulnerabilities, Lwin Khin Shar, Hee Beng Kuan Tan Jun 2012

Research Collection School Of Computing and Information Systems

Static code attributes such as lines of code and cyclomatic complexity have been shown to be useful indicators of defects in software modules. As web applications adopt input sanitization routines to prevent web security risks, static code attributes that represent the characteristics of these routines may be useful for predicting web application vulnerabilities. In this paper, we classify various input sanitization methods into different types and propose a set of static code attributes that represent these types. Then we use data mining methods to predict SQL injection and cross site scripting vulnerabilities in web applications. Preliminary experiments show that our …


Comprehensive Evaluation Of Association Measures For Fault Localization, Lucia Lucia, David Lo, Lingxiao Jiang, Aditya Budi Nov 2011

David LO

In statistics and data mining communities, there have been many measures proposed to gauge the strength of association between two variables of interest, such as odds ratio, confidence, Yule-Y, Yule-Q, Kappa, and gini index. These association measures have been used in various domains, for example, to evaluate whether a particular medical practice is associated positively to a cure of a disease or whether a particular marketing strategy is associated positively to an increase in revenue, etc. This paper models the problem of locating faults as association between the execution or non-execution of particular program elements with failures. There have been …
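
To make the idea of association measures concrete, here is a minimal sketch that computes four of the measures named above from a 2x2 contingency table for one program element and ranks elements by one of them. The table layout (a = executed in failing runs, b = executed in passing runs, c = not executed in failing runs, d = not executed in passing runs), the function names, and the sample counts are assumptions for illustration. The formulas are the standard textbook definitions, which may differ in detail from the formulations evaluated in the paper.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio (a*d)/(b*c); inf when b*c is zero, nan when a*d is zero too."""
    ad, bc = a * d, b * c
    if bc == 0:
        return math.inf if ad > 0 else math.nan
    return ad / bc


def yule_q(a, b, c, d):
    """Yule's Q = (ad - bc) / (ad + bc), in [-1, 1]; 0.0 for an empty table."""
    ad, bc = a * d, b * c
    if ad + bc == 0:
        return 0.0
    return (ad - bc) / (ad + bc)


def yule_y(a, b, c, d):
    """Yule's Y = (sqrt(ad) - sqrt(bc)) / (sqrt(ad) + sqrt(bc))."""
    s_ad, s_bc = math.sqrt(a * d), math.sqrt(b * c)
    if s_ad + s_bc == 0:
        return 0.0
    return (s_ad - s_bc) / (s_ad + s_bc)


def confidence(a, b):
    """Confidence of the rule 'element executed -> run fails': a / (a + b)."""
    return a / (a + b) if (a + b) else 0.0


def rank(counts, measure):
    """Order element names from most to least suspicious under the given measure."""
    return sorted(counts, key=lambda name: measure(*counts[name]), reverse=True)


# Hypothetical counts (a, b, c, d) over 10 failing and 10 passing runs.
COUNTS = {
    "line_10": (8, 2, 2, 8),
    "line_20": (5, 5, 5, 5),
    "line_30": (1, 9, 9, 1),
}

if __name__ == "__main__":
    print(rank(COUNTS, yule_q))
    print(odds_ratio(*COUNTS["line_10"]), yule_y(*COUNTS["line_10"]))
```

Elements whose execution co-occurs mostly with failing runs score high on each measure, so they appear first in the ranking a developer would inspect.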


Comprehensive Evaluation Of Association Measures For Fault Localization, Lucia Lucia, David Lo, Lingxiao Jiang, Aditya Budi Sep 2010

Research Collection School Of Computing and Information Systems

In statistics and data mining communities, there have been many measures proposed to gauge the strength of association between two variables of interest, such as odds ratio, confidence, Yule-Y, Yule-Q, Kappa, and gini index. These association measures have been used in various domains, for example, to evaluate whether a particular medical practice is associated positively to a cure of a disease or whether a particular marketing strategy is associated positively to an increase in revenue, etc. This paper models the problem of locating faults as association between the execution or non-execution of particular program elements with failures. There have been …


Data Mining For Software Engineering, Tao Xie, Suresh Thummalapenta, David Lo, Chao Liu Aug 2009

Research Collection School Of Computing and Information Systems

To improve software productivity and quality, software engineers are increasingly applying data mining algorithms to various software engineering tasks. However, mining SE data poses several challenges. The authors present various algorithms to effectively mine sequences, graphs, and text from such data.


A Dynamic Weight Assignment Approach For IR Systems, M. Shoaib, Prof Dr. Abad Ali Shah, A. Vashishta Aug 2005

International Conference on Information and Communication Technologies

Weights are assigned to the extracted keywords for partial matching and for computing rankings in an IR system. The weight assignment technique is suggested by the IR model that is used for an IR system. Currently suggested weight assignment techniques are static, which means that once a weight is assigned to a keyword, it remains unchanged during the life span of an IR system. In this paper, we suggest a dynamic weight assignment technique. This technique can be used by any IR model that supports partial matching.


A Software Architecture For Reconstructability Analysis, Kenneth Willett, Martin Zwick Jan 2004

Systems Science Faculty Publications and Presentations

Software packages for reconstructability analysis (RA), as well as for related log linear modeling, generally provide a fixed set of functions. Such packages are suitable for end‐users applying RA in various domains, but do not provide a platform for research into the RA methods themselves. A new software system, Occam3, is being developed to address three goals that often conflict with one another: a general and flexible infrastructure for experimentation with RA methods and algorithms; an easily‐configured system allowing methods to be combined in novel ways, without requiring deep software expertise; and a system which …