Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 11 of 11
Full-Text Articles in Physical Sciences and Mathematics
Creating Data From Unstructured Text With Context Rule Assisted Machine Learning (Craml), Stephen Meisenbacher, Peter Norlander
School of Business: Faculty Publications and Other Works
Popular approaches to building data from unstructured text come with limitations, such as scalability, interpretability, replicability, and real-world applicability. These can be overcome with Context Rule Assisted Machine Learning (CRAML), a method and no-code suite of software tools that builds structured, labeled datasets which are accurate and reproducible. CRAML enables domain experts to access uncommon constructs within a document corpus in a low-resource, transparent, and flexible manner. CRAML produces document-level datasets for quantitative research and makes qualitative classification schemes scalable over large volumes of text. We demonstrate that the method is useful for bibliographic analysis, transparent analysis of proprietary data, …
Investigating Bloom's Cognitive Skills In Foundation And Advanced Programming Courses From Students' Discussions, Joel Jer Wei Lim, Gottipati Swapna, Kyong Jin Shim
Research Collection School Of Computing and Information Systems
Programming courses provide students with the skills to develop complex business applications. Teaching and learning programming is challenging, and collaborative learning has been proposed to help with this challenge. Online discussion forums promote networking with other learners so that they can build knowledge collaboratively. Such forums help students broaden their thinking and acquire cognitive skills. Cognitive analysis of discussions is critical to understanding students' learning process. In this paper, we propose a Bloom's taxonomy-based cognitive model for programming discussion forums. We present a machine learning (ML)-based solution to extract students' cognitive skills. Our evaluations on computing courses show that …
Understanding Learners' Motivation Through Machine Learning Analysis On Reflection Writing, Elizabeth Pluskwik, Yuezhou Wang, Lauren Singelmann
Integrated Engineering Department Publications
Educational data mining (EDM) is an emerging interdisciplinary field that utilizes machine learning (ML) algorithms to collect and analyze educational data, aiming to better predict students' performance and retention. In this WIP paper, we report our methodology and preliminary results from utilizing an ML program to assess students’ motivation through their upper-division years in the XYZ project-based learning (PBL) program. ML, or more specifically, the clustering algorithm, opens the door to processing large amounts of student-written artifacts, such as reflection journals, project reports, and written assignments, and then identifying keywords that signal their levels of motivation (i.e., extrinsic vs. …
Investigating Toxicity Changes Of Cross-Community Redditors From 2 Billion Posts And Comments, Hind Almerekhi, Haewoon Kwak, Bernard J. Jansen
Research Collection School Of Computing and Information Systems
This research investigates changes in online behavior of users who publish in multiple communities on Reddit by measuring their toxicity at two levels. With the aid of crowdsourcing, we built a labeled dataset of 10,083 Reddit comments, then used the dataset to train and fine-tune a Bidirectional Encoder Representations from Transformers (BERT) neural network model. The model predicted the toxicity levels of 87,376,912 posts from 577,835 users and 2,205,581,786 comments from 890,913 users on Reddit over 16 years, from 2005 to 2020. This study utilized the toxicity levels of user content to identify toxicity changes by the user within the …
Structure-Aware Visualization Retrieval, Haotian Li, Yong Wang, Aoyu Wu, Huan Wei, Huamin Qu
Research Collection School Of Computing and Information Systems
With the wide usage of data visualizations, a huge number of Scalable Vector Graphics (SVG)-based visualizations have been created and shared online. Accordingly, there has been increasing interest in exploring how to retrieve perceptually similar visualizations from a large corpus, since it can benefit various downstream applications such as visualization recommendation. Existing methods mainly focus on the visual appearance of visualizations by regarding them as bitmap images. However, the structural information intrinsic to SVG-based visualizations is ignored. Such structural information can delineate the spatial and hierarchical relationship among visual elements, and characterize visualizations thoroughly from a new perspective. …
Automated Identification Of Libraries From Vulnerability Data: Can We Do Better?, Stefanus A. Haryono, Hong Jin Kang, Abhishek Sharma, Asankhaya Sharma, Andrew E. Santosa, Ming Yi Ang, David Lo
Research Collection School Of Computing and Information Systems
Software engineers depend heavily on software libraries and have to update their dependencies once vulnerabilities are found in them. Software Composition Analysis (SCA) helps developers identify vulnerable libraries used by an application. A key challenge is the identification of libraries related to a given reported vulnerability in the National Vulnerability Database (NVD), which may not explicitly indicate the affected libraries. Recently, researchers have tried to address the problem of identifying the libraries from an NVD report by treating it as an extreme multi-label learning (XML) problem, characterized by its large number of possible labels and severe data sparsity. As input, …
A Predictive Model To Predict Cyberattack Using Self-Normalizing Neural Networks, Oluwapelumi Eniodunmo
Theses, Dissertations and Capstones
Cyberattacks constitute a never-ending war that greatly threatens secured information systems. The development of automated and intelligent systems gives hackers more computing power to steal information and destroy data or system resources, and has raised global security issues. Statistical and data mining tools have received continuous research and improvement. These tools have been adopted to create sophisticated intrusion detection systems that help information systems mitigate and defend against cyberattacks. However, advances in technology and the accessibility of information expose more identifiable elements that can be used to gain unauthorized access to systems and resources. Data mining and classification tools …
Predictive Models In Software Engineering: Challenges And Opportunities, Yanming Yang, Xin Xia, David Lo, Tingting Bi, John C. Grundy, Xiaohu Yang
Research Collection School Of Computing and Information Systems
Predictive models are among the most important techniques widely applied in many areas of software engineering. A large number of primary studies apply predictive models and report well-performing results in various research domains, including software requirements, software design and development, testing and debugging, and software maintenance. This article is a first attempt to systematically organize knowledge in this area by surveying a body of 421 papers on predictive models published between 2009 and 2020. We describe the key models and approaches used, classify the different models, summarize the range of key application …
Predicting League Of Legends Ranked Games Outcome, Ngoc Linh Chi Nguyen
Senior Projects Spring 2022
League of Legends (LoL) is one of the most popular multiplayer online battle arena (MOBA) games in the world. For LoL, the most competitive way to evaluate a player’s skill level, below the professional Esports level, is competitive ranked games. These ranked games utilize a matchmaking system based on the players’ ranks to form fair teams for each game. However, a ranked game's outcome cannot necessarily be predicted using just players’ ranks; a significant number of different variables affect a ranked game depending on how well each team plays. In this paper, I propose a method to …
Machine Learning In Requirements Elicitation: A Literature Review, Cheligeer Cheligeer, Jingwei Huang, Guosong Wu, Nadia Bhuiyan, Yuan Xu, Yong Zeng
Engineering Management & Systems Engineering Faculty Publications
A growing trend in requirements elicitation is the use of machine learning (ML) techniques to automate the cumbersome requirement handling process. This literature review summarizes and analyzes studies that incorporate ML and natural language processing (NLP) into demand elicitation. We answer the following research questions: (1) What requirement elicitation activities are supported by ML? (2) What data sources are used to build ML-based requirement solutions? (3) What technologies, algorithms, and tools are used to build ML-based requirement elicitation? (4) How to construct an ML-based requirements elicitation method? (5) What are the available tools to support ML-based requirements elicitation methodology? Keywords …
Non-Parametric Stochastic Autoencoder Model For Anomaly Detection, Raphael B. Alampay, Patricia Angela R. Abu
Department of Information Systems & Computer Science Faculty Publications
Anomaly detection is a widely studied field in computer science with applications ranging from intrusion detection, fraud detection, and medical diagnosis to quality assurance in manufacturing. The underlying premise is that an anomaly is an observation that does not conform to what is considered to be normal. This study addresses two major problems in the field. First, anomalies are defined in a local context; that is, quantitative measures of how anomalies are categorized are specific to their own problem domain and cannot be generalized to other domains. Commonly, anomalies are measured according to statistical probabilities relative to the …