Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 87

Full-Text Articles in Physical Sciences and Mathematics

Designing An Overseas Experiential Course In Data Science, Hua Leong Fwa, Graham Ng Dec 2023

Research Collection School Of Computing and Information Systems

Unprecedented demand for data science professionals in the industry has led many educational institutions to launch new data science courses. It is, however, imperative that students of data science programmes learn through the execution of real-world, authentic projects on top of acquiring foundational knowledge of data science. In the process of working on authentic, real-world projects, students not only create new knowledge but also learn to solve open, sophisticated, and ill-structured problems in an inter-disciplinary fashion. In this paper, we detail our approach to designing a data science curriculum premised on learners solving authentic data science problems sourced …


Development Of An Explainable Artificial Intelligence Model For Asian Vascular Wound Images, Zhiwen Joseph Lo, Malcolm Han Wen Mak, Shanying Liang, Yam Meng Chan, Cheng Cheng Goh, Tina Peiting Lai, Audrey Hui Min Tan, Patrick Thng, Tillman Weyde, Sylvia Smit Dec 2023

Research Collection School Of Computing and Information Systems

Chronic wounds contribute to significant healthcare and economic burden worldwide. Wound assessment remains challenging given its complex and dynamic nature. The use of artificial intelligence (AI) and machine learning methods in wound analysis is promising. Explainable modelling can help its integration and acceptance in healthcare systems. We aim to develop an explainable AI model for analysing vascular wound images among an Asian population. Two thousand nine hundred and fifty-seven wound images from a vascular wound image registry from a tertiary institution in Singapore were utilized. The dataset was split into training, validation and test sets. Wound images were classified into …


Data Provenance Via Differential Auditing, Xin Mu, Ming Pang, Feida Zhu Nov 2023

Research Collection School Of Computing and Information Systems

With the rising awareness of data assets, data governance, which is to understand where data comes from, how it is collected, and how it is used, has been assuming ever-growing importance. One critical component of data governance gaining increasing attention is auditing machine learning models to determine if specific data has been used for training. Existing auditing techniques, like shadow auditing methods, have shown feasibility under specific conditions such as having access to label information and knowledge of training protocols. However, these conditions are often not met in most real-world applications. In this paper, we introduce a practical framework for …


Faire: Repairing Fairness Of Neural Networks Via Neuron Condition Synthesis, Tianlin Li, Xiaofei Xie, Jian Wang, Qing Guo, Aishan Liu, Lei Ma, Yang Liu Nov 2023

Research Collection School Of Computing and Information Systems

Deep Neural Networks (DNNs) have achieved tremendous success in many applications, while it has been demonstrated that DNNs can exhibit some undesirable behaviors on concerns such as robustness, privacy, and other trustworthiness issues. Among them, fairness (i.e., non-discrimination) is one important property, especially when they are applied to some sensitive applications (e.g., finance and employment). However, DNNs easily learn spurious correlations between protected attributes (e.g., age, gender, race) and the classification task and develop discriminatory behaviors if the training data is imbalanced. Such discriminatory decisions in sensitive applications would introduce severe social impacts. To expose potential discrimination problems in DNNs …


Constructing Cyber-Physical System Testing Suites Using Active Sensor Fuzzing, Fan Zhang, Qianmei Wu, Bohan Xuan, Yuqi Chen, Wei Lin, Christopher M. Poskitt, Jun Sun, Binbin Chen Oct 2023

Research Collection School Of Computing and Information Systems

Cyber-physical systems (CPSs) automating critical public infrastructure face a pervasive threat of attack, motivating research into different types of countermeasures. Assessing the effectiveness of these countermeasures is challenging, however, as benchmarks are difficult to construct manually, existing automated testing solutions often make unrealistic assumptions, and blind fuzzing is ineffective at finding attacks due to the enormous search spaces and resource requirements. In this work, we propose active sensor fuzzing, a fully automated approach for building test suites without requiring any a priori knowledge about a CPS. Our approach employs active learning techniques. Applied to a real-world water treatment system, …


On Predicting ESG Ratings Using Dynamic Company Networks, Gary Ang, Zhiling Guo, Ee-Peng Lim Sep 2023

Research Collection School Of Computing and Information Systems

Environmental, social and governance (ESG) considerations play an increasingly important role due to the growing focus on sustainability globally. Entities, such as banks and investors, utilize ESG ratings of companies issued by specialized rating agencies to evaluate ESG risks of companies. The process of assigning ESG ratings by human analysts is however laborious and time intensive. Developing methods to predict ESG ratings could alleviate such challenges, allow ESG ratings to be generated in a more timely manner, cover more companies, and be more accessible. Most works study the effects of ESG ratings on target variables such as stock prices or …


Experimental Comparison Of Features, Analyses, And Classifiers For Android Malware Detection, Lwin Khin Shar, Biniam Fisseha Demissie, Mariano Ceccato, Naing Tun Yan, David Lo, Lingxiao Jiang, Christoph Bienert Sep 2023

Research Collection School Of Computing and Information Systems

Android malware detection has been an active area of research. In the past decade, several machine learning-based approaches based on different types of features that may characterize Android malware behaviors have been proposed. The usually-analyzed features include API usages and sequences at various abstraction levels (e.g., class and package), extracted using static or dynamic analysis. Additionally, features that characterize permission uses, native API calls and reflection have also been analyzed. Initial works used conventional classifiers such as Random Forest to learn on those features. In recent years, deep learning-based classifiers such as Recurrent Neural Network have been explored. Considering various …


Multi-Granularity Detector For Vulnerability Fixes, Truong Giang Nguyen, Cong, Thanh Le, Hong Jin Kang, Ratnadira Widyasari, Chengran Yang, Zhipeng Zhao, Bowen Xu, Jiayuan Zhou, Xin Xia, Ahmed E. Hassan, David Lo Aug 2023

Research Collection School Of Computing and Information Systems

With the increasing reliance on Open Source Software, users are exposed to third-party library vulnerabilities. Software Composition Analysis (SCA) tools have been created to alert users of such vulnerabilities. SCA requires the identification of vulnerability-fixing commits. Prior works have proposed methods that can automatically identify such vulnerability-fixing commits. However, identifying such commits is highly challenging, as only a very small minority of commits are vulnerability fixing. Moreover, code changes can be noisy and difficult to analyze. We observe that noise can occur at different levels of detail, making it challenging to detect vulnerability fixes accurately. To address these challenges and …


Investigating Collaborative Problem Solving Temporal Dynamics Using Interactions Within A Digital Whiteboard, Hua Leong Fwa Apr 2023

Research Collection School Of Computing and Information Systems

Collaborative Problem Solving, the resolution of complex problems through the collaboration of multiple people pooling their knowledge, skills and effort, is postulated as an essential 21st-century skill for the future workforce. Collaborative Problem Solving has been embraced in schools where both online and face-to-face collaboration are afforded through the proliferation of educational technology tools. Assessing the amount of collaboration that has taken place among the students has, however, been challenging. In this research, we seek to identify the collaboration patterns of our students by mining the temporal sequence of their action logs captured within a digital whiteboard tool. With the use …


Learning-Based Stock Trending Prediction By Incorporating Technical Indicators And Social Media Sentiment, Zhaoxia Wang, Zhenda Hu, Fang Li, Seng-Beng Ho, Erik Cambria Mar 2023

Research Collection School Of Computing and Information Systems

Stock trending prediction is a challenging task due to its dynamic and nonlinear characteristics. With the development of social platforms and artificial intelligence (AI), incorporating timely news and social media information into stock trending models becomes possible. However, most existing works focus on classification or regression problems when predicting stock market trends without fully considering the effects of different influence factors in different phases. To address this gap, this research tackles the stock trending prediction problem using both technical indicators and sentiments of social media text as influence factors in different situations. A 3-phase hybrid model is proposed …


Towards Reinterpreting Neural Topic Models Via Composite Activations, Jia Peng Lim, Hady Wirawan Lauw Dec 2022

Research Collection School Of Computing and Information Systems

Most Neural Topic Models (NTM) use a variational auto-encoder framework producing K topics limited to the size of the encoder’s output. These topics are interpreted through the selection of the top activated words via the weights or reconstructed vector of the decoder that are directly connected to each neuron. In this paper, we present a model-free two-stage process to reinterpret NTM and derive further insights on the state of the trained model. Firstly, building on the original information from a trained NTM, we generate a pool of potential candidate “composite topics” by exploiting possible co-occurrences within the original set of …


Investigating Bloom's Cognitive Skills In Foundation And Advanced Programming Courses From Students' Discussions, Joel Jer Wei Lim, Gottipati Swapna, Kyong Jin Shim Nov 2022

Research Collection School Of Computing and Information Systems

Programming courses provide students with the skills to develop complex business applications. Teaching and learning programming is challenging, and collaborative learning is proposed to help with this challenge. Online discussion forums promote networking with other learners so that they can build knowledge collaboratively. Such forums help students broaden their thought processes and acquire cognitive skills. Cognitive analysis of discussions is critical to understanding students' learning process. In this paper, we propose a Bloom's taxonomy-based cognitive model for programming discussion forums. We present a machine learning (ML)-based solution to extract students' cognitive skills. Our evaluations on computing courses show that …


Shell Theory: A Statistical Model Of Reality, Wen-Yan Lin, Siying Liu, Changhao Ren, Ngai-Man Cheung, Hongdong Li, Yasuyuki Matsushita Oct 2022

Research Collection School Of Computing and Information Systems

Machine learning's grand ambition is the mathematical modeling of reality. The recent years have seen major advances using deep-learned techniques that model reality implicitly; however, corresponding advances in explicit mathematical models have been noticeably lacking. We believe this dichotomy is rooted in the limitations of the current statistical tools, which struggle to make sense of the high dimensional generative processes that natural data seems to originate from. This paper proposes a new, distance based statistical technique which allows us to develop elegant mathematical models of such generative processes. Our model suggests that each semantic concept has an associated distinctive-shell which …


Right To Know, Right To Refuse: Towards UI Perception-Based Automated Fine-Grained Permission Controls For Android Apps, Vikas Kumar Malviya, Chee Wei Leow, Ashok Kasthuri, Naing Tun Yan, Lwin Khin Shar, Lingxiao Jiang Oct 2022

Research Collection School Of Computing and Information Systems

It is the basic right of a user to know how permissions are used within an Android app’s scope and to refuse the app if granted permissions are used for activities other than the specified use, which can amount to malicious behavior. This paper proposes an approach and a vision to automatically model the permissions necessary for Android apps from the users’ perspective and enable fine-grained permission controls by users, thus helping users make better-informed and more flexible permission decisions for different app functionalities, which in turn improve the security and data privacy of the app and enforce apps …


On The Effectiveness Of Using Graphics Interrupt As A Side Channel For User Behavior Snooping, Haoyu Ma, Jianwen Tian, Debin Gao, Chunfu Jia Sep 2022

Research Collection School Of Computing and Information Systems

Graphics Processing Units (GPUs) are now a key component of many devices and systems, including those in the cloud and data centers, thus are also subject to side-channel attacks. Existing side-channel attacks on GPUs typically leak information from graphics libraries like OpenGL and CUDA, which require creating contentions within the GPU resource space and are being mitigated with software patches. This paper evaluates potential side channels exposed at a lower-level interface between GPUs and CPUs, namely the graphics interrupts. These signals could indicate unique signatures of GPU workload, allowing a spy process to infer the behavior of other processes. We …


Investigating Toxicity Changes Of Cross-Community Redditors From 2 Billion Posts And Comments, Hind Almerekhi, Haewoon Kwak, Bernard J. Jansen Aug 2022

Research Collection School Of Computing and Information Systems

This research investigates changes in online behavior of users who publish in multiple communities on Reddit by measuring their toxicity at two levels. With the aid of crowdsourcing, we built a labeled dataset of 10,083 Reddit comments, then used the dataset to train and fine-tune a Bidirectional Encoder Representations from Transformers (BERT) neural network model. The model predicted the toxicity levels of 87,376,912 posts from 577,835 users and 2,205,581,786 comments from 890,913 users on Reddit over 16 years, from 2005 to 2020. This study utilized the toxicity levels of user content to identify toxicity changes by the user within the …


Quantum Machine Learning For Credit Scoring, N. Schetakis, D. Aghamalyan, M. Boguslavsky, A. Rees, Marc Rakotomalala, Paul Griffin Jul 2022

Research Collection School Of Computing and Information Systems

In this paper we explore the use of quantum machine learning (QML) applied to credit scoring for small and medium-sized businesses (SMEs). A quantum/classical hybrid approach has been used for two years of experimentation with several models, activation functions, epochs, and other parameters. Results are shown from the best model, using two quantum classifiers and a classical neural network, applied to data for companies in Singapore. We observe significantly more efficient training for the quantum models over the classical models for comparable prediction performance. Practical issues are also explored, including a quadratic computational slowdown with the number of qubits …


Imagining New Futures Beyond Predictive Systems In Child Welfare: A Qualitative Study With Impacted Stakeholders, Logan Stapleton, Min Hun Lee, Diana Qing, Marya Wright, Alexandra Chouldechova, Ken Holstein, Zhiwei Steven Wu, Haiyi Zhu Jun 2022

Research Collection School Of Computing and Information Systems

Child welfare agencies across the United States are turning to data-driven predictive technologies (commonly called predictive analytics) which use government administrative data to assist workers’ decision-making. While some prior work has explored impacted stakeholders’ concerns with current uses of data-driven predictive risk models (PRMs), less work has asked stakeholders whether such tools ought to be used in the first place. In this work, we conducted a set of seven design workshops with 35 stakeholders who have been impacted by the child welfare system or who work in it to understand their beliefs and concerns around PRMs, and to engage them …


Structure-Aware Visualization Retrieval, Haotian Li, Yong Wang, Aoyu Wu, Huan Wei, Huamin Qu May 2022

Research Collection School Of Computing and Information Systems

With the wide usage of data visualizations, a huge number of Scalable Vector Graphic (SVG)-based visualizations have been created and shared online. Accordingly, there has been an increasing interest in exploring how to retrieve perceptually similar visualizations from a large corpus, since it can benefit various downstream applications such as visualization recommendation. Existing methods mainly focus on the visual appearance of visualizations by regarding them as bitmap images. However, the structural information intrinsically existing in SVG-based visualizations is ignored. Such structural information can delineate the spatial and hierarchical relationship among visual elements, and characterize visualizations thoroughly from a new perspective. …


Automated Identification Of Libraries From Vulnerability Data: Can We Do Better?, Stefanus A. Haryono, Hong Jin Kang, Abhishek Sharma, Asankhaya Sharma, Andrew E. Santosa, Ming Yi Ang, David Lo May 2022

Research Collection School Of Computing and Information Systems

Software engineers depend heavily on software libraries and have to update their dependencies once vulnerabilities are found in them. Software Composition Analysis (SCA) helps developers identify vulnerable libraries used by an application. A key challenge is the identification of libraries related to a given reported vulnerability in the National Vulnerability Database (NVD), which may not explicitly indicate the affected libraries. Recently, researchers have tried to address the problem of identifying the libraries from an NVD report by treating it as an extreme multi-label learning (XML) problem, characterized by its large number of possible labels and severe data sparsity. As input, …


Automated Reverse Engineering Of Role-Based Access Control Policies Of Web Applications, Ha Thanh Le, Lwin Khin Shar, Domenico Bianculli, Lionel C. Briand, Cu Duy Nguyen Feb 2022

Research Collection School Of Computing and Information Systems

Access control (AC) is an important security mechanism used in software systems to restrict access to sensitive resources. Therefore, it is essential to validate the correctness of AC implementations with respect to policy specifications or intended access rights. However, in practice, AC policy specifications are often missing or poorly documented; in some cases, AC policies are hard-coded in business logic implementations. This leads to difficulties in validating the correctness of policy implementations and detecting AC defects. In this paper, we present a semi-automated framework for reverse-engineering of AC policies from Web applications. Our goal is to learn and recover role-based access …


A Survey On Deep Learning For Software Engineering, Yanming Yang, Xin Xia, David Lo Jan 2022

Research Collection School Of Computing and Information Systems

In 2006, Geoffrey Hinton proposed the concept of training "Deep Neural Networks (DNNs)" and an improved model training method to break the bottleneck of neural network development. More recently, the introduction of AlphaGo in 2016 demonstrated the powerful learning ability of deep learning and its enormous potential. Deep learning has been increasingly used to develop state-of-the-art software engineering (SE) research tools due to its ability to boost performance for various SE tasks. There are many factors, e.g., deep learning model selection, internal structure differences, and model optimization techniques, that may have an impact on the performance of DNNs applied in …


Predictive Models In Software Engineering: Challenges And Opportunities, Yanming Yang, Xin Xia, David Lo, Tingting Bi, John C. Grundy, Xiaohu Yang Jan 2022

Research Collection School Of Computing and Information Systems

Predictive models are one of the most important techniques that are widely applied in many areas of software engineering. There have been a large number of primary studies that apply predictive models and that present well-performed studies in various research domains, including software requirements, software design and development, testing and debugging, and software maintenance. This article is a first attempt to systematically organize knowledge in this area by surveying a body of 421 papers on predictive models published between 2009 and 2020. We describe the key models and approaches used, classify the different models, summarize the range of key application …


Binary Classifiers For Noisy Datasets: A Comparative Study Of Existing Quantum Machine Learning Frameworks And Some New Approaches, Nikolaos Schetakis, Davit Aghamalyan, Paul Robert Griffin, Michael Boguslavsky Nov 2021

Research Collection School Of Computing and Information Systems

This technology offer is a quantum machine learning algorithm applied to binary classification models for noisy data, which is prevalent in financial and other datasets. By combining hybrid neural networks, quantum parametric circuits, and data re-uploading, we have improved the classification of non-convex 2-dimensional figures by understanding learning stability as noise increases in the dataset. The metric we use for assessing the performance of our quantum classifiers is the area under the receiver operating characteristic curve (ROC AUC). We are interested in collaborating with partners who have use cases for binary classification of noisy data. Also, as quantum technology is still insufficient for …
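
As a rough illustration of the metric named above, ROC AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 class labels.
    scores: classifier scores, where higher means "more likely positive".
    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: three of the four positive/negative pairs are ranked correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 wins out of 4 pairs
```

The pairwise form is quadratic in the number of examples, so it suits small checks like this one; a rank-based computation would be the usual choice for large datasets.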


Measuring Data Collection Diligence For Community Healthcare, Galawala Ramesha Samurdhi Karunasena, M. S. Ambiya, Arunesh Sinha, R. Nagar, S. Dalal, Abdullah H., D. Thakkar, D. Narayanan, M. Tambe Oct 2021

Research Collection School Of Computing and Information Systems

Data analytics has tremendous potential to provide targeted benefit in low-resource communities. However, the availability of high-quality public health data is a significant challenge in developing countries, primarily due to non-diligent data collection by community health workers (CHWs). Our use of the word non-diligence here is to emphasize that poor data collection is often not a deliberate action by CHWs but arises due to a myriad of factors, sometimes beyond the control of the CHW. In this work, we define and test a data collection diligence score. This challenging unlabeled data problem is handled by building upon domain expert’s guidance …


Orthogonal Inductive Matrix Completion, Antoine Ledent, Rodrigo Alves, Marius Kloft Sep 2021

Research Collection School Of Computing and Information Systems

We propose orthogonal inductive matrix completion (OMIC), an interpretable approach to matrix completion based on a sum of multiple orthonormal side information terms, together with nuclear-norm regularization. The approach allows us to inject prior knowledge about the singular vectors of the ground-truth matrix. We optimize the approach by a provably converging algorithm, which optimizes all components of the model simultaneously. We study the generalization capabilities of our method in both the distribution-free setting and in the case where the sampling distribution admits uniform marginals, yielding learning guarantees that improve with the quality of the injected knowledge in both cases. As …


Estimating Homophily In Social Networks Using Dyadic Predictions, George Berry, Antonio Sirianni, Ingmar Weber, Jisun An, Michael Macy Aug 2021

Research Collection School Of Computing and Information Systems

Predictions of node categories are commonly used to estimate homophily and other relational properties in networks. However, little is known about the validity of using predictions for this task. We show that estimating homophily in a network is a problem of predicting categories of dyads (edges) in the graph. Homophily estimates are unbiased when predictions of dyad categories are unbiased. Node-level prediction models, such as the use of names to classify ethnicity or gender, do not generally produce unbiased predictions of dyad categories and therefore produce biased homophily estimates. Bias comes from three sources: sampling bias, correlation between model errors …
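
To make the node-versus-dyad point concrete, here is a minimal sketch under simplifying assumptions: homophily is measured as the share of edges joining same-category nodes, and node categories come either from ground truth or from a node-level classifier. The graph, the labels, and the function name `same_category_share` are hypothetical, and this is not the estimator studied in the paper.

```python
def same_category_share(edges, category):
    """Share of edges (dyads) whose two endpoints fall in the same category."""
    if not edges:
        raise ValueError("need at least one edge")
    same = sum(1 for u, v in edges if category[u] == category[v])
    return same / len(edges)


# Hypothetical six-node graph: two tight groups joined by a single bridge edge.
edges = [(0, 1), (1, 2), (3, 4), (4, 5), (2, 3)]

# Ground-truth categories.
true_category = {0: "x", 1: "x", 2: "x", 3: "y", 4: "y", 5: "y"}

# Node-level predictions with a single error (node 1), i.e. 5 of 6 nodes correct.
predicted_category = {0: "x", 1: "y", 2: "x", 3: "y", 4: "y", 5: "y"}

true_share = same_category_share(edges, true_category)            # 4 of 5 edges
estimated_share = same_category_share(edges, predicted_category)  # 2 of 5 edges
```

In this toy case a single misclassified node changes the category of two dyads, so a node-level accuracy of 5/6 still cuts the estimated share in half. That is one simple way errors at the node level can translate into bias at the dyad level.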


RevMan: Revenue-Aware Multi-Task Online Insurance Recommendation, Yu Li, Yi Zhang, Lu Gan, Gengwei Hong, Zimu Zhou, Qiang Li Feb 2021

Research Collection School Of Computing and Information Systems

Online insurance is a new type of e-commerce with exponential growth. An effective recommendation model that maximizes the total revenue of insurance products listed in multiple customized sales scenarios is crucial for the success of online insurance business. Prior recommendation models are ineffective because they fail to characterize the complex relatedness of insurance products in multiple sales scenarios and maximize the overall conversion rate rather than the total revenue. Even worse, it is impractical to collect training data online for total revenue maximization due to the business logic of online insurance. We propose RevMan, a Revenue-aware Multi-task Network for online …


Walls Have Ears: Eavesdropping User Behaviors Via Graphics-Interrupt-Based Side Channel, Haoyu Ma, Jianwen Tian, Debin Gao, Chunfu Jia Dec 2020

Research Collection School Of Computing and Information Systems

Graphics Processing Units (GPUs) are now playing a vital role in many devices and systems including computing devices, data centers, and clouds, making them the next target of side-channel attacks. Unlike those targeting CPUs, existing side-channel attacks on GPUs exploited vulnerabilities exposed by application interfaces like OpenGL and CUDA, which can be easily mitigated with software patches. In this paper, we investigate the lower-level and native interface between GPUs and CPUs, i.e., the graphics interrupts, and evaluate the side channel they expose. Being an intrinsic profile in the communication between a GPU and a CPU, the pattern of graphics interrupts …


Nearest Centroid: A Bridge Between Statistics And Machine Learning, Manoj Thulasidas Dec 2020

Research Collection School Of Computing and Information Systems

In order to guide our students of machine learning in their statistical thinking, we need conceptually simple and mathematically defensible algorithms. In this paper, we present the Nearest Centroid (NC) algorithm as a pedagogical tool, combining the key concepts behind two foundational algorithms: K-Means clustering and K Nearest Neighbors (k-NN). In NC, we use the centroid (as defined in the K-Means algorithm) of the observations belonging to each class in our training data set and its distance from a new observation (similar to k-NN) for class prediction. Using this obvious extension, we will illustrate how the concepts of …
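
The description above maps onto a few lines of code. The following is a minimal sketch assuming numeric feature vectors and Euclidean distance. The function names `fit_centroids` and `predict` and the toy data are illustrative choices, not the paper's implementation.

```python
import math
from collections import defaultdict


def fit_centroids(X, y):
    """Return {label: centroid}, where a centroid is the per-feature mean of that class."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in groups.items()
    }


def predict(centroids, x):
    """Assign x to the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))


# Hypothetical two-class training set with two features per observation.
X_train = [[1.0, 1.0], [1.0, 2.0], [8.0, 8.0], [9.0, 8.0]]
y_train = ["a", "a", "b", "b"]

centroids = fit_centroids(X_train, y_train)
label_near_a = predict(centroids, [2.0, 2.0])
label_near_b = predict(centroids, [7.0, 9.0])
```

Training reduces to one mean per class, as in a single K-Means update step, and prediction is a nearest-point lookup over those means instead of over all training observations as in k-NN.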