Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 14 of 14

Full-Text Articles in Physical Sciences and Mathematics

Better Pay Attention Whilst Fuzzing, Shunkai Zhu, Jingyi Wang, Jun Sun, Jie Yang, Xingwei Lin, Liyi Zhang, Peng Cheng Dec 2023

Research Collection School Of Computing and Information Systems

Fuzzing is one of the prevailing methods for vulnerability detection. However, even state-of-the-art fuzzing methods become ineffective after some period of time, i.e., the coverage hardly improves because existing methods are ineffective at focusing the attention of fuzzing on covering the hard-to-trigger program paths. In other words, they cannot generate inputs that can break the bottleneck due to the fundamental difficulty in capturing the complex relations between the test inputs and program coverage. In particular, existing fuzzers suffer from the following main limitations: 1) lacking an overall analysis of the program to identify the most “rewarding” seeds, and 2) lacking …


Understanding The Impact Of Trade Policy Effect Uncertainty On Firm-Level Innovation Investment: A Deep Learning Approach, Daniel Chang, Nan Hu, Peng Liang, Morgan Swink Dec 2023

Research Collection School Of Computing and Information Systems

Integrating the real options perspective and resource dependence theory, this study examines how firms adjust their innovation investments to trade policy effect uncertainty (TPEU), a less-studied type of firm-specific, perceived environmental uncertainty in which managers have difficulty predicting how potential policy changes will affect business operations. To develop a text-based, context-dependent, time-varying measure of firm-level perceived TPEU, we apply Bidirectional Encoder Representations from Transformers (BERT), a state-of-the-art deep learning approach. We apply BERT to analyze the texts of mandatory Management Discussion and Analysis (MD&A) sections of annual reports for a sample of 22,669 firm-year observations from 3,181 unique …


Distxplore: Distribution-Guided Testing For Evaluating And Enhancing Deep Learning Systems, Longtian Wang, Xiaofei Xie, Xiaoning Du, Meng Tian, Qing Guo, Zheng Yang, Chao Shen Dec 2023

Research Collection School Of Computing and Information Systems

Deep learning (DL) models are trained on sampled data, where the distribution of training data differs from that of real-world data (i.e., the distribution shift), which reduces the model's robustness. Various testing techniques have been proposed, including distribution-unaware and distribution-aware methods. However, distribution-unaware testing lacks effectiveness because it does not explicitly consider the distribution of test cases and may generate redundant errors (within the same distribution). Distribution-aware testing techniques primarily focus on generating test cases that follow the training distribution, missing out-of-distribution data that may also be valid and should be considered in the testing process. In this paper, we propose a novel …


Understanding The Impact Of Trade Policy Effect Uncertainty On Firm-Level Innovation Investment: A Deep Learning Approach, Daniel Chen, Nan Hu, Peng Liang, Morgan Swink Nov 2023

Research Collection School Of Computing and Information Systems

Integrating the real options perspective and resource dependence theory, this study examines how firms adjust their innovation investments to trade policy effect uncertainty (TPEU), a less-studied type of firm-specific, perceived environmental uncertainty in which managers have difficulty predicting how potential policy changes will affect business operations. To develop a text-based, context-dependent, time-varying measure of firm-level perceived TPEU, we apply Bidirectional Encoder Representations from Transformers (BERT), a state-of-the-art deep learning approach. We apply BERT to analyze the texts of mandatory Management Discussion and Analysis (MD&A) sections of annual reports for a sample of 22,669 firm-year observations from 3,181 unique …


On The Sustainability Of Deep Learning Projects: Maintainers' Perspective, Junxiao Han, Jiakun Liu, David Lo, Chen Zhi, Yishan Chen, Shuiguang Deng Nov 2023

Research Collection School Of Computing and Information Systems

Deep learning (DL) techniques have grown in leaps and bounds in both academia and industry over the past few years. Despite the growth of DL projects, there has been little study on how DL projects evolve, whether maintainers in this domain encounter a dramatic increase in workload, and whether existing maintainers can guarantee the sustained development of projects. To address this gap, we perform an empirical study to investigate the sustainability of DL projects, understand maintainers' workloads and workload growth in DL projects, and compare them with traditional open-source software (OSS) projects. In this regard, we first investigate …


Experimental Comparison Of Features, Analyses, And Classifiers For Android Malware Detection, Lwin Khin Shar, Biniam Fisseha Demissie, Mariano Ceccato, Naing Tun Yan, David Lo, Lingxiao Jiang, Christoph Bienert Sep 2023

Research Collection School Of Computing and Information Systems

Android malware detection has been an active area of research. In the past decade, several machine learning-based approaches based on different types of features that may characterize Android malware behaviors have been proposed. The commonly analyzed features include API usages and sequences at various abstraction levels (e.g., class and package), extracted using static or dynamic analysis. Additionally, features that characterize permission uses, native API calls, and reflection have also been analyzed. Initial works used conventional classifiers such as Random Forest to learn on those features. In recent years, deep learning-based classifiers such as Recurrent Neural Networks have been explored. Considering various …


Rosas: Deep Semi-Supervised Anomaly Detection With Contamination-Resilient Continuous Supervision, Hongzuo Xu, Yijie Wang, Guansong Pang, Songlei Jian, Ning Liu, Yongjun Wang Sep 2023

Research Collection School Of Computing and Information Systems

Semi-supervised anomaly detection methods leverage a few anomaly examples to yield drastically improved performance compared to unsupervised models. However, they still suffer from two limitations: 1) unlabeled anomalies (i.e., anomaly contamination) may mislead the learning process when all the unlabeled data are employed as inliers for model training; 2) only discrete supervision information (such as binary or ordinal data labels) is exploited, which leads to suboptimal learning of anomaly scores that essentially take on a continuous distribution. Therefore, this paper proposes a novel semi-supervised anomaly detection method, which devises contamination-resilient continuous supervisory signals. Specifically, we propose a mass interpolation method …


Arduinoprog: Towards Automating Arduino Programming, Imam Nur Bani Yusuf, Diyanah Binte Abdul Jamal, Lingxiao Jiang Sep 2023

Research Collection School Of Computing and Information Systems

Writing code for Arduino poses unique challenges. A developer 1) needs hardware-specific knowledge about the interface configuration between the Arduino controller and the I/O hardware, 2) identifies a suitable driver library for the I/O hardware, and 3) follows certain usage patterns of the driver library in order to use them properly. In this work, based on a study of real-world user queries posted in the Arduino forum, we propose ArduinoProg to address such challenges. ArduinoProg consists of three components, i.e., Library Retriever, Configuration Classifier, and Pattern Generator. Given a query, Library Retriever retrieves library names relevant to the I/O hardware identified …


Multi-Granularity Detector For Vulnerability Fixes, Truong Giang Nguyen, Cong, Thanh Le, Hong Jin Kang, Ratnadira Widyasari, Chengran Yang, Zhipeng Zhao, Bowen Xu, Jiayuan Zhou, Xin Xia, Ahmed E. Hassan, David Lo Aug 2023

Research Collection School Of Computing and Information Systems

With the increasing reliance on Open Source Software, users are exposed to third-party library vulnerabilities. Software Composition Analysis (SCA) tools have been created to alert users of such vulnerabilities. SCA requires the identification of vulnerability-fixing commits. Prior works have proposed methods that can automatically identify such vulnerability-fixing commits. However, identifying such commits is highly challenging, as only a very small minority of commits are vulnerability fixing. Moreover, code changes can be noisy and difficult to analyze. We observe that noise can occur at different levels of detail, making it challenging to detect vulnerability fixes accurately. To address these challenges and …


Learning Deep Time-Index Models For Time Series Forecasting, Jiale Gerald Woo, Chenghao Liu, Doyen Sahoo, Akshat Kumar, Steven Hoi Jul 2023

Research Collection School Of Computing and Information Systems

Deep learning has been actively applied to time series forecasting, leading to a deluge of new methods, belonging to the class of historical-value models. Yet, despite the attractive properties of time-index models, such as being able to model the continuous nature of underlying time series dynamics, little attention has been given to them. Indeed, while naive deep time-index models are far more expressive than the manually predefined function representations of classical time-index models, they are inadequate for forecasting, being unable to generalize to unseen time steps due to the lack of inductive bias. In this paper, we propose DeepTime, a …


Automating Arduino Programming: From Hardware Setups To Sample Source Code Generation, Imam Nur Bani Yusuf, Diyanah Binte Abdul Jamal, Lingxiao Jiang May 2023

Research Collection School Of Computing and Information Systems

An embedded system is a system consisting of software code, controller hardware, and I/O (Input/Output) hardware that performs a specific task. Developing an embedded system presents several challenges. First, the development often involves configuring hardware that requires domain-specific knowledge. Second, the library for the hardware may have API usage patterns that must be followed. To overcome such challenges, we propose a framework called ArduinoProg towards the automatic generation of Arduino applications. ArduinoProg takes a natural language query as input and outputs the configuration and API usage pattern for the hardware described in the query. Motivated by our findings on the …


Learning-Based Stock Trending Prediction By Incorporating Technical Indicators And Social Media Sentiment, Zhaoxia Wang, Zhenda Hu, Fang Li, Seng-Beng Ho, Erik Cambria Mar 2023

Research Collection School Of Computing and Information Systems

Stock trending prediction is a challenging task due to its dynamic and nonlinear characteristics. With the development of social platforms and artificial intelligence (AI), incorporating timely news and social media information into stock trending models becomes possible. However, most of the existing works focus on classification or regression problems when predicting stock market trending without fully considering the effects of different influence factors in different phases. To address this gap, this research solves the stock trending prediction problem utilizing both technical indicators and sentiments of social media text as influence factors in different situations. A 3-phase hybrid model is proposed …


Causal Interventional Training For Image Recognition, Wei Qin, Hanwang Zhang, Richang Hong, Ee-Peng Lim, Qianru Sun Jan 2023

Research Collection School Of Computing and Information Systems

Deep learning models often fit undesired dataset bias in training. In this paper, we formulate the bias using causal inference, which helps us uncover the ever-elusive causalities among the key factors in training, and thus pursue the desired causal effect without the bias. We start from revisiting the process of building a visual recognition system, and then propose a structural causal model (SCM) for the key variables involved in dataset collection and recognition model: object, common sense, bias, context, and label prediction. Based on the SCM, one can observe that there are “good” and “bad” biases. Intuitively, in the image …


Learning Large Neighborhood Search For Vehicle Routing In Airport Ground Handling, Jianan Zhou, Yaoxin Wu, Zhiguang Cao, Wen Song, Jie Zhang, Zhenghua Chen Jan 2023

Research Collection School Of Computing and Information Systems

Dispatching vehicle fleets to serve flights is a key task in airport ground handling (AGH). Due to the notable growth of flights, it is challenging to simultaneously schedule multiple types of operations (services) for a large number of flights, where each type of operation is performed by one specific vehicle fleet. To tackle this issue, we first represent the operation scheduling as a complex vehicle routing problem and formulate it as a mixed integer linear programming (MILP) model. Then given the graph representation of the MILP model, we propose a learning assisted large neighborhood search (LNS) method using data generated …