Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 10 of 10

Full-Text Articles in Computer Sciences

Better Pay Attention Whilst Fuzzing, Shunkai Zhu, Jingyi Wang, Jun Sun, Jie Yang, Xingwei Lin, Liyi Zhang, Peng Cheng Dec 2023

Better Pay Attention Whilst Fuzzing, Shunkai Zhu, Jingyi Wang, Jun Sun, Jie Yang, Xingwei Lin, Liyi Zhang, Peng Cheng

Research Collection School Of Computing and Information Systems

Fuzzing is one of the prevailing methods for vulnerability detection. However, even state-of-the-art fuzzing methods become ineffective after some period of time, i.e., the coverage hardly improves as existing methods are ineffective to focus the attention of fuzzing on covering the hard-to-trigger program paths. In other words, they cannot generate inputs that can break the bottleneck due to the fundamental difficulty in capturing the complex relations between the test inputs and program coverage. In particular, existing fuzzers suffer from the following main limitations: 1) lacking an overall analysis of the program to identify the most “rewarding” seeds, and 2) lacking …


On The Sustainability Of Deep Learning Projects: Maintainers' Perspective, Junxiao Han, Jiakun Liu, David Lo, Chen Zhi, Yishan Chen, Shuiguang Deng Nov 2023

On The Sustainability Of Deep Learning Projects: Maintainers' Perspective, Junxiao Han, Jiakun Liu, David Lo, Chen Zhi, Yishan Chen, Shuiguang Deng

Research Collection School Of Computing and Information Systems

Deep learning (DL) techniques have grown in leaps and bounds in both academia and industry over the past few years. Despite the growth of DL projects, there has been little study on how DL projects evolve, whether maintainers in this domain encounter a dramatic increase in workload and whether or not existing maintainers can guarantee the sustained development of projects. To address this gap, we perform an empirical study to investigate the sustainability of DL projects, understand maintainers' workloads and workloads growth in DL projects, and compare them with traditional open-source software (OSS) projects. In this regard, we first investigate …


Towards Safe Automated Refactoring Of Imperative Deep Learning Programs To Graph Execution, Raffi Takvor Khatchadourian Ph.D., Tatiana Castro Vélez, Mehdi Bagherzadeh, Nan Jia, Anita Raja Sep 2023

Towards Safe Automated Refactoring Of Imperative Deep Learning Programs To Graph Execution, Raffi Takvor Khatchadourian Ph.D., Tatiana Castro Vélez, Mehdi Bagherzadeh, Nan Jia, Anita Raja

Publications and Research

Efficiency is essential to support responsiveness w.r.t. ever-growing datasets, especially for Deep Learning (DL) systems. DL frameworks have traditionally embraced deferred execution-style DL code—supporting symbolic, graph-based Deep Neural Network (DNN) computation. While scalable, such development is error-prone, non-intuitive, and difficult to debug. Consequently, more natural, imperative DL frameworks encouraging eager execution have emerged at the expense of run-time performance. Though hybrid approaches aim for the “best of both worlds,” using them effectively requires subtle considerations to make code amenable to safe, accurate, and efficient graph execution. We present our ongoing work on automated refactoring that assists developers in specifying whether …


Experimental Comparison Of Features, Analyses, And Classifiers For Android Malware Detection, Lwin Khin Shar, Biniam Fisseha Demissie, Mariano Ceccato, Naing Tun Yan, David Lo, Lingxiao Jiang, Christoph Bienert Sep 2023

Experimental Comparison Of Features, Analyses, And Classifiers For Android Malware Detection, Lwin Khin Shar, Biniam Fisseha Demissie, Mariano Ceccato, Naing Tun Yan, David Lo, Lingxiao Jiang, Christoph Bienert

Research Collection School Of Computing and Information Systems

Android malware detection has been an active area of research. In the past decade, several machine learning-based approaches based on different types of features that may characterize Android malware behaviors have been proposed. The usually-analyzed features include API usages and sequences at various abstraction levels (e.g., class and package), extracted using static or dynamic analysis. Additionally, features that characterize permission uses, native API calls and reflection have also been analyzed. Initial works used conventional classifiers such as Random Forest to learn on those features. In recent years, deep learning-based classifiers such as Recurrent Neural Network have been explored. Considering various …


Arduinoprog: Towards Automating Arduino Programming, Imam Nur Bani Yusuf, Diyanah Binte Abdul Jamal, Lingxiao Jiang Sep 2023

Arduinoprog: Towards Automating Arduino Programming, Imam Nur Bani Yusuf, Diyanah Binte Abdul Jamal, Lingxiao Jiang

Research Collection School Of Computing and Information Systems

Writing code for Arduino poses unique challenges. A developer 1) needs hardware-specific knowledge about the interface configuration between the Arduino controller and the I/Ohardware, 2) identifies a suitable driver library for the I/O hardware, and 3) follows certain usage patterns of the driver library in order to use them properly. In this work, based on a study of real-world user queries posted in the Arduino forum, we propose ArduinoProg to address such challenges. ArduinoProg consists of three components, i.e., Library Retriever, Configuration Classifier, and Pattern Generator. Given a query, Library Retriever retrieves library names relevant to the I/O hardware identified …


Optimizing Collective Communication For Scalable Scientific Computing And Deep Learning, Jiali Li Aug 2023

Optimizing Collective Communication For Scalable Scientific Computing And Deep Learning, Jiali Li

Doctoral Dissertations

In the realm of distributed computing, collective operations involve coordinated communication and synchronization among multiple processing units, enabling efficient data exchange and collaboration. Scientific applications, such as simulations, computational fluid dynamics, and scalable deep learning, require complex computations that can be parallelized across multiple nodes in a distributed system. These applications often involve data-dependent communication patterns, where collective operations are critical for achieving high performance in data exchange. Optimizing collective operations for scientific applications and deep learning involves improving the algorithms, communication patterns, and data distribution strategies to minimize communication overhead and maximize computational efficiency.

Within the context of this …


Automating Arduino Programming: From Hardware Setups To Sample Source Code Generation, Imam Nur Bani Yusuf, Diyanah Binte Abdul Jamal, Lingxiao Jiang May 2023

Automating Arduino Programming: From Hardware Setups To Sample Source Code Generation, Imam Nur Bani Yusuf, Diyanah Binte Abdul Jamal, Lingxiao Jiang

Research Collection School Of Computing and Information Systems

An embedded system is a system consisting of software code, controller hardware, and I/O (Input/Output) hardware that performs a specific task. Developing an embedded system presents several challenges. First, the development often involves configuring hardware that requires domain-specific knowledge. Second, the library for the hardware may have API usage patterns that must be followed. To overcome such challenges, we propose a framework called ArduinoProg towards the automatic generation of Arduino applications. ArduinoProg takes a natural language query as input and outputs the configuration and API usage pattern for the hardware described in the query. Motivated by our findings on the …


Does Deep Learning Improve The Performance Of Duplicate Bug Report Detection? An Empirical Study, Yuan Jiang, Xiaohong Su, Christoph Treude, Chao Shang, Tiantian Wang Apr 2023

Does Deep Learning Improve The Performance Of Duplicate Bug Report Detection? An Empirical Study, Yuan Jiang, Xiaohong Su, Christoph Treude, Chao Shang, Tiantian Wang

Research Collection School Of Computing and Information Systems

Do Deep Learning (DL) techniques actually help to improve the performance of duplicate bug report detection? Prior studies suggest that they do, if the duplicate bug report detection task is treated as a binary classification problem. However, in realistic scenarios, the task is often viewed as a ranking problem, which predicts potential duplicate bug reports by ranking based on similarities with existing historical bug reports. There is little empirical evidence to support that DL can be effectively applied to detect duplicate bug reports in the ranking scenario. Therefore, in this paper, we investigate whether well-known DL-based methods outperform classic information …


An Empirical Study Of Pre-Trained Model Reuse In The Hugging Face Deep Learning Model Registry, Wenxin Jiang, Nicholas Synovic, Matt Hyatt, Taylor R. Schorlemmer, Rohan Sethi, Yung-Hsiang Lu, George K. Thiruvathukal, James C. Davis Jan 2023

An Empirical Study Of Pre-Trained Model Reuse In The Hugging Face Deep Learning Model Registry, Wenxin Jiang, Nicholas Synovic, Matt Hyatt, Taylor R. Schorlemmer, Rohan Sethi, Yung-Hsiang Lu, George K. Thiruvathukal, James C. Davis

Department of Electrical and Computer Engineering Faculty Publications

Deep Neural Networks (DNNs) are being adopted as components in software systems. Creating and specializing DNNs from scratch has grown increasingly difficult as state-of-the-art architectures grow more complex. Following the path of traditional software engineering, machine learning engineers have begun to reuse large-scale pre-trained models (PTMs) and fine-tune these models for downstream tasks. Prior works have studied reuse practices for traditional software packages to guide software engineers towards better package maintenance and dependency management. We lack a similar foundation of knowledge to guide behaviors in pre-trained model ecosystems.

In this work, we present the first empirical investigation of PTM reuse. …


Face Anti-Spoofing And Deep Learning Based Unsupervised Image Recognition Systems, Enoch Solomon Jan 2023

Face Anti-Spoofing And Deep Learning Based Unsupervised Image Recognition Systems, Enoch Solomon

Theses and Dissertations

One of the main problems of a supervised deep learning approach is that it requires large amounts of labeled training data, which are not always easily available. This PhD dissertation addresses the above-mentioned problem by using a novel unsupervised deep learning face verification system called UFace, that does not require labeled training data as it automatically, in an unsupervised way, generates training data from even a relatively small size of data. The method starts by selecting, in unsupervised way, k-most similar and k-most dissimilar images for a given face image. Moreover, this PhD dissertation proposes a new loss function to …