Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 29 of 29

Full-Text Articles in Physical Sciences and Mathematics

Identifying Regional Trends In Avatar Customization, Peter Mawhorter, Sercan Sengun, Haewoon Kwak, D. Fox Harrell Dec 2019

Research Collection School Of Computing and Information Systems

Since virtual identities such as social media profiles and avatars have become a common venue for self-expression, it has become important to consider the ways in which existing systems embed the values of their designers. To design virtual identity systems that reflect the needs and preferences of diverse users, it is important to understand how virtual identity construction differs between groups. This paper presents a new methodology that leverages deep learning and differential clustering for comparative analysis of profile images, with a case study of almost 100 000 avatars from a large online community using a popular avatar creation …


Harmony Search Algorithm For Time-Dependent Vehicle Routing Problem With Time Windows, Yun-Chia Liang, Vanny Minanda, Aldy Gunawan, Angela Hsiang-Ling Chen Dec 2019

Research Collection School Of Computing and Information Systems

The Vehicle Routing Problem (VRP) is a combinatorial problem in which a given set of nodes must be visited within a given amount of time and within the vehicle’s capacity. There are numerous variants of the VRP, such as the VRP with time windows, where each node has an opening and a closing time and must therefore be visited within that interval. Another variant takes time-dependent constraints into account. This variant fits real-world scenarios in which the speed on the road varies across time periods depending on traffic congestion. In this study, three objectives – total traveling time, total traveling distance, and …


Selective Discrete Particle Swarm Optimization For The Team Orienteering Problem With Time Windows And Partial Scores, Vincent F. Yu, Perwira A. A. N. Redi, Parida Jewpanya, Aldy Gunawan Dec 2019

Research Collection School Of Computing and Information Systems

This paper introduces the Team Orienteering Problem with Time Windows and Partial Scores (TOPTW-PS), which is an extension of the Team Orienteering Problem with Time Windows (TOPTW). In the context of the TOPTW-PS, each node is associated with a set of scores with respect to a set of attributes. The objective of TOPTW-PS is to find a set of routes that maximizes the total score collected from a subset of attributes when visiting the nodes subject to the time budget and the time window at each visited node. We develop a mathematical model and propose a discrete version of the Particle Swarm Optimization (PSO), …


Agile Earth Observation Satellite Scheduling: An Orienteering Problem With Time-Dependent Profits And Travel Times, Guansheng Peng, Reginald Dewil, Cédric Verbeeck, Aldy Gunawan, Lining Xing, Pieter Vansteenwegen Nov 2019

Research Collection School Of Computing and Information Systems

The scheduling problem of an Agile Earth Observation Satellite is to schedule a subset of weighted observation tasks, each with a specific “profit”, in order to maximize the total collected profit under the satellite’s operational constraints. The “time-dependent transition time” and the “time-dependent profit” are two crucial features of this problem. The former relates to the fact that each pair of consecutive tasks requires a transition time to maneuver the look angle of the camera from the previous task to the next task. The latter follows from the fact that a different look angle of an observation leads to a different …


Emotion-Aware Chat Machine: Automatic Emotional Response Generation For Human-Like Emotional Interaction, Wei Wei, Jiayi Liu, Xianling Mao, Guibing Guo, Feida Zhu, Pan Zhou, Yuchong Hu Nov 2019

Research Collection School Of Computing and Information Systems

The consistency of a response to a given post at both the semantic and the emotional level is essential for a dialogue system to deliver human-like interactions. However, this challenge is not well addressed in the literature, since most approaches neglect the emotional information conveyed by a post while generating responses. This article addresses this problem by proposing a unified end-to-end neural architecture, which is capable of simultaneously encoding the semantics and the emotions in a post for generating more intelligent responses with appropriately expressed emotions. Extensive experiments on real-world data demonstrate that the proposed method outperforms the state-of-the-art methods in terms …


Collaborative Online Ranking Algorithms For Multitask Learning, Guangxia Li, Peilin Zhao, Tao Mei, Peng Yang, Yulong Shen, Julian K. Y. Chang, Steven C. H. Hoi Oct 2019

Research Collection School Of Computing and Information Systems

There are many applications in which it is desirable to rank or order instances that belong to several different but related problems or tasks. Although unique, the individual ranking problem often shares characteristics with other problems in the group. Conventional ranking methods treat each task independently without considering the latent commonalities. In this paper, we study the problem of learning to rank instances that belong to multiple related tasks from the multitask learning perspective. We consider a case in which the information that is learned for a task can be used to enhance the learning of other tasks and propose …


Detecting Cyberattacks In Industrial Control Systems Using Online Learning Algorithms, Guangxia Li, Yulong Shen, Peilin Zhao, Xiao Lu, Jia Liu, Yangyang Liu, Steven C. H. Hoi Oct 2019

Research Collection School Of Computing and Information Systems

Industrial control systems are critical to the operation of industrial facilities, especially for critical infrastructures such as refineries, power grids, and transportation systems. As with other information systems, a significant threat to industrial control systems is the attack from cyberspace: the offensive maneuvers launched by "anonymous" in the digital world that target computer-based assets with the goal of compromising a system's functions or probing for information. Owing to the importance of industrial control systems, and the possibly devastating consequences of being attacked, significant efforts have been made to secure industrial control systems from cyberattacks. Among them are intrusion detection systems that …


Detecting Toxicity Triggers In Online Discussions, Hamad Bin Khalifa University, Haewoon Kwak Sep 2019

Research Collection School Of Computing and Information Systems

Despite the considerable interest in the detection of toxic comments, there has been little research investigating the causes (i.e., triggers) of toxicity. In this work, we first propose a formal definition of triggers of toxicity in online communities. We proceed to build an LSTM neural network model using textual features of comments, and then, based on a comprehensive review of previous literature, we incorporate topical and sentiment shift in interactions as features. Our model achieves an average accuracy of 82.5% in detecting toxicity triggers from diverse Reddit communities.


A Common Approach For Consumer And Provider Fairness In Recommendations, Dimitris Sacharidis, Kyriakos Mouratidis, Dimitrios Kleftogiannis Sep 2019

Research Collection School Of Computing and Information Systems

We present a common approach for handling consumer and provider fairness in recommendations. Our solution requires defining two key components, a classification of items and a target distribution, which together define the case of perfect fairness. This formulation allows distinct fairness concepts to be specified in a common framework. We further propose a novel reranking algorithm that optimizes for a desired trade-off between utility and fairness of a recommendation list.
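As a rough illustration of the idea in this abstract, the sketch below measures how far a recommendation list's mix of item classes lies from a target distribution and greedily reranks under a utility/fairness trade-off. It is not the authors' algorithm: the function names, the use of total variation distance as the unfairness measure, the greedy selection, and the toy data are all assumptions made for this example.

```python
from collections import Counter

def class_distribution(items, item_class, classes):
    """Share of each class among the listed items (0.0 for an empty list)."""
    counts = Counter(item_class[i] for i in items)
    n = len(items)
    return {c: (counts[c] / n if n else 0.0) for c in classes}

def unfairness(items, item_class, target):
    """Assumed measure: total variation distance between list and target."""
    dist = class_distribution(items, item_class, list(target))
    return 0.5 * sum(abs(dist[c] - target[c]) for c in target)

def greedy_rerank(scores, item_class, target, k, trade_off):
    """Pick k items one at a time; trade_off=1.0 is utility only, 0.0 is fairness only."""
    chosen = []
    remaining = set(scores)
    while remaining and len(chosen) < k:
        def value(i):
            penalty = unfairness(chosen + [i], item_class, target)
            return trade_off * scores[i] - (1 - trade_off) * penalty
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=value)
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Hypothetical toy data: three items of class X, one of class Y.
scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6}
item_class = {"a": "X", "b": "X", "c": "X", "d": "Y"}
target = {"X": 0.5, "Y": 0.5}  # "perfect fairness" = equal exposure per class

by_utility = greedy_rerank(scores, item_class, target, k=2, trade_off=1.0)
balanced = greedy_rerank(scores, item_class, target, k=2, trade_off=0.5)
```

With these toy inputs, the utility-only list contains two class-X items, while the balanced list swaps the second one for the class-Y item and matches the target exactly.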


Efficient Distributed Reachability Querying Of Massive Temporal Graphs, Tianming Zhang, Yunjun Gao, Chen Lu, Wei Guo, Shiliang Pu, Baihua Zheng, Christian S. Jensen Sep 2019

Research Collection School Of Computing and Information Systems

Reachability computation is a fundamental graph functionality with a wide range of applications. In spite of this, little work has as yet been done on efficient reachability queries over temporal graphs, which are used extensively to model time-varying networks, such as communication networks, social networks, and transportation schedule networks. Moreover, we are faced with increasingly large real-world temporal networks that may be distributed across multiple data centers. This state of affairs motivates the paper's study of efficient reachability queries on distributed temporal graphs. We propose an efficient index, called Temporal Vertex Labeling (TVL), which is a labeling scheme for distributed …


InSPeCT: Iterated Local Search For Solving Path Conditions, Fuxiang Chen, Aldy Gunawan, David Lo, Sunghun Kim Aug 2019

Research Collection School Of Computing and Information Systems

Automated test case generation is attractive as it can reduce developer workload. To generate test cases, many Symbolic Execution approaches first produce Path Conditions (PCs), a set of constraints, and pass them to a Satisfiability Modulo Theories (SMT) solver. Despite numerous prior studies, automated test case generation by Symbolic Execution is still slow, partly due to SMT solvers’ high computational complexity. We introduce InSPeCT, a Path Condition solver that leverages elements of ILS (Iterated Local Search) and Tabu List. ILS is not computationally intensive and focuses on generating solutions in search spaces while Tabu List prevents the use of previously …


Simulated Annealing For The Multi-Vehicle Cyclic Inventory Routing Problem, Aldy Gunawan, Vincent F. Yu, Audrey Tedja Widjaja, Pieter Vansteenwegen Aug 2019

Research Collection School Of Computing and Information Systems

This paper studies the Multi-Vehicle Cyclic Inventory Routing Problem (MV-CIRP) as an extension of the Single-Vehicle CIRP (SV-CIRP). The objective is to minimize both distribution and inventory costs at the customers and to maximize the collected rewards simultaneously. The problem is treated as a single-objective optimization problem. A subset of customers is selected for each vehicle, together with the quantity to be delivered to each customer. For each vehicle, a cyclic distribution plan is developed. We construct a mathematical programming model and propose a simulated annealing (SA) metaheuristic for solving both SV-CIRP and MV-CIRP. For SV-CIRP, experimental results on benchmark …


Redpc: A Residual Error-Based Density Peak Clustering Algorithm, Milan Parmar, Di Wang, Xiaofeng Zhang, Ah-Hwee Tan, Chunyan Miao, You Zhou Jul 2019

Research Collection School Of Computing and Information Systems

The density peak clustering (DPC) algorithm was designed to identify arbitrary-shaped clusters by finding density peaks in the underlying dataset. Owing to its relatively low computational complexity and small number of control parameters, DPC soon became widely adopted. However, because DPC takes the entire data space into consideration during the computation of local density, which is then used to generate a decision graph for the identification of cluster centroids, DPC may face difficulty in differentiating overlapping clusters and in dealing with low-density data points. In this paper, we propose a residual error-based density peak clustering …


A Review On Swarm Intelligence And Evolutionary Algorithms For Solving Flexible Job Shop Scheduling Problems, Kaizhou Gao, Zhiguang Cao, Le Zhang, Zhenghua Chen, Yuyan Han, Quanke Pan Jul 2019

Research Collection School Of Computing and Information Systems

Flexible job shop scheduling problems (FJSP) have received much attention from academia and industry for many years. Due to their exponential complexity, swarm intelligence (SI) and evolutionary algorithms (EA) are developed, employed and improved for solving them. More than 60% of the publications are related to SI and EA. This paper intends to give a comprehensive literature review of SI and EA for solving FJSP. First, the mathematical model of FJSP is presented and the constraints in applications are summarized. Then, the encoding and decoding strategies for connecting the problem and algorithms are reviewed. The strategies for initializing algorithms’ population …


Simulated Annealing For The Single-Vehicle Cyclic Inventory Routing Problem, Aldy Gunawan, Vincent F. Yu, Audrey T. Widjaja, Pieter Vansteenwegen Jul 2019

Research Collection School Of Computing and Information Systems

This paper studies the Single-Vehicle Cyclic Inventory Routing Problem (SV-CIRP) with the objective of simultaneously minimizing distribution and inventory costs for the customers and maximizing the collected rewards. A subset of customers is selected for the vehicle, including the quantity to be delivered to them. Simulated Annealing (SA) is proposed for solving the problem. Experimental results on 50 benchmark instances show that SA is comparable to the state-of-the-art algorithms. It is able to obtain 12 new best known solutions.
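For readers unfamiliar with simulated annealing, the sketch below shows the generic scheme on a toy cyclic-tour problem. It does not reproduce the paper's SV-CIRP model or its neighborhood moves; the cost function, the swap move, the geometric cooling schedule, and all parameter values are assumptions chosen only to keep the example small.

```python
import math
import random

def accept(delta, temperature, rng):
    """Metropolis rule: always take improvements, sometimes take worse moves."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)

def simulated_annealing(cost, neighbor, start, t_start=10.0, t_end=1e-3,
                        alpha=0.95, moves_per_temp=50, seed=0):
    """Minimize cost() from start using geometric cooling (t <- alpha * t)."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            candidate = neighbor(current, rng)
            candidate_cost = cost(candidate)
            if accept(candidate_cost - current_cost, t, rng):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= alpha
    return best, best_cost

# Hypothetical toy instance: customers on a line, visited in a cyclic tour.
coords = [0, 5, 1, 4, 2, 3]

def tour_cost(tour):
    """Length of the closed tour that visits customers in the given order."""
    return sum(abs(coords[tour[i]] - coords[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def swap_two(tour, rng):
    """Neighbor move: exchange the positions of two randomly chosen customers."""
    i, j = rng.sample(range(len(tour)), 2)
    new_tour = list(tour)
    new_tour[i], new_tour[j] = new_tour[j], new_tour[i]
    return new_tour

start = list(range(len(coords)))
best, best_cost = simulated_annealing(tour_cost, swap_two, start)
```

Accepting some worsening moves at high temperature is what lets the search leave local optima; as the temperature falls, the rule becomes effectively greedy.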


Correlated Learning For Aggregation Systems, Tanvi Verma, Pradeep Varakantham Jul 2019

Research Collection School Of Computing and Information Systems

Aggregation systems (e.g., Uber, Lyft, FoodPanda, Deliveroo) have been increasingly used to improve efficiency in numerous environments, including transportation, logistics, and food and grocery delivery. In these systems, a centralized entity (e.g., Uber) aggregates supply and assigns it to demand so as to optimize a central metric such as profit, number of requests, delay, etc. Because the metric being optimized is one of importance to the centralized entity, the interests of individuals (e.g., drivers, delivery boys) can be sacrificed. Therefore, in this paper, we focus on the problem of serving individual interests, i.e., learning revenue maximizing policies for individuals in the presence …


Volumetric Optimization Of Freight Cargo Loading: Case Study Of A Smu Forwarder, Tristan Lim, Michael Ser Chong Ping, Mark Goh, Shi Ying Jacelyn Tan Jul 2019

Research Collection School Of Computing and Information Systems

Purpose: Freight forwarders face a challenging environment of high market volatility and margin compression risks. Hence, strategic consideration is given to undertaking capacity management and transport asset ownership to achieve longer term cost leadership. Doing so will also help to address management issues, such as better control of potential transport disruptions, improved scheduling flexibility and efficiency, and service level enhancement. Design/methodology/approach: The case company currently has a truck resource that is unprofitable, and the firm’s schedulers are having difficulty optimizing the loading capacity. We apply Genetic Algorithm (GA) to undertake volumetric optimization of truck capacity and to build an easy-to-use platform to help …


Distributed Similarity Queries In Metric Spaces, Keyu Yang, Xin Ding, Yuanliang Zhang, Lu Chen, Baihua Zheng, Yunjun Gao Jun 2019

Research Collection School Of Computing and Information Systems

Similarity queries, including range queries and k nearest neighbor (kNN) queries, in metric spaces have applications in many areas such as multimedia retrieval, computational biology and location-based services. With the growing volumes of data, a distributed method is required. In this paper, we propose an Asynchronous Metric Distributed System (AMDS) to support efficient metric similarity queries in a distributed environment. AMDS uniformly partitions the data with the pivot-mapping technique to ensure load balancing, and employs a publish/subscribe communication model to asynchronously process queries at large scale. The use of an asynchronous processing model also improves the robustness and efficiency of AMDS. In …


The Challenges Of Creating Engaging Content: Results From A Focus Group Study Of A Popular News Media Organization, Kholoud Khalil Aldous, Jisun An, Bernard J. Jansen May 2019

Research Collection School Of Computing and Information Systems

The process of content creation for distribution via social media platforms is not a trivial one for social media editors, as the goal of creating both serious and engaging content is challenging, with guidelines and rules that are unclear or that differ across platforms. For creators of serious content, such as news organizations, advertisers, or educational institutions, engagement has a deeper meaning beyond likes, shares, etc. that is aimed at the audience actually processing the underlying content associated with a social media post. In this research, we report findings from a group study that aimed to understand the process and …


Socially-Enriched Multimedia Data Co-Clustering, Ah-Hwee Tan May 2019

Research Collection School Of Computing and Information Systems

Heterogeneous data co-clustering is a commonly used technique for tapping the rich meta-information of multimedia web documents, including category, annotation, and description, for associative discovery. However, most co-clustering methods proposed for heterogeneous data do not consider the representation problem of short and noisy text, and their performance is limited by the empirical weighting of the multimodal features. This chapter explains how to use the Generalized Heterogeneous Fusion Adaptive Resonance Theory (GHF-ART) for clustering large-scale web multimedia documents. Specifically, GHF-ART is designed to handle multimedia data with an arbitrarily rich level of meta-information. For handling …


Maximizing Multifaceted Network Influence, Yuchen Li, Ju Fan, George V. Ovchinnikov, Panagiotis Karras Apr 2019

Research Collection School Of Computing and Information Systems

An information dissemination campaign is often multifaceted, involving several facets or pieces of information disseminating from different sources. The question then arises, how should we assign such pieces to eligible sources so as to achieve the best viral dissemination results? Past research has studied the problem of Influence Maximization (IM), which is to select a set of k promoters that maximizes the expected reach of a message over a network. However, in this classical IM problem, each promoter spreads out the same unitary piece of information. In this paper, we propose the Optimal Influential Pieces Assignment (OIPA) problem, which is …


The Capacitated Team Orienteering Problem, Aldy Gunawan, Kien Ming Ng, Vincent F. Yu, Gordy Adiprasetyo, Hoong Chuin Lau Apr 2019

Research Collection School Of Computing and Information Systems

This paper focuses on a recent variant of the Orienteering Problem (OP), namely the Capacitated Team OP (CTOP), which arises in the logistics industry. In this problem, each node is associated with a demand that needs to be satisfied and a score that needs to be collected. Given a homogeneous fleet of vehicles, the objective is to find a path for each vehicle in order to maximize the total collected score, without violating the capacity and time budget. We propose an Iterated Local Search (ILS) algorithm for solving the CTOP. Two strategies, either accepting a new solution as …


Efficient Algorithms For Solving Aggregate Keyword Routing Problems, Qize Jiang, Weiwei Sun, Baihua Zheng, Kunjie Chen Apr 2019

Research Collection School Of Computing and Information Systems

With the emergence of smart phones and the popularity of GPS, the number of points of interest (POIs) is growing rapidly and spatial keyword search based on POIs has attracted significant attention. In this paper, we study a more sophisticated type of spatial keyword search that considers multiple query points and multiple query keywords, namely Aggregate Keyword Routing (AKR). AKR looks for an aggregate point m together with routes from each query point to m. The aggregate point has to satisfy the aggregate keywords, the routes from query points to the aggregate point have to pass POIs in order to …


Semantic And Influence Aware K-Representative Queries Over Social Streams, Yanhao Wang, Yuchen Li, Kianlee Tan Mar 2019

Research Collection School Of Computing and Information Systems

Massive volumes of data continuously generated on social platforms have become an important information source for users. A primary method to obtain fresh and valuable information from social streams is social search. Although there have been extensive studies on social search, existing methods only focus on the relevance of query results but ignore the representativeness. In this paper, we propose a novel Semantic and Influence aware k-Representative (k-SIR) query for social streams based on topic modeling. Specifically, we consider that both user queries and elements are represented as vectors in the topic space. A k-SIR query retrieves a set of …


Distributed Gibbs: A Linear-Space Sampling-Based DCOP Algorithm, Duc Thien Nguyen, William Yeoh, Hoong Chuin Lau, Roie Zivan Mar 2019

Research Collection School Of Computing and Information Systems

Researchers have used distributed constraint optimization problems (DCOPs) to model various multi-agent coordination and resource allocation problems. Very recently, Ottens et al. proposed a promising new approach to solve DCOPs that is based on confidence bounds via their Distributed UCT (DUCT) sampling-based algorithm. Unfortunately, its memory requirement per agent is exponential in the number of agents in the problem, which prohibits it from scaling up to large problems. Thus, in this article, we introduce two new sampling-based DCOP algorithms called Sequential Distributed Gibbs (SD-Gibbs) and Parallel Distributed Gibbs (PD-Gibbs). Both algorithms have memory requirements per agent that are linear in …


Send Hardest Problems My Way: Probabilistic Path Prioritization For Hybrid Fuzzing, Lei Zhao, Yue Duan, Jifeng Xuan Feb 2019

Research Collection School Of Computing and Information Systems

Hybrid fuzzing, which combines fuzzing and concolic execution, has become an advanced technique for software vulnerability detection. Based on the observation that fuzzing and concolic execution are complementary in nature, the state-of-the-art hybrid fuzzing systems deploy “demand launch” and “optimal switch” strategies. Although these ideas sound intriguing, we point out several fundamental limitations in them, due to oversimplified assumptions. We then propose a novel “discriminative dispatch” strategy to better utilize the capability of concolic execution. We design a novel Monte Carlo based probabilistic path prioritization model to quantify each path's difficulty and prioritize them for concolic execution. This model treats …


A Sampling Approach For Proactive Project Scheduling Under Generalized Time-Dependent Workability Uncertainty, Wen Song, Donghun Kang, Jie Zhang, Zhiguang Cao, Hui Xi Feb 2019

Research Collection School Of Computing and Information Systems

In real-world project scheduling applications, activity durations are often uncertain. Proactive scheduling can effectively cope with the duration uncertainties, by generating robust baseline solutions according to a priori stochastic knowledge. However, most of the existing proactive approaches assume that the duration uncertainty of an activity is not related to its scheduled start time, which may not hold in many real-world scenarios. In this paper, we relax this assumption by allowing the duration uncertainty to be time-dependent, which is caused by the uncertainty of whether the activity can be executed on each time slot. We propose a stochastic optimization model to …


Bilateral Dependency Neural Networks For Cross-Language Algorithm Classification, Duy Quoc Nghi Bui, Yijun Yu, Lingxiao Jiang Feb 2019

Research Collection School Of Computing and Information Systems

Algorithm classification is to automatically identify the classes of a program based on the algorithm(s) and/or data structure(s) implemented in the program. It can be useful for various tasks, such as code reuse, code theft detection, and malware detection. Code similarity metrics, on the basis of features extracted from syntax and semantics, have been used to classify programs. Such features, however, often need manual selection effort and are specific to individual programming languages, limiting the classifiers to programs in the same language. To recognise the similarities and differences among algorithms implemented in different languages, this paper describes a framework of Bilateral …


Large Scale Online Multiple Kernel Regression With Application To Time-Series Prediction, Doyen Sahoo, Steven C. H. Hoi, Bin Lin Jan 2019

Research Collection School Of Computing and Information Systems

Kernel-based regression represents an important family of learning techniques for solving challenging regression tasks with non-linear patterns. Despite being studied extensively, most of the existing work suffers from two major drawbacks: (i) the methods are often designed for solving regression tasks in a batch learning setting, making them not only computationally inefficient but also poorly scalable in real-world applications where data arrives sequentially; and (ii) they usually assume that a fixed kernel function is given prior to the learning task, which could result in poor performance if the chosen kernel is inappropriate. To overcome these drawbacks, this work …