Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 15 of 15

Full-Text Articles in Physical Sciences and Mathematics

Feature-Based Transfer Learning In Natural Language Processing, Jianfei Yu Dec 2018

Dissertations and Theses Collection (Open Access)

Over the past few decades, supervised machine learning has been one of the most important methodologies in the Natural Language Processing (NLP) community. Although various supervised learning methods have been proposed and achieve state-of-the-art performance across most NLP tasks, their bottleneck lies in a heavy reliance on large amounts of manually annotated data, which are not always available in the desired target domain/task. To alleviate the data sparsity issue in the target domain/task, an attractive solution is to find sufficient labeled data from a related source domain/task. However, for most NLP applications, due to …


Modeling Movement Decisions In Networks: A Discrete Choice Model Approach, Larry Lin Junjie Dec 2018

Dissertations and Theses Collection (Open Access)

In this dissertation, we address the modeling and simulation of agents and their movement decisions in a network environment. We emphasize the development of high-quality agent-based simulation models as a prerequisite for using such a model as an evaluation tool for various recommender systems and policies. To achieve this, we propose a methodological framework for developing agent-based models, combining approaches such as discrete choice models and data-driven modeling.

The discrete choice model is widely used in the field of transportation, with a distinct utility function (e.g., demand or revenue-driven). Through discrete choice models, the movement decision …


Early Prediction Of Merged Code Changes To Prioritize Reviewing Tasks, Yuanrui Fan, Xin Xia, David Lo, Shanping Li Dec 2018

Research Collection School Of Computing and Information Systems

Modern Code Review (MCR) has been widely used by open source and proprietary software projects. Inspecting code changes costs reviewers much time and effort, since they need to comprehend patches and are often assigned many code changes to review. Moreover, a code change might eventually be abandoned, which wastes that time and effort. Thus, a tool that predicts early on whether a code change will be merged can help developers prioritize changes to inspect, accomplish more given a tight schedule, and avoid wasting reviewing effort on low-quality changes. In this paper, motivated by the above …


Analyzing And Modeling Users In Multiple Online Social Platforms, Roy Lee Ka Wei Nov 2018

Dissertations and Theses Collection (Open Access)

This dissertation addresses the empirical analysis of user-generated data from multiple online social platforms (OSPs) and the modeling of latent user factors in a multiple-OSP setting.

In the first part of this dissertation, we conducted cross-platform empirical studies to better understand users' social and work activities in multiple OSPs. In particular, we proposed new methodologies to analyze users' friendship maintenance and collaborative activities in multiple OSPs. We also applied the proposed methodologies to real-world OSP datasets, and the findings from our empirical studies have provided us with a better understanding of users' social and work activities that were previously not uncovered …


Exploring Experiential Learning Model And Risk Management Process For An Undergraduate Software Architecture Course, Eng Lieh Ouh, Yunghans Irawan Oct 2018

Research Collection School Of Computing and Information Systems

This paper shares our insights on exploring the experiential learning model and risk management process to design an undergraduate software architecture course. The key challenge for undergraduate students to appreciate software architecture design is usually their limited experience in the software industry. In software architecture, the high-level design principles are heuristics lacking the absoluteness of first principles, which, for inexperienced undergraduate students, is a frustrating divergence from what they used to value. From an educator's perspective, teaching software architecture requires contending with the problem of how to express this level of abstraction practically and also make the learning realistic. In this paper, we propose a model adapting the concepts of experiential learning …


Teaching Adult Learners On Software Architecture Design Skills, Eng Lieh Ouh, Yunghans Irawan Oct 2018

Research Collection School Of Computing and Information Systems

Software architectures present high-level views of systems, enabling developers to abstract away the unnecessary details and focus on the overall big picture. Designing a software architecture is an essential skill in software engineering, and adult learners are seeking this skill to further progress in their career. With the technology revolution and advancements in this rapidly changing world, the proportion of adult learners attending courses for continuing education is increasing. Their learning objectives are no longer to obtain good grades but to gain the practical skills that enable them to perform better in their work and advance in their career. Teaching software architecture to upskill these adult learners requires contending with the problem of …


Efficient Attribute-Based Encryption With Blackbox Traceability, Shengmin Xu, Guomin Yang, Yi Mu, Ximeng Liu Oct 2018

Research Collection School Of Computing and Information Systems

A traitor tracing scheme can be used to identify a decryption key that is illegally used in public-key encryption. In CCS’13, Liu et al. proposed an attribute-based traitor tracing (ABTT) scheme with blackbox traceability, which can trace decryption keys embedded in a decryption blackbox/device rather than tracing a well-formed decryption key. However, the existing ABTT schemes with blackbox traceability are based on composite-order groups, and the size of the decryption key depends on the policies and the number of system users. In this paper, we revisit blackbox ABTT and introduce a new primitive called attribute-based set encryption (ABSE) based on key-policy …


Visforum: A Visual Analysis System For Exploring User Groups In Online Forums, Siwei Fu, Yong Wang, Yi Yang, Qingqing Bi, Fangzhou Guo, Huamin Qu Oct 2018

Research Collection School Of Computing and Information Systems

User grouping in asynchronous online forums is a common phenomenon nowadays. People with similar backgrounds or shared interests like to get together in group discussions. As tens of thousands of archived conversational posts accumulate, challenges emerge for forum administrators and analysts to effectively explore user groups in large-volume threads and gain meaningful insights into the hierarchical discussions. Identifying and comparing groups in discussion threads is nontrivial, since the number of users and posts increases with time and noise may hamper the detection of user groups. Researchers in data mining fields have proposed a large body of algorithms to explore user …


Code Smells For Model-View-Controller Architectures, Maurício Aniche, Gabriele Bavota, Christoph Treude, Marco Aurélio Gerosa, Arie Van Deursen Aug 2018

Research Collection School Of Computing and Information Systems

Previous studies have shown the negative effects that low-quality code can have on maintainability proxies, such as code change- and defect-proneness. One symptom of low-quality code is code smells, defined as sub-optimal implementation choices. While this definition is quite general and seems to suggest a wide spectrum of smells that can affect software systems, the research literature mostly focuses on the set of smells defined in the catalog by Fowler and Beck, reporting design issues that can potentially affect any kind of system, regardless of its architecture (e.g., Complex Class). However, systems adopting a specific architecture (e.g., the …


Customer Segmentation Using Online Platforms: Isolating Behavioral And Demographic Segments For Persona Creation Via Aggregated User Data, Jisun An, Haewoon Kwak, Soon-Gyo Jung, Joni Salminen, Bernard J. Jansen Aug 2018

Research Collection School Of Computing and Information Systems

We propose a novel approach for isolating customer segments using online customer data for products that are distributed via online social media platforms. We use non-negative matrix factorization to first identify behavioral customer segments and then to identify demographic customer segments. We employ a methodology for linking the two segments to present integrated and holistic customer segments, also known as personas. Behavioral segments are generated from customer interactions with online content. Demographic segments are generated using the gender, age, and location of these customers. In addition to evaluating our approach, we demonstrate its practicality via a system leveraging these customer …


Dimensionality's Blessing: Clustering Images By Underlying Distribution, Wen-Yan Lin, Jian-Huang Lai, Siying Liu, Yasuyuki Matsushita Jun 2018

Research Collection School Of Computing and Information Systems

Many high-dimensional vector distances tend to a constant. This is typically considered a negative “contrast-loss” phenomenon that hinders clustering and other machine learning techniques. We reinterpret “contrast-loss” as a blessing. Re-deriving “contrast-loss” using the law of large numbers, we show it results in a distribution’s instances concentrating on a thin “hyper-shell”. The hollow center means apparently chaotically overlapping distributions are actually intrinsically separable. We use this to develop distribution-clustering, an elegant algorithm for grouping data points by their (unknown) underlying distribution. Distribution-clustering creates notably clean clusters from raw unlabeled data, estimates the number of clusters for itself and …


Entagrec(++): An Enhanced Tag Recommendation System For Software Information Sites, Shawei Wang, David Lo, Bogdan Vasilescu, Alexander Serebrenik Apr 2018

Research Collection School Of Computing and Information Systems

Software engineers share experiences with modern technologies using software information sites, such as Stack Overflow. These sites allow developers to label posted content, referred to as software objects, with short descriptions, known as tags. Tags help to improve the organization of questions and simplify the browsing of questions for users. However, tags assigned to objects tend to be noisy and some objects are not well tagged. For instance, 14.7% of the questions that were posted in 2015 on Stack Overflow needed tag re-editing after the initial assignment. To improve the quality of tags in software information sites, we propose EnTagRec …


Fixation And Confusion: Investigating Eye-Tracking Participants' Exposure To Information In Personas, Joni Salminen, Bernard J. Jansen, Jisun An, Soon-Gyo Jung, Lene Nielsen, Haewoon Kwak Mar 2018

Research Collection School Of Computing and Information Systems

To more effectively convey relevant information to end users of persona profiles, we conducted a user study consisting of 29 participants engaging with three persona layout treatments. We were interested in the confusion that the treatments engendered in the participants, and conducted a within-subjects study in the actual work environment, using eye-tracking and talk-aloud data collection. We coded the verbal data into classes of informativeness and confusion and correlated it with fixations and durations on the Areas of Interest recorded by the eye-tracking device. We used various analysis techniques, including Mann-Whitney, regression, and Levenshtein distance, to investigate how confused users differed …


A New Revocable And Re-Delegable Proxy Signature And Its Application, Shengmin Xu, Guomin Yang, Yi Mu Mar 2018

Research Collection School Of Computing and Information Systems

With the popularity of cloud computing and mobile apps, on-demand services such as online music or audio streaming and vehicle booking are widely available nowadays. To allow efficient delivery and management of the services, large-scale on-demand systems usually have a hierarchy in which the service provider can delegate its service to a top-tier (e.g., countrywide) proxy, who can then further delegate the service to lower-level (e.g., region-wide) proxies. Secure (re-)delegation and revocation are among the most crucial factors for such systems. In this paper, we investigate the practical solutions for achieving re-delegation and revocation utilizing proxy …


Integrated Reward Scheme And Surge Pricing In A Ride Sourcing Market, Hai Yang, Chaoyi Shao, Hai Wang, Jieping Ye Jan 2018

Research Collection School Of Computing and Information Systems

Surge pricing is commonly used in on-demand ride-sourcing platforms (e.g., Uber, Lyft and Didi) to dynamically balance demand and supply. However, since the price for ride service cannot be unlimited, there is usually a reasonable or legitimate range of prices in practice. Such a constrained surge pricing strategy fails to balance demand and supply in certain cases; for example, even adopting the maximum allowed price cannot reduce the demand to an affordable level during peak hours. In addition, the practice of surge pricing is controversial and has stimulated a long debate regarding its pros and cons. In this paper, to address the …