Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Databases and Information Systems

2016

Articles 31 - 60 of 205

Full-Text Articles in Physical Sciences and Mathematics

Aspect-Based Helpfulness Prediction For Online Product Reviews, Yinfei Yang, Cen Chen, Forrest Sheng Bao Nov 2016

Research Collection School Of Computing and Information Systems

Product reviews greatly influence purchase decisions in online shopping. A common burden of online shopping is that consumers have to search for the right answers through massive reviews, especially on popular products. Hence, estimating and predicting the helpfulness of reviews become important tasks to directly improve shopping experience. In this paper, we propose a new approach to helpfulness prediction by leveraging aspect analysis of reviews. Our hypothesis is that a helpful review will cover many aspects of a product at different emphasis levels. The first step to tackle this problem is to extract proper aspects. Because related products share common …


Summarization Of Egocentric Videos: A Comprehensive Survey, Ana Garcia Del Molino, Cheston Tan, Joo-Hwee Lim, Ah-Hwee Tan Nov 2016

Research Collection School Of Computing and Information Systems

The introduction of wearable video cameras (e.g., GoPro) in the consumer market has promoted video life-logging, motivating users to generate large amounts of video data. This increasing flow of first-person video has led to a growing need for automatic video summarization adapted to the characteristics and applications of egocentric video. With this paper, we provide the first comprehensive survey of the techniques used specifically to summarize egocentric videos. We present a framework for first-person view summarization and compare the segmentation methods and selection algorithms used by the related work in the literature. Next, we describe the existing egocentric video datasets …


A Method Of Integrating Correlation Structures For A Generalized Recursive Route Choice Model, Tien Mai Nov 2016

Research Collection School Of Computing and Information Systems

We propose a way to estimate a generalized recursive route choice model. The model generalizes other existing recursive models in the literature, i.e., (Fosgerau et al., 2013b; Mai et al., 2015c), while being more flexible since it allows the choice at each stage to be any member of the network multivariate extreme value (network MEV) model (Daly and Bierlaire, 2006). The estimation of the generalized model requires defining a contraction mapping and performing contraction iterations to solve Bellman’s equation. Given the fact that the contraction mapping is defined based on the choice probability generating functions (CPGF) (Fosgerau et al., …


Github: An Introduction, Craig A. Boman Oct 2016

Roesch Library Staff Presentations

Tech startups have been using version control software to maximize their collaborative technology projects since their inception, but what more can librarians do to leverage this suite of tools? In this presentation, we will briefly describe how version control apps like Github may drastically improve technology collaborations in your library, specifically ILS web refreshes. After the Github introduction, those who participated in the pre-conference "hackathon" session will discuss their projects and talk about the successes and challenges they encountered.


Rediscovering Physical Collections Through The Digital Archive: The Jesuit Libraries Provenance Project, Kyle Roberts Oct 2016

History: Faculty Publications and Other Works

Historic library collections offer a rich and underexplored resource for teaching undergraduate and graduate students about new digital approaches, methodologies, and platforms. Their scope and scale can make them difficult to analyze in their physical form, but remediated onto a digital platform, they offer valuable insights into the process of archive creation and the importance of making their content available to audiences that cannot normally access it. The Jesuit Libraries Provenance Project (JLPP) was launched by students, faculty, and library professionals in 2014 to create an online archive of marks of ownership—bookplates, stamps, inscriptions—contained within books from the original library …


Active Snort Rules And The Needs For Computing Resources: Computing Resources Needed To Activate Different Numbers Of Snort Rules, Chad A. Arney, Xinli Wang Oct 2016

School of Technology Publications

This project was designed to discover the relationship between the number of enabled rules maintained by Snort and the amount of computing resources necessary to operate this intrusion detection system (IDS) as a sensor. A physical environment was set up to loosely simulate a network and an IDS sensor monitoring it.

The experiment was conducted in five trials. A different number of Snort rules was enabled in each trial and the corresponding utilization of computing resources was measured. Remarkable variation and a clear trend of CPU usage were observed in the experiment.


Attractiveness Versus Competition: Towards An Unified Model For User Visitation, Thanh-Nam Doan, Ee-Peng Lim Oct 2016

Research Collection School Of Computing and Information Systems

Modeling user check-in behavior provides useful insights about venues as well as the users visiting them. These insights can be used in urban planning and recommender system applications. Unlike previous works that focus on modeling distance effect on user’s choice of check-in venues, this paper studies check-in behaviors affected by two venue-related factors, namely, area attractiveness and neighborhood competitiveness. The former refers to the ability of an area with multiple venues to collectively attract check-ins from users, while the latter represents the ability of a venue to compete with its neighbors in the same area for check-ins. We first embark …


Tracking Virality And Susceptibility In Social Media, Tuan Anh Hoang, Ee-Peng Lim Oct 2016

Research Collection School Of Computing and Information Systems

In social media, the magnitude of information propagation hinges on the virality and susceptibility of users spreading and receiving the information respectively, as well as the virality of information items. These users' and items' behavioral factors evolve dynamically while interacting with one another. Previous works, however, measure the factors statically and independently in a restricted case: each user has only a single adoption on each item, and/or users' exposure to items is observable. In this work, we investigate the inter-relationship among the factors and users' multiple adoptions on items to propose both new static and temporal models …


Mabic: Mobile Application Builder For Interactive Communication, Huy Manh Nguyen Oct 2016

Masters Theses & Specialist Projects

Web services and mobile technology have advanced to a whole new level. These technologies make modern communication faster and more convenient than traditional methods. People can easily share data, pictures, images, and video instantly, which also saves time and money. For example, sending an email or text message is cheaper and faster than sending a letter. Interactive communication allows the instant exchange of feedback and enables two-way communication between people, or between people and computers. It increases the engagement of sender and receiver in communication.

Although many systems such as REDCap and Taverna are built for …


Inferring Links Between Concerns And Methods With Multi-Abstraction Vector Space Model, Yun Zhang, David Lo, Xin Xia, Tien-Duy B. Le, Giuseppe Scanniello, Jianling Sun Oct 2016

Research Collection School Of Computing and Information Systems

Concern localization refers to the process of locating code units that match a particular textual description. It takes as input textual documents such as bug reports and feature requests and outputs a list of candidate code units that are relevant to the bug reports or feature requests. Many information retrieval (IR) based concern localization techniques have been proposed in the literature. These techniques typically represent code units and textual descriptions as a bag of tokens at one level of abstraction, e.g., each token is a word, or each token is a topic. In this work, we propose a multi-abstraction concern …


Behavior Analysis In Social Networks: Challenges, Technologies, And Trends, Meng Wang, Ee-Peng Lim, Lei Li, Mehmet Orgun Oct 2016

Research Collection School Of Computing and Information Systems

Research on social networks has advanced significantly, which can be attributed to the prevalence of online social websites and instant messaging systems as well as the popularity of mobile apps that support easy access to online social networks. These social networks are usually characterized by complex network structures and rich contextual information. They have now become key platforms for, among others, content dissemination, professional networking, recommendation, alerting, and political campaigns. As online social network users perform activities on the social networks, they leave data traces of human behavior which allow the latter to be studied at scale. …


Arise-Pie: A People Information Integration Engine Over The Web, Vincent W. Zheng, Tao Hoang, Penghe Chen, Yuan Fang, Xiaoyan Yang Oct 2016

Research Collection School Of Computing and Information Systems

Searching for people information on the Web is a common practice in life. However, it is time-consuming to search for such information manually. In this paper, we aim to develop an automatic people information search system, named ARISE-PIE. To build such a system, we tackle two major technical challenges: data harvesting and data integration. For data harvesting, we study how to leverage a search engine to help crawl the relevant Web pages for a target entity; then we propose a novel learning-to-query model that can automatically select a set of "best" queries to maximize collective utility (e.g., precision …


Online Adaptive Passive-Aggressive Methods For Non-Negative Matrix Factorization And Its Applications, Chenghao Liu, Steven C. H. Hoi, Peilin Zhao, Jianling Sun, Ee-Peng Lim Oct 2016

Research Collection School Of Computing and Information Systems

This paper aims to investigate efficient and scalable machine learning algorithms for resolving Non-negative Matrix Factorization (NMF), which is important for many real-world applications, particularly for collaborative filtering and recommender systems. Unlike traditional batch learning methods, a recently proposed online learning technique named "NN-PA" tackles NMF by applying the popular Passive-Aggressive (PA) online learning, and found promising results. Despite its simplicity and high efficiency, NN-PA suffers from at least two critical limitations: (i) it only exploits the first-order information and thus may converge slowly especially at the beginning of online learning tasks; (ii) it is sensitive to some key …


Plackett-Luce Regression Mixture Model For Heterogeneous Rankings, Maksim Tkachenko, Hady W. Lauw Oct 2016

Research Collection School Of Computing and Information Systems

Learning to rank is an important problem in many scenarios, such as information retrieval, natural language processing, recommender systems, etc. The objective is to learn a function that ranks a number of instances based on their features. In the vast majority of the learning to rank literature, there is an implicit assumption that the population of ranking instances is homogeneous, and thus can be modeled by a single central ranking function. In this work, we are concerned with learning to rank for a heterogeneous population, which may consist of a number of sub-populations, each of which may rank objects differently. …


Fast And Adaptive Indexing Of Multi-Dimensional Observational Data, Sheng Wang, David Maier, Beng Chin Ooi Oct 2016

Computer Science Faculty Publications and Presentations

Sensing devices generate tremendous amounts of data each day, which include large quantities of multi-dimensional measurements. These data are expected to be immediately available for real-time analytics as they are streamed into storage. Such scenarios pose challenges to state-of-the-art indexing methods, as they must not only support efficient queries but also frequent updates. We propose here a novel indexing method that ingests multi-dimensional observational data in real time. This method primarily guarantees extremely high throughput for data ingestion, while it can be continuously refined in the background to improve query efficiency. Instead of representing collections of points using Minimal Bounding …


Deep-Based Ingredient Recognition For Cooking Recipe Retrieval, Jingjing Chen, Chong-Wah Ngo Oct 2016

Research Collection School Of Computing and Information Systems

Retrieving recipes corresponding to given dish pictures facilitates the estimation of nutrition facts, which is crucial to various health-relevant applications. The current approaches mostly focus on recognition of food category based on global dish appearance without explicit analysis of ingredient composition. Such approaches are incapable of retrieving recipes with unknown food categories, a problem referred to as zero-shot retrieval. On the other hand, content-based retrieval without knowledge of food categories also struggles to attain satisfactory performance due to large visual variations in food appearance and ingredient composition. As the number of ingredients is far less than food …


Data Visualizations And Infographics, Darren Sweeper Sep 2016

Sprague Library Scholarship and Creative Works

No abstract provided.


Get Me To My Gate On Time: Efficiently Solving General-Sum Bayesian Threat Screening Games, Aaron Schlenker, Matthew Brown, Arunesh Sinha, Milind Tambe, Ruta Mehta Sep 2016

Research Collection School Of Computing and Information Systems

Threat Screening Games (TSGs) are used in domains where there is a set of individuals or objects to screen with a limited amount of screening resources available to screen them. TSGs are broadly applicable to domains like airport passenger screening, stadium screening, cargo container screening, etc. Previous work on TSGs focused only on the Bayesian zero-sum case and provided the MGA algorithm to solve these games. In this paper, we solve Bayesian general-sum TSGs which we prove are NP-hard even when exploiting a compact marginal representation. We also present an algorithm based upon an adversary type hierarchical tree decomposition and …


Probabilistic Models For Contextual Agreement In Preferences, Loc Do, Hady W. Lauw Sep 2016

Research Collection School Of Computing and Information Systems

The long-tail theory for consumer demand implies the need for more accurate personalization technologies to target items to the users who most desire them. A key tenet of personalization is the capacity to model user preferences. Most of the previous work on recommendation and personalization has focused primarily on individual preferences. While some focus on shared preferences between pairs of users, they assume that the same similarity value applies to all items. Here we investigate the notion of "context," hypothesizing that while two users may agree on their preferences on some items, they may also disagree on other items. To …


Metaflow: A Scalable Metadata Lookup Service For Distributed File Systems In Data Centers, Peng Sun, Yonggang Wen, Nguyen Binh Duong Ta, Haiyong Xie Sep 2016

Research Collection School Of Computing and Information Systems

In large-scale distributed file systems, efficient metadata operations are critical since most file operations have to interact with metadata servers first. In existing distributed hash table (DHT) based metadata management systems, the lookup service could be a performance bottleneck due to its significant CPU overhead. Our investigations showed that the lookup service could reduce system throughput by up to 70%, and increase system latency by a factor of up to 8 compared to ideal scenarios. In this paper, we present MetaFlow, a scalable metadata lookup service utilizing software-defined networking (SDN) techniques to distribute lookup workload over network components. MetaFlow tackles …


Control Flow Integrity Enforcement With Dynamic Code Optimization, Yan Lin, Xiaoxiao Tang, Debin Gao, Jianming Fu Sep 2016

Research Collection School Of Computing and Information Systems

Control Flow Integrity (CFI) is an attractive security property with which most injected and code reuse attacks can be defeated, including advanced attacking techniques like Return-Oriented Programming (ROP). However, comprehensive enforcement of CFI is expensive due to additional supports needed (e.g., compiler support and presence of relocation or debug information) and performance overhead. Recent research has been trying to strike the balance among reasonable approximation of the CFI properties, minimal additional supports needed, and acceptable performance. We investigate existing dynamic code optimization techniques and find that they provide an architecture on which CFI can be enforced effectively and efficiently. In …


Soft Confidence-Weighted Learning, Jialei Wang, Peilin Zhao, Steven C. H. Hoi Sep 2016

Research Collection School Of Computing and Information Systems

Online learning plays an important role in many big data mining problems because of its high efficiency and scalability. In the literature, many online learning algorithms using gradient information have been applied to solve online classification problems. Recently, more effective second-order algorithms have been proposed, where the correlation between the features is utilized to improve the learning efficiency. Among them, Confidence-Weighted (CW) learning algorithms are very effective, which assume that the classification model is drawn from a Gaussian distribution, which enables the model to be effectively updated with the second-order information of the data stream. Despite being studied actively, these CW algorithms cannot handle nonseparable datasets and noisy datasets very …


Detecting Community Pacemakers Of Burst Topic In Twitter, Guozhong Dong, Wu Yang, Feida Zhu, Wei Wang Sep 2016

Research Collection School Of Computing and Information Systems

Twitter has become one of the largest social networks for users to broadcast burst topics. Influential users usually have a large number of followers and play an important role in the diffusion of burst topics. There have been many studies on how to detect influential users. However, traditional influential user detection approaches have largely ignored influential users in the user community. In this paper, we investigate the problem of detecting community pacemakers. Community pacemakers are defined as the influential users that promote early diffusion in the user community of a burst topic. To solve this problem, we present DCPBT, a framework that can …


Microblogging Content Propagation Modeling Using Topic-Specific Behavioral Factors, Tuan Anh Hoang, Ee-Peng Lim Sep 2016

Research Collection School Of Computing and Information Systems

When a microblogging user adopts some content propagated to her, we can attribute that to three behavioral factors, namely, topic virality, user virality, and user susceptibility. Topic virality measures the degree to which a topic attracts propagations by users. User virality and susceptibility refer to the ability of a user to propagate content to other users, and the propensity of a user adopting content propagated to her, respectively. In this paper, we study the problem of mining these behavioral factors specific to topics from microblogging content propagation data. We first construct a three dimensional tensor for representing the propagation instances. …


Is Only One Gps Position Sufficient To Locate You To The Road Network Accurately?, Hao Wu, Weiwei Sun, Baihua Zheng Sep 2016

Research Collection School Of Computing and Information Systems

Locating only one GPS position to a road segment accurately is crucial to many location-based services such as mobile taxi-hailing services, geo-tagging, POI check-in, etc. This problem is challenging because of errors including the GPS errors and the digital map errors (misalignment and the same representation of bidirectional roads) and a lack of context information. To the best of our knowledge, no existing work studies this problem directly, and the work to reduce GPS signal errors by considering the hardware aspect is the most relevant. Consequently, this work is the first attempt to solve the problem of locating one GPS position …


Human-Centred Design For Silver Assistants, Zhiwei Zheng, Di Wang, Ailiya Borjigin, Chunyan Miao, Ah-Hwee Tan, Cyril Leung Sep 2016

Research Collection School Of Computing and Information Systems

To alleviate the rapidly increasing need of the healthcare workforce to serve the enormous ageing population, leveraging intelligent and autonomous caring agents is one promising way. Working towards the design and development of dedicated personal silver assistants for older adults, we follow the human-centred design approach. Specifically, we identify a number of human factors that affect the user experience of the older adults and develop an agent named Mobile Intelligent Silver Assistant (MISA) by applying these human factors. Integrating multiple reusable services onto one platform, MISA acts as a single point of contact while simultaneously providing easy and convenient access …


Autoquery: Automatic Construction Of Dependency Queries For Code Search, Shaowei Wang, David Lo, Lingxiao Jiang Sep 2016

Research Collection School Of Computing and Information Systems

Many code search techniques have been proposed to return relevant code for a user query expressed as textual descriptions. However, source code is not mere text. It contains dependency relations among various program elements. To leverage these dependencies for more accurate code search results, techniques have been proposed to allow user queries to be expressed as control and data dependency relationships among program elements. Although such techniques have been shown to be effective for finding relevant code, it remains a question whether appropriate queries can be generated by average users. In this work, we address this concern by proposing a …


Modeling Sequential Preferences With Dynamic User And Context Factors, Duc Trong Le, Yuan Fang, Hady W. Lauw Sep 2016

Research Collection School Of Computing and Information Systems

Users express their preferences for items in diverse forms, through their liking for items, as well as through the sequence in which they consume items. The latter, referred to as “sequential preference”, manifests itself in scenarios such as song or video playlists, topics one reads or writes about in social media, etc. The current approach to modeling sequential preferences relies primarily on the sequence information, i.e., which item follows another item. However, there are other important factors, due to either the user or the context, which may dynamically affect the way a sequence unfolds. In this work, we develop generative …


Efficient Community Maintenance For Dynamic Social Networks, Hongchao Qin, Ye Yuan, Feida Zhu, Guoren Wang Sep 2016

Research Collection School Of Computing and Information Systems

Community detection plays an important role in a wide range of research topics for social networks including personalized recommendation services and information dissemination. The highly dynamic nature of social platforms, and accordingly the constant updates to the underlying network, present a serious challenge for efficient maintenance of the identified communities. The question is how to avoid recomputing the whole community detection result from scratch in the face of every update, which more often than not constitutes only a small change. To solve this problem, we propose a novel and efficient algorithm to maintain the communities in dynamic social networks by identifying and updating only those …


Extracting Food Substitutes From Food Diary Via Distributional Similarity, Palakorn Achananuparp, Ingmar Weber Sep 2016

Research Collection School Of Computing and Information Systems

In this paper, we explore the problem of identifying substitute relationships between food pairs from real-world food consumption data as the first step towards healthier food recommendation. Our method is inspired by the distributional hypothesis in linguistics. Specifically, we assume that foods that are consumed in similar contexts are more likely to be similar dietarily. For example, a turkey sandwich can be considered a suitable substitute for a chicken sandwich if both tend to be consumed with french fries and salad. To evaluate our method, we constructed a real-world food consumption dataset from MyFitnessPal's public food diary entries and …