Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Theory and Algorithms

2014

Articles 1 - 29 of 29

Full-Text Articles in Databases and Information Systems

Recommender Systems Research: A Connection-Centric Survey, Saverio Perugini, Marcos André Gonçalves, Edward A. Fox Dec 2014

Saverio Perugini

Recommender systems attempt to reduce information overload and retain customers by selecting a subset of items from a universal set based on user preferences. While research in recommender systems grew out of information retrieval and filtering, the topic has steadily advanced into a legitimate and challenging research area of its own. Recommender systems have traditionally been studied from a content-based filtering vs. collaborative design perspective. Recommendations, however, are not delivered within a vacuum, but rather cast within an informal community of users and social context. Therefore, ultimately all recommender systems make connections among people and thus should be surveyed from …


Symbolic Links In The Open Directory Project, Saverio Perugini Dec 2014

Saverio Perugini

We present a study to develop an improved understanding of symbolic links in web directories. A symbolic link is a hyperlink that makes a directed connection from a web page along one path through a directory to a page along another path. While symbolic links are ubiquitous in web directories such as Yahoo!, they are under-studied, and as a result, their uses are poorly understood. A cursory analysis of symbolic links reveals multiple uses: to provide navigational shortcuts deeper into a directory, backlinks to more general categories, and multiclassification. We investigated these uses in the Open Directory Project (ODP), the …


Interacting With Web Hierarchies, Saverio Perugini, Naren Ramakrishnan Dec 2014

Saverio Perugini

Web site interfaces are a particularly good fit for hierarchies in the broadest sense of that idea, i.e., a classification with multiple attributes, not necessarily a tree structure. Several adaptive interface designs are emerging that support flexible navigation orders, exposing and exploring dependencies, and procedural information-seeking tasks. This paper provides a context and vocabulary for thinking about hierarchical Web sites and their design. The paper identifies three features of interfaces to information hierarchies: flexible navigation orders, the ability to expose and explore dependencies, and support for procedural tasks. A few examples of these features are also provided.


The Partial Evaluation Approach To Information Personalization, Naren Ramakrishnan, Saverio Perugini Dec 2014

Saverio Perugini

Information personalization refers to the automatic adjustment of information content, structure, and presentation tailored to an individual user. By reducing information overload and customizing information access, personalization systems have emerged as an important segment of the Internet economy. This paper presents a systematic modeling methodology, PIPE (‘Personalization is Partial Evaluation’), for personalization. Personalization systems are designed and implemented in PIPE by modeling an information-seeking interaction in a programmatic representation. The representation supports the description of information-seeking activities as partial information and their subsequent realization by partial evaluation, a technique for specializing programs. We describe the modeling methodology at a …
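As a rough illustration of the idea in this abstract, the sketch below models a small hierarchical site as nested choices and specializes it with whatever information the user has already supplied. The tree encoding, the `specialize` function, and the sample `site` are hypothetical names chosen for this example; they are assumptions and not the representation used by PIPE itself.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical encoding (an assumption for illustration, not PIPE's notation):
#   ("leaf", page)               a terminal page
#   ("ask", variable, branches)  a choice on `variable`; `branches` maps value -> subtree
Tree = Tuple[Any, ...]


def specialize(tree: Tree, known: Dict[str, str]) -> Optional[Tree]:
    """Partially evaluate `tree` with respect to the values in `known`.

    Choices whose variable is already known are resolved immediately;
    all other choices are kept, with their subtrees specialized in turn.
    Returns None when no page is reachable under `known`.
    """
    if tree[0] == "leaf":
        return tree
    _, variable, branches = tree
    if variable in known:
        # The choice can be decided now, so only the matching branch survives.
        chosen = branches.get(known[variable])
        return specialize(chosen, known) if chosen is not None else None
    # The choice stays open; prune dead ends beneath it.
    kept = {}
    for value, subtree in branches.items():
        residual = specialize(subtree, known)
        if residual is not None:
            kept[value] = residual
    return ("ask", variable, kept) if kept else None


# Hypothetical two-level site: the first level asks for a topic, the second for a year.
site = ("ask", "topic", {
    "databases": ("ask", "year", {
        "2013": ("leaf", "db-2013"),
        "2014": ("leaf", "db-2014"),
    }),
    "theory": ("ask", "year", {
        "2014": ("leaf", "theory-2014"),
    }),
})

# The user supplies the year first, although the site asks for the topic first.
personalized = specialize(site, {"year": "2014"})
```

Here `personalized` keeps only the topic choice, with each topic leading directly to its 2014 page, which is one way to read "partial information" being realized by specializing a program.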


The Staging Transformation Approach To Mixing Initiative, Robert Capra, Michael Narayan, Saverio Perugini, Naren Ramakrishnan, Manuel A. Pérez-Quiñones Dec 2014

Saverio Perugini

Mixed-initiative interaction is an important facet of many conversational interfaces, flexible planning architectures, intelligent tutoring systems, and interactive information retrieval systems. Software systems for mixed-initiative interaction must enable us to both operationalize the mixing of initiative (i.e., support the creation of practical dialogs) and to reason in real-time about how a flexible mode of interaction can be supported (e.g., from a meta-dialog standpoint). In this paper, we present the staging transformation approach to mixing initiative, where a dialog script captures the structure of the dialog and dialog control processes are realized through generous use of program transformation techniques (e.g., partial …


Exploring Out-Of-Turn Interactions With Websites, Saverio Perugini, Naren Ramakrishnan, Manuel A. Pérez-Quiñones, Mary E. Pinney, Mary Beth Rosson Dec 2014

Saverio Perugini

Hierarchies are ubiquitous on the web for structuring online catalogs and indexing multidimensional attributed data sets. They are a natural metaphor for information seeking if their levelwise structure mirrors the user's conception of the underlying domain. In other cases, they can be frustrating, especially if multiple drill‐downs are necessary to arrive at information of interest. To support a broad range of users, site designers often expose multiple faceted classifications or provide within‐page pruning mechanisms. We present a new technique, called out-of-turn interaction, that increases the richness of user interaction at hierarchical sites, without enumerating all possible completion paths in the …


Personalizing The GAMS Cross-Index, Saverio Perugini, Priya Lakshminarayanan, Naren Ramakrishnan Dec 2014

Saverio Perugini

The NIST Guide to Available Mathematical Software (GAMS) system at http://gams.nist.gov serves as the gateway to thousands of scientific codes and modules for numerical computation. We describe the PIPE personalization facility for GAMS, whereby content from the cross-index is specialized for a user desiring software recommendations for a specific problem instance. The key idea is to (i) mine structure, and (ii) exploit it in a programmatic manner to generate personalized web pages. Our approach supports both content-based and collaborative personalization and enables information integration from multiple (and complementary) web resources. We present case studies for the domain of linear, …


Information Assurance Through Binary Vulnerability Auditing, William B. Kimball, Saverio Perugini Dec 2014

Saverio Perugini

The goal of this research is to develop improved methods of discovering vulnerabilities in software. A large volume of software, from the most frequently used programs on a desktop computer, such as web browsers, e-mail programs, and word processing applications, to mission-critical services for the space shuttle, is unintentionally vulnerable to attacks and thus insecure. By seeking to improve the identification of vulnerabilities in software, the security community can save the time and money necessary to restore compromised computer systems. In addition, this research is imperative to activities of national security such as counterterrorism. The current approach involves a systematic …


Personalizing Interactions With Information Systems, Saverio Perugini, Naren Ramakrishnan Dec 2014

Saverio Perugini

Personalization constitutes the mechanisms and technologies necessary to customize information access to the end-user. It can be defined as the automatic adjustment of information content, structure, and presentation tailored to the individual. In this chapter, we study personalization from the viewpoint of personalizing interaction. The survey covers mechanisms for information-finding on the web, advanced information retrieval systems, dialog-based applications, and mobile access paradigms. Specific emphasis is placed on studying how users interact with an information system and how the system can encourage and foster interaction. This helps bring out the role of the personalization system as a facilitator which reconciles …


Supporting Multiple Paths To Objects In Information Hierarchies: Faceted Classification, Faceted Search, And Symbolic Links, Saverio Perugini Dec 2014

Saverio Perugini

We present three fundamental, interrelated approaches to support multiple access paths to each terminal object in information hierarchies: faceted classification, faceted search, and web directories with embedded symbolic links. This survey aims to demonstrate how each approach supports users who seek information from multiple perspectives. We achieve this by exploring each approach, the relationships between these approaches, including tradeoffs, and how they can be used in concert, while focusing on a core set of hypermedia elements common to all. This approach provides a foundation from which to study, understand, and synthesize applications which employ these techniques. This survey does not …


Program Transformations For Information Personalization, Saverio Perugini Dec 2014

Saverio Perugini

Personalization constitutes the mechanisms and technologies necessary to customize information access to the end-user. It can be defined as the automatic adjustment of information content, structure, and presentation. The central thesis of this dissertation is that modeling interaction explicitly in a representation, and studying how partial information can be harnessed in it by program transformations to direct the flow of the interaction, can provide insight into, reveal opportunities for, and define a model for personalized interaction. To evaluate this thesis, a formal modeling methodology is developed for personalizing interactions with information systems, especially hierarchical hypermedia, based on program transformations. The …


Realtime Query Expansion And Procedural Interfaces For Information Hierarchies, Saverio Perugini Dec 2014

Saverio Perugini

We demonstrate the use of two user interfaces for interacting with web hierarchies. One uses the dependencies underlying a hierarchy to perform real-time query expansion and, in this way, acts as an in situ feedback mechanism. The other enables the user to cascade the output from one interaction to the input of another, and so on, and, in this way, supports procedural information-seeking tasks without disrupting the flow of interaction.


Personalization By Website Transformation: Theory And Practice, Saverio Perugini Dec 2014

Saverio Perugini

We present an analysis of a progressive series of out-of-turn transformations on a hierarchical website to personalize a user’s interaction with the site. We formalize the transformation in graph-theoretic terms and describe a toolkit we built that enumerates all of the traversals enabled by every possible complete series of these transformations in any site and computes a variety of metrics while simulating each traversal therein to qualify the relationship between a site’s structure and the cumulative effect of support for the transformation in a site. We employed this toolkit in two websites. The results indicate that the transformation enables users …


Staging Transformations For Multimodal Web Interaction Management, Michael Narayan, Christopher Williams, Saverio Perugini, Naren Ramakrishnan Dec 2014

Saverio Perugini

Multimodal interfaces are becoming increasingly ubiquitous with the advent of mobile devices, accessibility considerations, and novel software technologies that combine diverse interaction media. In addition to improving access and delivery capabilities, such interfaces enable flexible and personalized dialogs with websites, much like a conversation between humans. In this paper, we present a software framework for multimodal web interaction management that supports mixed-initiative dialogs between users and websites. A mixed-initiative dialog is one where the user and the website take turns changing the flow of interaction. The framework supports the functional specification and realization of such dialogs using staging transformations – …


Privacy-Preserving Sanitization In Data Sharing, Wentian Lu Nov 2014

Doctoral Dissertations

In the era of big data, the prospect of analyzing, monitoring and investigating all sources of data starts to stand out in every aspect of our life. The benefit of such practices becomes concrete only when analysts or investigators have the information shared from data owners. However, privacy is one of the main barriers that disrupt the sharing behavior, due to the fear of disclosing sensitive information. This dissertation describes data sanitization methods that disguise the sensitive information before sharing a dataset; our criteria are always to protect privacy while preserving utility as much as possible. In particular, we provide …


Band Selection For Hyperspectral Images Using Probabilistic Memetic Algorithm, Liang Feng, Ah-Hwee Tan, Meng-Hiot Lim, Si Wei Jiang Nov 2014

Research Collection School Of Computing and Information Systems

Band selection plays an important role in identifying the most useful and valuable information contained in the hyperspectral images for further data analysis such as classification, clustering, etc. Memetic algorithm (MA), among other metaheuristic search methods, has been shown to achieve competitive performances in solving the NP-hard band selection problem. In this paper, we propose a formal probabilistic memetic algorithm for band selection, which is able to adaptively control the degree of global exploration against local exploitation as the search progresses. To verify the effectiveness of the proposed probabilistic mechanism, empirical studies conducted on five well-known hyperspectral images against two …


Networked Employment Discrimination, Tamara Kneese Oct 2014

Media Studies

Employers often struggle to assess qualified applicants, particularly in contexts where they receive hundreds of applications for job openings. In an effort to increase efficiency and improve the process, many have begun employing new tools to sift through these applications, looking for signals that a candidate is “the best fit.” Some companies use tools that offer algorithmic assessments of workforce data to identify the variables that lead to stronger employee performance, or to high employee attrition rates, while others turn to third party ranking services to identify the top applicants in a labor pool. Still others eschew automated systems, but …


Ultimate Codes: Near-Optimal MDS Array Codes For RAID-6, Zhijie Huang, Hong Jiang, Chong Wang, Ke Zhou, Yuhong Zhao Jul 2014

CSE Technical Reports

As modern storage systems have grown in size and complexity, RAID-6 is poised to replace RAID-5 as the dominant form of RAID architectures due to its ability to protect against double disk failures. Many excellent erasure codes specially designed for RAID-6 have emerged in recent years. However, all of them have limitations. In this paper, we present a class of near perfect erasure codes for RAID-6, called the Ultimate codes. These codes encode, update and decode either optimally or nearly optimally, regardless of what the code length is. This implies that utilizing these codes we can build highly efficient and …


Structure Preserving Large Imagery Reconstruction, Ju Shen, Jianjun Yang, Sami Taha Abu Sneineh, Bryson Payne, Markus Hitz Jul 2014

Computer Science Faculty Publications

With the explosive growth of web-based cameras and mobile devices, billions of photographs are uploaded to the internet. We can trivially collect a huge number of photo streams for various goals, such as image clustering, 3D scene reconstruction, and other big data applications. However, such tasks are not easy because the retrieved photos can have large variations in their view perspectives, resolutions, lighting, noise, and distortions. Furthermore, with the occlusion of unexpected objects such as people and vehicles, it is even more challenging to find feature correspondences and reconstruct realistic scenes. In this paper, we propose a structure-based image …


Contextual Anomaly Detection In Big Sensor Data, Michael Hayes, Miriam A M Capretz Jun 2014

Electrical and Computer Engineering Publications

Performing predictive modelling, such as anomaly detection, in Big Data is a difficult task. This problem is compounded as more and more sources of Big Data are generated from environmental sensors, logging applications, and the Internet of Things. Further, most current techniques for anomaly detection only consider the content of the data source, i.e., the data itself, without concern for the context of the data. As data becomes more complex, it is increasingly important to bias anomaly detection techniques for the context, whether it is spatial, temporal, or semantic. The work proposed in this paper outlines a contextual anomaly detection …


Global Immutable Region Computation, Jilian Zhang, Kyriakos Mouratidis, Hwee Hwa Pang Jun 2014

Research Collection School Of Computing and Information Systems

A top-k query shortlists the k records in a dataset that best match the user's preferences. To indicate her preferences, the user typically determines a numeric weight for each data dimension (i.e., attribute). We refer to these weights collectively as the query vector. Based on this vector, each data record is implicitly mapped to a score value (via a weighted sum function). The records with the k largest scores are reported as the result. In this paper we propose an auxiliary feature to standard top-k query processing. Specifically, we compute the maximal locus within which the query vector incurs no …
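The scoring scheme this abstract describes can be sketched in a few lines. The example below is a minimal illustration of standard weighted-sum top-k retrieval only; it does not attempt the paper's region computation, and the `top_k` function, the sample `records`, and the `query_vector` values are assumptions made up for this example.

```python
import heapq
from typing import List, Sequence, Tuple

Record = Tuple[float, ...]


def top_k(records: Sequence[Record], weights: Sequence[float], k: int) -> List[Record]:
    """Return the k records with the largest weighted-sum scores, best first."""
    def score(record: Record) -> float:
        # Each record is mapped to a single score: sum of weight * attribute value.
        return sum(w * x for w, x in zip(weights, record))

    return heapq.nlargest(k, records, key=score)


# Hypothetical two-attribute dataset and query vector.
records = [(0.9, 0.2), (0.4, 0.8), (0.5, 0.5), (0.1, 0.3)]
query_vector = (0.7, 0.3)

result = top_k(records, query_vector, k=2)
```

With these made-up values the scores are roughly 0.69, 0.52, 0.50, and 0.16, so `result` holds the first two records. Changing the weights in `query_vector` changes the scores and can change which records are reported.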


Automatic Objects Removal For Scene Completion, Jianjun Yang, Yin Wang, Honggang Wang, Kun Hua, Wei Wang, Ju Shen Apr 2014

Computer Science Faculty Publications

With the explosive growth of Web-based cameras and mobile devices, billions of photographs are uploaded to the Internet. We can trivially collect a huge number of photo streams for various goals, such as 3D scene reconstruction and other big data applications. However, this is not an easy task because the retrieved photos are neither aligned nor calibrated. Furthermore, with the occlusion of unexpected foreground objects such as people and vehicles, it is even more challenging to find feature correspondences and reconstruct realistic scenes. In this paper, we propose a structure-based image completion algorithm for object removal that produces visually …


Data Supply Chains, Tamara Kneese Mar 2014

Media Studies

As data moves between actors and organizations, what emerges is a data supply chain. Unlike manufacturing supply chains, transferred data is often duplicated in the process, challenging the essence of ownership. What does ethical data labor look like? How are the various stakeholders held accountable for being good data guardians? What does clean data transfer look like? What kinds of best practices can business and government put into place? What upstream rights do data providers have over downstream commercialization of their data?


Predicting Human Behavior, Tamara Kneese Mar 2014

Media Studies

Countless highly accurate predictions can be made from trace data, with varying degrees of personal or societal consequence (e.g., search engines predict hospital admission, gaming companies can predict compulsive gambling problems, government agencies predict criminal activity). Predicting human behavior can be both hugely beneficial and deeply problematic depending on the context. What kinds of predictive privacy harms are emerging? And what are the implications for systems of oversight and due process protections? For example, what are the implications for employment, health care and policing when predictive models are involved? How should varied organizations address what they can predict?


L-Opacity: Linkage-Aware Graph Anonymization, Sadegh Nobari, Panagiotis Karras, Hwee Hwa Pang, Stephane Bressan Mar 2014

Research Collection School Of Computing and Information Systems

The wealth of information contained in online social networks has created a demand for the publication of such data as graphs. Yet, publication, even after identities have been removed, poses a privacy threat. Past research has suggested ways to publish graph data in a way that prevents the re-identification of nodes. However, even when identities are effectively hidden, an adversary may still be able to infer linkage between individuals with sufficiently high confidence. In this paper, we focus on the privacy threat arising from such link disclosure. We suggest L-opacity, a sufficiently strong privacy model that aims to control an …


L-Opacity: Linkage-Aware Graph Anonymization, Sadegh Nobari, Panagiotis Karras, Hwee Hwa Pang, Stephane Bressan Feb 2014

Sadegh Nobari

The wealth of information contained in online social networks has created a demand for the publication of such data as graphs. Yet, publication, even after identities have been removed, poses a privacy threat. Past research has suggested ways to publish graph data in a way that prevents the re-identification of nodes. However, even when identities are effectively hidden, an adversary may still be able to infer linkage between individuals with sufficiently high confidence. In this paper, we focus on the privacy threat arising from such link disclosure. We suggest L-opacity, a sufficiently strong privacy model that aims to control an …


Unstructured P2P Link Lifetimes Redux, Zhongmei Yao, Daren B. H. Cline Feb 2014

Computer Science Faculty Publications

We revisit link lifetimes in random P2P graphs under dynamic node failure and create a unifying stochastic model that generalizes the majority of previous efforts in this direction. We not only allow nonexponential user lifetimes and age-dependent neighbor selection, but also cover both active and passive neighbor-management strategies, model the lifetimes of incoming and outgoing links, derive churn-related message volume of the system, and obtain the distribution of transient in/out degree at each user. We then discuss the impact of design parameters on overhead and resilience of the network.


Digital Certificate Management: Optimal Pricing And CRL Releasing Strategies, Jie Zhang, Nan Hu, M. K. Raka Feb 2014

Research Collection School Of Computing and Information Systems

The fast growth of e-commerce and online activities places increasing needs for authentication and secure communication to enable information exchange and online transactions. The public key infrastructure (PKI) provides a promising foundation for meeting such demand, in which certificate authorities (CAs) provide digital certificates. In practice, it is critical to understand consumer purchasing and revocation behaviors so that CAs can better manage the digital certificates and their CRL releasing process. To address this problem, we analytically model a CA's pricing and revocation releasing strategies taking into consideration the users' rational decisions. The model provides solutions to two main research questions: (1) …


A Hybrid Approach To Music Recommendation: Exploiting Collaborative Music Tags And Acoustic Features, Jaime C. Kaufman Jan 2014

UNF Graduate Theses and Dissertations

Recommendation systems make it easier for an individual to navigate through large datasets by recommending information relevant to the user. Companies such as Facebook, LinkedIn, Twitter, Netflix, Amazon, Pandora, and others utilize these types of systems in order to increase revenue by providing personalized recommendations. Recommendation systems generally use one of two techniques: collaborative filtering (i.e., collective intelligence) and content-based filtering.

Systems using collaborative filtering recommend items based on a community of users, their preferences, and their browsing or shopping behavior. Examples include Netflix, Amazon shopping, and Last.fm. This approach has been proven effective due to increased popularity, and …