Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Articles 1 - 30 of 45

Full-Text Articles in Databases and Information Systems

State Preserving Extreme Learning Machine For Face Recognition, Md. Zahangir Alom, Paheding Sidike, Vijayan K. Asari, Tarek M. Taha Oct 2016

Vijayan K. Asari

Extreme Learning Machine (ELM) has been introduced as a new algorithm for training single hidden layer feed-forward neural networks (SLFNs) instead of the classical gradient-based algorithms. Based on the consistency property of data, which enforces similar samples to share similar properties, ELM is a biologically inspired learning algorithm with SLFNs that learns much faster with good generalization and performs well in classification applications. However, the random generation of the weight matrix in current ELM-based techniques leads to the possibility of unstable outputs in the learning and testing phases. Therefore, we present a novel approach for computing the weight matrix …


User Interface Design, Moritz Stefaner, Sebastien Ferre, Saverio Perugini, Jonathan Koren, Yi Zhang Apr 2016

Saverio Perugini

As detailed in Chap. 1, system implementations for dynamic taxonomies and faceted search allow a wide range of query possibilities on the data. Only when these are made accessible through appropriate user interfaces can the resulting applications support a variety of search, browsing and analysis tasks. User interface design in this area is confronted with specific challenges. This chapter presents an overview of both established and novel principles and solutions.


Program Transformations For Information Personalization, Saverio Perugini, Naren Ramakrishnan Apr 2016

Saverio Perugini

Personalization constitutes the mechanisms necessary to automatically customize information content, structure, and presentation to the end user to reduce information overload. Unlike traditional approaches to personalization, the central theme of our approach is to model a website as a program and conduct website transformation for personalization by program transformation (e.g., partial evaluation, program slicing). The goal of this paper is to study personalization through a program transformation lens and develop a formal model, based on program transformations, for personalized interaction with hierarchical hypermedia. The specific research issues addressed involve identifying and developing program representations and transformations suitable for classes of hierarchical …


A Tool For Staging Mixed-Initiative Dialogs, Joshua W. Buck, Saverio Perugini Apr 2016

Saverio Perugini

We discuss and demonstrate a tool for prototyping dialog-based systems that, given a high-level specification of a human-computer dialog, stages the dialog for interactive use. The tool enables a dialog designer to evaluate a variety of dialogs without having to program each individual dialog, and serves as a proof-of-concept for our approach to mixed-initiative dialog modeling and implementation from a programming language-based perspective.


An Immersive Telepresence System Using RGB-D Sensors And Head-Mounted Display, Xinzhong Lu, Ju Shen, Saverio Perugini, Jianjun Yang Jan 2016

Saverio Perugini

We present a tele-immersive system that enables people to interact with each other in a virtual world using body gestures in addition to verbal communication. Beyond the obvious applications, including general online conversations and gaming, we hypothesize that our proposed system would be particularly beneficial to education by offering rich visual contents and interactivity. One distinct feature is the integration of egocentric pose recognition that allows participants to use their gestures to demonstrate and manipulate virtual objects simultaneously. This functionality enables the instructor to effectively and efficiently explain and illustrate complex concepts or sophisticated problems in an intuitive manner. The …


Metalogic Notes, Saverio Perugini Jun 2015

Saverio Perugini

A collection of notes, formulas, theorems, postulates and terminology in symbolic logic, syntactic notions, semantic notions, linkages between syntax and semantics, soundness and completeness, quantified logic, first-order theories, Goedel's First Incompleteness Theorem and more.


Statistics Notes, Saverio Perugini Jun 2015

Saverio Perugini

A collection of terms, definitions, formulas and explanations about statistics.


A Behavior-Reactive Autonomous System To Identify Pokémon Characters, Xu Cao, Bohan Zhang, Jeremy Straub, Eunjin Kim Apr 2015

Jeremy Straub

Pokémon is an entertainment franchise with a large fan base. This project uses well-known Pokémon characters to demonstrate the operations of a question selection system. Presented in the form of a game where the computer attempts to guess the user-selected character, the system attempts to minimize the number of questions required for this purpose by identifying questions that most constrain the decision space. The decision making process is refined based on actual user behavior.
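
The idea of choosing the question that most constrains the decision space can be illustrated with a small sketch. This is not the authors' system: the trait sets, the function names `best_question` and `narrow`, and the "most even split" heuristic are all assumptions made for illustration, and the behavior-based refinement mentioned in the abstract is not modeled.

```python
def best_question(candidates, questions):
    """Pick the yes/no question whose answer splits the candidates most evenly.

    An even split removes about half of the candidates whatever the answer,
    which is one simple way to 'most constrain' the decision space.
    """
    def imbalance(question):
        yes = sum(1 for traits in candidates.values() if question in traits)
        return abs(2 * yes - len(candidates))
    return min(questions, key=imbalance)


def narrow(candidates, question, answer):
    """Keep only the candidates consistent with the user's answer."""
    return {name: traits for name, traits in candidates.items()
            if (question in traits) == answer}


# Hypothetical toy data: each character maps to the set of traits it has.
characters = {
    "Pikachu": {"electric", "yellow"},
    "Charmander": {"fire", "orange"},
    "Squirtle": {"water", "blue"},
    "Raichu": {"electric", "orange"},
}
questions = ["electric", "fire", "water", "yellow", "orange", "blue"]

first = best_question(characters, questions)
remaining = narrow(characters, first, True)
```

Repeating `best_question` on `remaining` after each answer continues the game until one candidate is left.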


Theory Identity: A Machine-Learning Approach, Kai Larsen, Dirk Hovorka, Jevin West, James Birt, James Pfaff, Trevor Chambers, Zebula Sampedro, Nick Zager, Bruce Vanstone Mar 2015

Bruce Vanstone

Theory identity is a fundamental problem for researchers seeking to determine theory quality, create theory ontologies and taxonomies, or perform focused theory-specific reviews and meta-analyses. We demonstrate a novel machine-learning approach to theory identification based on citation data and article features. The multi-disciplinary ecosystem of articles which cite a theory's originating paper is created and refined into the network of papers predicted to contribute to, and thus identify, a specific theory. We provide a 'proof-of-concept' for a highly cited theory. Implications for cross-disciplinary theory integration and the identification of theories for a rapidly expanding scientific literature are discussed.


Residual-Based Measurement Of Peer And Link Lifetimes In Gnutella Networks, Xiaoming Wang, Zhongmei Yao, Dmitri Loguinov Jan 2015

Zhongmei Yao

Existing methods of measuring lifetimes in P2P systems usually rely on the so-called create-based method (CBM), which divides a given observation window into two halves and samples users "created" in the first half every Delta time units until they die or the observation period ends. Despite its frequent use, this approach has no rigorous accuracy or overhead analysis in the literature. To shed more light on its performance, we first derive a model for CBM and show that small window size or large Delta may lead to highly inaccurate lifetime distributions. We then show that create-based sampling exhibits an inherent …


On Node Isolation Under Churn In Unstructured P2P Networks With Heavy-Tailed Lifetimes, Zhongmei Yao, Xiaoming Wang, Dmitri Loguinov Jan 2015

Zhongmei Yao

Previous analytical studies [12], [18] of unstructured P2P resilience have assumed exponential user lifetimes and only considered age-independent neighbor replacement. In this paper, we overcome these limitations by introducing a general node-isolation model for heavy-tailed user lifetimes and arbitrary neighbor-selection algorithms. Using this model, we analyze two age-biased neighbor-selection strategies and show that they significantly improve the residual lifetimes of chosen users, which dramatically reduces the probability of user isolation and graph partitioning compared to uniform selection of neighbors. In fact, the second strategy based on random walks on age-weighted graphs demonstrates that for lifetimes with infinite variance, the system …


Modeling Heterogeneous User Churn And Local Resilience Of Unstructured P2P Networks, Zhongmei Yao, Derek Leonard, Dmitri Loguinov, Xiaoming Wang Jan 2015

Zhongmei Yao

Previous analytical results on the resilience of unstructured P2P systems have not explicitly modeled heterogeneity of user churn (i.e., difference in online behavior) or the impact of in-degree on system resilience. To overcome these limitations, we introduce a generic model of heterogeneous user churn, derive the distribution of the various metrics observed in prior experimental studies (e.g., lifetime distribution of joining users, joint distribution of session time of alive peers, and residual lifetime of a randomly selected user), derive several closed-form results on the transient behavior of in-degree, and eventually obtain the joint in/out degree isolation probability as a simple …


Link Lifetimes And Randomized Neighbor Selection In DHTs, Zhongmei Yao, Dmitri Loguinov Jan 2015

Zhongmei Yao

Several models of user churn, resilience, and link lifetime have recently appeared in the literature [12], [13], [34], [35]; however, these results do not directly apply to classical Distributed Hash Tables (DHTs) in which neighbor replacement occurs not only when current users die, but also when new users arrive into the system, and where replacement choices are often restricted to the successor of the failed zone in the DHT space. To understand neighbor churn in such networks, this paper proposes a simple, yet accurate, model for capturing link dynamics in structured P2P systems and obtains the distribution of link lifetimes …


Residual-Based Estimation Of Peer And Link Lifetimes In P2P Networks, Xiaoming Wang, Zhongmei Yao, Dmitri Loguinov Jan 2015

Zhongmei Yao

Existing methods of measuring lifetimes in P2P systems usually rely on the so-called Create-Based Method (CBM), which divides a given observation window into two halves and samples users "created" in the first half every Delta time units until they die or the observation period ends. Despite its frequent use, this approach has no rigorous accuracy or overhead analysis in the literature. To shed more light on its performance, we first derive a model for CBM and show that small window size or large Delta may lead to highly inaccurate lifetime distributions. We then show that create-based sampling exhibits an inherent tradeoff …


Robust Lifetime Measurement In Large-Scale P2P Systems With Non-Stationary Arrivals, Xiaoming Wang, Zhongmei Yao, Yueping Zhang, Dmitri Loguinov Jan 2015

Zhongmei Yao

Characterizing user churn has become an important topic in studying P2P networks, both in theoretical analysis and system design. Recent work has shown that direct sampling of user lifetimes may lead to certain bias (arising from missed peers and round-off inconsistencies) and proposed a technique that estimates lifetimes based on sampled residuals. In this paper, however, we show that under non-stationary arrivals, which are often present in real systems, residual-based sampling does not correctly reconstruct user lifetimes and suffers a varying degree of bias, which in some cases makes estimation completely impossible. We overcome this problem using two contributions: a …


Stochastic Analysis Of Horizontal IP Scanning, Derek Leonard, Zhongmei Yao, Xiaoming Wang, Dmitri Loguinov Jan 2015

Zhongmei Yao

Intrusion Detection Systems (IDS) have become ubiquitous in the defense against virus outbreaks, malicious exploits of OS vulnerabilities, and botnet proliferation. As attackers frequently rely on host scanning for reconnaissance leading to penetration, IDS is often tasked with detecting scans and preventing them. However, it is currently unknown how likely an IDS is to detect a given Internet-wide scan pattern and whether there exist sufficiently fast scan techniques that can remain virtually undetectable at large scale. To address these questions, we propose a simple analytical model for the window-expiration rules of popular IDS tools (i.e., Snort and Bro) and utilize a …


In-Degree Dynamics Of Large-Scale P2P Systems, Zhongmei Yao, Daren B. H. Cline, Dmitri Loguinov Jan 2015

Zhongmei Yao

This paper builds a complete modeling framework for understanding user churn and in-degree dynamics in unstructured P2P systems in which each user can be viewed as a stationary alternating renewal process. While the classical Poisson result on the superposition of n stationary renewal processes for n→∞ requires that each point process become sparser as n increases, it is often difficult to rigorously show this condition in practice. In this paper, we first prove that despite user heterogeneity and non-Poisson arrival dynamics, a superposition of edge-arrival processes to a live user under uniform selection converges to a Poisson process when …


Automatically Discovering The Number Of Clusters In Web Page Datasets, Zhongmei Yao Jan 2015

Zhongmei Yao

Clustering is well-suited for Web mining by automatically organizing Web pages into categories, each of which contains Web pages having similar contents. However, one problem in clustering is the lack of general methods to automatically determine the number of categories or clusters. For the Web domain in particular, currently there is no such method suitable for Web page clustering. In an attempt to address this problem, we discover a constant factor that characterizes the Web domain, based on which we propose a new method for automatically determining the number of clusters in Web page data sets. We discover that the …


Node Isolation Model And Age-Based Neighbor Selection In Unstructured P2P Networks, Zhongmei Yao, Derek Leonard, Dmitri Loguinov Jan 2015

Zhongmei Yao

Previous analytical studies of unstructured P2P resilience have assumed exponential user lifetimes and only considered age-independent neighbor replacement. In this paper, we overcome these limitations by introducing a general node-isolation model for heavy-tailed user lifetimes and arbitrary neighbor-selection algorithms. Using this model, we analyze two age-biased neighbor-selection strategies and show that they significantly improve the residual lifetimes of chosen users, which dramatically reduces the probability of user isolation and graph partitioning compared with uniform selection of neighbors. In fact, the second strategy based on random walks on age-proportional graphs demonstrates that, for lifetimes with infinite variance, the system monotonically increases …


CEPSim: A Simulator For Cloud-Based Complex Event Processing, Wilson Higashino, Miriam Capretz, Luiz Bittencourt Dec 2014

Wilson A Higashino

As one of the Vs defining Big Data, data velocity brings many new challenges to traditional data processing approaches. The adoption of cloud environments in complex event processing (CEP) systems is a recent architectural style that aims to overcome these challenges. Validating cloud-based CEP systems at the required Big Data scale, however, is often a laborious, error-prone, and expensive task. This article presents CEPSim, a new simulator that has been developed to facilitate this validation process. CEPSim extends CloudSim, an existing cloud simulator, with an application model based on directed acyclic graphs that is used to represent continuous CEP queries. …


Recommender Systems Research: A Connection-Centric Survey, Saverio Perugini, Marcos André Gonçalves, Edward A. Fox Dec 2014

Saverio Perugini

Recommender systems attempt to reduce information overload and retain customers by selecting a subset of items from a universal set based on user preferences. While research in recommender systems grew out of information retrieval and filtering, the topic has steadily advanced into a legitimate and challenging research area of its own. Recommender systems have traditionally been studied from a content-based filtering vs. collaborative design perspective. Recommendations, however, are not delivered within a vacuum, but rather cast within an informal community of users and social context. Therefore, ultimately all recommender systems make connections among people and thus should be surveyed from …


Symbolic Links In The Open Directory Project, Saverio Perugini Dec 2014

Saverio Perugini

We present a study to develop an improved understanding of symbolic links in web directories. A symbolic link is a hyperlink that makes a directed connection from a web page along one path through a directory to a page along another path. While symbolic links are ubiquitous in web directories such as Yahoo!, they are under-studied, and as a result, their uses are poorly understood. A cursory analysis of symbolic links reveals multiple uses: to provide navigational shortcuts deeper into a directory, backlinks to more general categories, and multiclassification. We investigated these uses in the Open Directory Project (ODP), the …


Interacting With Web Hierarchies, Saverio Perugini, Naren Ramakrishnan Dec 2014

Saverio Perugini

Web site interfaces are a particularly good fit for hierarchies in the broadest sense of that idea, i.e., a classification with multiple attributes, not necessarily a tree structure. Several adaptive interface designs are emerging that support flexible navigation orders, exposing and exploring dependencies, and procedural information-seeking tasks. This paper provides a context and vocabulary for thinking about hierarchical Web sites and their design. The paper identifies three features of interfaces to information hierarchies: flexible navigation orders, the ability to expose and explore dependencies, and support for procedural tasks. A few examples of these features are also provided.


The Partial Evaluation Approach To Information Personalization, Naren Ramakrishnan, Saverio Perugini Dec 2014

Saverio Perugini

Information personalization refers to the automatic adjustment of information content, structure, and presentation tailored to an individual user. By reducing information overload and customizing information access, personalization systems have emerged as an important segment of the Internet economy. This paper presents a systematic modeling methodology, PIPE (‘Personalization is Partial Evaluation’), for personalization. Personalization systems are designed and implemented in PIPE by modeling an information-seeking interaction in a programmatic representation. The representation supports the description of information-seeking activities as partial information and their subsequent realization by partial evaluation, a technique for specializing programs. We describe the modeling methodology at a …
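
As a rough illustration of specializing an interaction with partial information, the sketch below prunes a small site hierarchy once some attribute values are known. It is a data-level analogue only, not the PIPE implementation: the tuple/dict representation, the `specialize` function, and the sample site are hypothetical.

```python
def specialize(tree, known):
    """Prune a hierarchy using the attribute values supplied so far.

    An internal node is a pair (attribute, {value: subtree}); a leaf is a
    list of page names. Levels whose attribute is already known are removed,
    much as partial evaluation removes tests on known inputs.
    """
    if not isinstance(tree, tuple):
        return tree  # leaf: nothing left to specialize
    attribute, branches = tree
    if attribute in known:
        # The choice at this level is determined, so keep only that branch.
        return specialize(branches.get(known[attribute], []), known)
    return (attribute,
            {value: specialize(sub, known) for value, sub in branches.items()})


# Hypothetical two-level site: choose a party first, then a state.
site = ("party", {
    "democrat": ("state", {"ohio": ["page1"], "texas": ["page2"]}),
    "republican": ("state", {"ohio": ["page3"], "texas": ["page4"]}),
})

# The user supplies the state before being asked for it.
pruned = specialize(site, {"state": "ohio"})
```

Here `pruned` keeps the party level but drops the state level, so the remaining interaction asks only for what is still unknown.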


The Staging Transformation Approach To Mixing Initiative, Robert Capra, Michael Narayan, Saverio Perugini, Naren Ramakrishnan, Manuel A. Pérez-Quiñones Dec 2014

Saverio Perugini

Mixed-initiative interaction is an important facet of many conversational interfaces, flexible planning architectures, intelligent tutoring systems, and interactive information retrieval systems. Software systems for mixed-initiative interaction must enable us to both operationalize the mixing of initiative (i.e., support the creation of practical dialogs) and to reason in real-time about how a flexible mode of interaction can be supported (e.g., from a meta-dialog standpoint). In this paper, we present the staging transformation approach to mixing initiative, where a dialog script captures the structure of the dialog and dialog control processes are realized through generous use of program transformation techniques (e.g., partial …


Exploring Out-Of-Turn Interactions With Websites, Saverio Perugini, Naren Ramakrishnan, Manuel A. Pérez-Quiñones, Mary E. Pinney, Mary Beth Rosson Dec 2014

Saverio Perugini

Hierarchies are ubiquitous on the web for structuring online catalogs and indexing multidimensional attributed data sets. They are a natural metaphor for information seeking if their levelwise structure mirrors the user's conception of the underlying domain. In other cases, they can be frustrating, especially if multiple drill‐downs are necessary to arrive at information of interest. To support a broad range of users, site designers often expose multiple faceted classifications or provide within‐page pruning mechanisms. We present a new technique, called out-of-turn interaction, that increases the richness of user interaction at hierarchical sites, without enumerating all possible completion paths in the …


Personalizing The GAMS Cross-Index, Saverio Perugini, Priya Lakshminarayanan, Naren Ramakrishnan Dec 2014

Saverio Perugini

The NIST Guide to Available Mathematical Software (GAMS) system at http://gams.nist.gov serves as the gateway to thousands of scientific codes and modules for numerical computation. We describe the PIPE personalization facility for GAMS, whereby content from the cross-index is specialized for a user desiring software recommendations for a specific problem instance. The key idea is to (i) mine structure, and (ii) exploit it in a programmatic manner to generate personalized web pages. Our approach supports both content-based and collaborative personalization and enables information integration from multiple (and complementary) web resources. We present case studies for the domain of linear, …


Information Assurance Through Binary Vulnerability Auditing, William B. Kimball, Saverio Perugini Dec 2014

Saverio Perugini

The goal of this research is to develop improved methods of discovering vulnerabilities in software. A large volume of software, from the most frequently used programs on a desktop computer, such as web browsers, e-mail programs, and word processing applications, to mission-critical services for the space shuttle, is unintentionally vulnerable to attacks and thus insecure. By seeking to improve the identification of vulnerabilities in software, the security community can save the time and money necessary to restore compromised computer systems. In addition, this research is imperative to activities of national security such as counterterrorism. The current approach involves a systematic …


Personalizing Interactions With Information Systems, Saverio Perugini, Naren Ramakrishnan Dec 2014

Saverio Perugini

Personalization constitutes the mechanisms and technologies necessary to customize information access to the end-user. It can be defined as the automatic adjustment of information content, structure, and presentation tailored to the individual. In this chapter, we study personalization from the viewpoint of personalizing interaction. The survey covers mechanisms for information-finding on the web, advanced information retrieval systems, dialog-based applications, and mobile access paradigms. Specific emphasis is placed on studying how users interact with an information system and how the system can encourage and foster interaction. This helps bring out the role of the personalization system as a facilitator which reconciles …


Supporting Multiple Paths To Objects In Information Hierarchies: Faceted Classification, Faceted Search, And Symbolic Links, Saverio Perugini Dec 2014

Saverio Perugini

We present three fundamental, interrelated approaches to support multiple access paths to each terminal object in information hierarchies: faceted classification, faceted search, and web directories with embedded symbolic links. This survey aims to demonstrate how each approach supports users who seek information from multiple perspectives. We achieve this by exploring each approach, the relationships between these approaches, including tradeoffs, and how they can be used in concert, while focusing on a core set of hypermedia elements common to all. This approach provides a foundation from which to study, understand, and synthesize applications which employ these techniques. This survey does not …