Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 21 of 21

Full-Text Articles in Physical Sciences and Mathematics

Exterminator: Automatically Correcting Memory Errors With High Probability, Gene Novark Jun 2007

Exterminator: Automatically Correcting Memory Errors With High Probability, Gene Novark

Computer Science Department Faculty Publication Series

Programs written in C and C++ are susceptible to memory errors, including buffer overflows and dangling pointers. These errors, which can lead to crashes, erroneous execution, and security vulnerabilities, are notoriously costly to repair. Tracking down their location in the source code is difficult, even when the full memory state of the program is available. Once the errors are finally found, fixing them remains challenging: even for critical security-sensitive bugs, the average time between initial reports and the issuance of a patch is nearly one month. We present Exterminator, a system that automatically corrects heap-based memory errors without programmer intervention. …


Piecewise Pseudolikelihood For Efficient Training Of Conditional Random Fields, Charles Sutton, Andrew Mccallum Jan 2007

Piecewise Pseudolikelihood For Efficient Training Of Conditional Random Fields, Charles Sutton, Andrew Mccallum

Computer Science Department Faculty Publication Series

Discriminative training of graphical models can be expensive if the variables have large cardinality, even if the graphical structure is tractable. In such cases, pseudolikelihood is an attractive alternative, because its running time is linear in the variable cardinality, but on some data its accuracy can be poor. Piecewise training (Sutton & McCallum, 2005) can have better accuracy but does not scale as well in the variable cardinality. In this paper, we introduce piecewise pseudolikelihood, which retains the computational efficiency of pseudolikelihood but can have much better accuracy. On several benchmark NLP data sets, piecewise pseudolikelihood has better accuracy than …


Compact Spectral Bases For Value Function Approximation Using Kronecker Factorization, Jeff Johns Jan 2007

Compact Spectral Bases For Value Function Approximation Using Kronecker Factorization, Jeff Johns

Computer Science Department Faculty Publication Series

A new spectral approach to value function approximation has recently been proposed to automatically construct basis functions from samples. Global basis functions called proto-value functions are generated by diagonalizing a diffusion operator, such as a reversible random walk or the Laplacian, on a graph formed from connecting nearby samples. This paper addresses the challenge of scaling this approach to large domains. We propose using Kronecker factorization coupled with the Metropolis-Hastings algorithm to decompose reversible transition matrices. The result is that the basis functions can be computed on much smaller matrices and combined to form the overall bases. We demonstrate that …


Relational Dependency Networks, Jennifer Neville Jan 2007

Relational Dependency Networks, Jennifer Neville

Computer Science Department Faculty Publication Series

Recent work on graphical models for relational data has demonstrated significant improvements in classification and inference when models represent the dependencies among instances. Despite its use in conventional statistical models, the assumption of instance independence is contradicted by most relational datasets. For example, in citation data there are dependencies among the topics of a paper’s references, and in genomic data there are dependencies among the functions of interacting proteins. In this paper, we present relational dependency networks (RDNs), graphical models that are capable of expressing and reasoning with such dependencies in a relational setting. We discuss RDNs in the context …


On Unstructured File Sharing Networks, Honggang Zhang Jan 2007

On Unstructured File Sharing Networks, Honggang Zhang

Computer Science Department Faculty Publication Series

We study the interaction among users of unstructured file sharing applications, who compete for available network resources (link bandwidth or capacity) by opening multiple connections on multiple paths so as to accelerate data transfer. We model this interaction with an unstructured file sharing game. Users are players and their strategies are the numbers of sessions on available paths. We consider a general bandwidth sharing framework proposed by Kelly [1] and Mo and Walrand [2], with TCP as a special case. Furthermore, we incorporate the Tit-for-Tat strategy (adopted by BitTorrent [3] networks) into the unstructured file sharing game to model the …


Web Search From A Bus, Aruna Balasubramanian, Yun Zhou, Bruce Croft, Brian Neil Levine, Arun Venkataramani Jan 2007

Web Search From A Bus, Aruna Balasubramanian, Yun Zhou, Bruce Croft, Brian Neil Levine, Arun Venkataramani

Computer Science Department Faculty Publication Series

Opportunistic connections to the Internet from open wireless access points is now commonly possible in municipal areas. Vehicular networks can opportunistically connect to the internet for several seconds via open access points. In this paper, we adapt the interactive process of web search and retrieval to vehicular networks with intermittent Internet access. Our system, called Thedu has mobile nodes use an Internet proxy to collect search engine results and prefetch result pages. The mobiles nodes download the pre-fetched web pages from the proxy. Our contribution is a novel set of techniques to make aggressive but selective prefetching practical, resulting in …


Cryptanalysis Of Two Lightweight Rfid Authentication Schemes, Benessa Defend, Kevin Fu, Ari Juels Jan 2007

Cryptanalysis Of Two Lightweight Rfid Authentication Schemes, Benessa Defend, Kevin Fu, Ari Juels

Computer Science Department Faculty Publication Series

Vajda and Buttyan proposed several lightweight authentication protocols for authenticating RFID tags to readers, and left open the quantifiable cryptographic strength. Our cryptanalysis answers this open question by implementing and measuring attacks against their XOR and SUBSET protocols. A passive eavesdropper can impersonate a tag in the XOR protocol after observing only 70 challenge-response transactions between the tag and reader. In contrast, the theoretical maximum strength of the XOR protocol could have required 16! * 2 observed transactions to break the key. Our experiments also show that a passive eavesdropper can recover the shared secret used in the XOR protocol …


Simple, Robust, Scalable Semi-Supervised Learning Via Expectation Regularization, Gideon S. Mann Jan 2007

Simple, Robust, Scalable Semi-Supervised Learning Via Expectation Regularization, Gideon S. Mann

Computer Science Department Faculty Publication Series

Although semi-supervised learning has been an active area of research, its use in deployed applications is still relatively rare because the methods are often difficult to implement, fragile in tuning, or lacking in scalability. This paper presents expectation regularization, a semi-supervised learning method for exponential family parametric models that augments the traditional conditional label-likelihood objective function with an additional term that encourages model predictions on unlabeled data to match certain expectations—such as label priors. The method is extremely easy to implement, scales as well as logistic regression, and can handle non-independent features. We present experiments on five different data sets, …


Anonymizing Social Networks, Michael Hay, Gerome Miklau, David Jensen, Philipp Weis, Siddharth Srivastava Jan 2007

Anonymizing Social Networks, Michael Hay, Gerome Miklau, David Jensen, Philipp Weis, Siddharth Srivastava

Computer Science Department Faculty Publication Series

Advances in technology have made it possible to collect data about individuals and the connections between them, such as email correspondence and friendships. Agencies and researchers who have collected such social network data often have a compelling interest in allowing others to analyze the data. However, in many cases the data describes relationships that are private (e.g., email correspondence) and sharing the data in full can result in unacceptable disclosures. In this paper, we present a framework for assessing the privacy risk of sharing anonymized network data. This includes a model of adversary knowledge, for which we consider several variants …


Structure Theorem And Strict Alternation Hierarchy For Fo2 On Words, Philipp Weis Jan 2007

Structure Theorem And Strict Alternation Hierarchy For Fo2 On Words, Philipp Weis

Computer Science Department Faculty Publication Series

It is well-known that every first-order property on words is expressible using at most three variables. The subclass of properties expressible with only two variables is also quite interesting and wellstudied. We prove precise structure theorems that characterize the exact expressive power of first-order logic with two variables on words. Our results apply to FO2[<] and FO 2[<; Suc], the latter of which includes the binary successor relation in addition to the linear ordering on string positions. For both languages, our structuretheorems showexactly whatis expressible using a given quantifier depth, n, and using m blocks of alternating quantifiers, for any m ≤ n. Using these characterizations, we prove, among other results, that there is a strict hierarchy of alternating quantifiers for both languages. The question whether there was such a hierarchy had been completely open. As another consequence of our structural results, we show that satisfiability for FO2[<], which is NEXP-complete in general, becomes NP-complete once we only consider alphabets of a bounded size.


Learning State-Action Basis Functions For Hierarchical Mdps, Sarah Osentoski Jan 2007

Learning State-Action Basis Functions For Hierarchical Mdps, Sarah Osentoski

Computer Science Department Faculty Publication Series

This paper introduces a new approach to actionvalue function approximation by learning basis functions from a spectral decomposition of the state-action manifold. This paper extends previous work on using Laplacian bases for value function approximation by using the actions of the agent as part of the representation when creating basis functions. The approach results in a nonlinear learned representation particularly suited to approximating action-value functions, without incurring the wasteful duplication of state bases in previous work. We discuss two techniques to create state-action graphs: offpolicy and on-policy. We show that these graphs have a greater expressive power and have better …


Evaluating Search Engines By Modeling The Relationship Between Relevance And Clicks, Ben Carterette, Rosie Jones Jan 2007

Evaluating Search Engines By Modeling The Relationship Between Relevance And Clicks, Ben Carterette, Rosie Jones

Computer Science Department Faculty Publication Series

We propose a model that leverages the millions of clicks received by web search engines to predict document relevance. This allows the comparison of ranking functions when clicks are available but complete relevance judgments are not. After an initial training phase using a set of relevance judgments paired with click data, we show that our model can predict the relevance score of documents that have not been judged. These predictions can be used to evaluate the performance of a search engine, using our novel formalization of the confidence of the standard evaluation metric discounted cumulative gain (DCG), so comparisons can …


Dtn Routing As A Resource Allocation Problem, Aruna Balasubramanian, Brian Neil Levine, Arun Venkataramani Jan 2007

Dtn Routing As A Resource Allocation Problem, Aruna Balasubramanian, Brian Neil Levine, Arun Venkataramani

Computer Science Department Faculty Publication Series

Routing protocols for disruption-tolerant networks (DTNs) use a variety of mechanisms, including discovering the meeting probabilities among nodes, packet replication, and network coding. The primary focus of these mechanisms is to increase the likelihood of finding a path with limited information, and so these approaches have only an incidental effect on routing such metrics as maximum or average delivery delay. In this paper, we present rapid, an intentional DTN routing protocol that can optimize a specific routing metric such as the worst-case delivery delay or the fraction of packets that are delivered within a deadline. The key insight is to …


A Framework For Meta-Level Control In Multi-Agent Systems, Anita Raja, Victor Lesser Jan 2007

A Framework For Meta-Level Control In Multi-Agent Systems, Anita Raja, Victor Lesser

Computer Science Department Faculty Publication Series

Sophisticated agents operating in open environments must make decisions that efficiently trade off the use of their limited resources between dynamic deliberative actions and domain actions. This is the meta-level control problem for agents operating in resource-bounded multi-agent environments. Control activities involve decisions on when to invoke and the amount to effort to put into scheduling and coordination of domain activities. The focus of this paper is how to make effectivemeta-level control decisions. We show that meta-level control with bounded computational overhead allows complex agents to solve problems more efficiently than current approaches in dynamic open multi-agent environments. The meta-level …


Expertise Modeling For Matching Papers With Reviewers, David Mimno Jan 2007

Expertise Modeling For Matching Papers With Reviewers, David Mimno

Computer Science Department Faculty Publication Series

An essential part of an expert-finding task, such as matching reviewers to submitted papers, is the ability to model the expertise of a person based on documents. We evaluate several measures of the association between an author in an existing collection of research papers and a previously unseen document. We compare two language model based approaches with a novel topic model, Author-Persona-Topic (APT). In this model, each author can write under one or more "personas," which are represented as independent distributions over hidden topics. Examples of previous papers written by prospective reviewers are gathered from the Rexa database, which extracts …


Bounds On The Gain Of Network Coding And Broadcasting In Wireless Networks, Junning Liu Jan 2007

Bounds On The Gain Of Network Coding And Broadcasting In Wireless Networks, Junning Liu

Computer Science Department Faculty Publication Series

Gupta and Kumar established that the per node throughput of ad hoc networks with multi-pair unicast traffic scales (poorly) as lambda(n) = Theta (1 / radic(n log n)) with an increasing number of nodes n. However, Gupta and Kumar did not consider the possibility of network coding and broadcasting in their model, and recent work has suggested that such techniques have the potential to greatly improve network throughput. In [1], we have shown that for the protocol communication model of Gupta and Kumar [2], the multi-unicast throughput of schemes using arbitrary network coding and broadcasting in a two-dimensional random topology …


Finding Tribes: Identifying Close-Knit Individuals From Employment Patterns, Lisa Friedland Jan 2007

Finding Tribes: Identifying Close-Knit Individuals From Employment Patterns, Lisa Friedland

Computer Science Department Faculty Publication Series

We present a family of algorithms to uncover tribes—groups of individuals who share unusual sequences of affiliations. While much work inferring community structure describes large-scale trends, we instead search for small groups of tightly linked individuals who behave anomalously with respect to those trends. We apply the algorithms to a large temporal and relational data set consisting of millions of employment records from the National Association of Securities Dealers. The resulting tribes contain individuals at higher risk for fraud, are homogenous with respect to risk scores, and are geographically mobile, all at significant levels compared to random or to other …


Unsupervised Joint Alignment Of Complex Images, Gary B. Huang Jan 2007

Unsupervised Joint Alignment Of Complex Images, Gary B. Huang

Computer Science Department Faculty Publication Series

Many recognition algorithms depend on careful positioning of an object into a canonical pose, so the position of features relative to a fixed coordinate system can be examined. Currently, this positioning is done either manually or by training a class-specialized learning algorithm with samples of the class that have been hand-labeled with parts or poses. In this paper, we describe a novel method to achieve this positioning using poorly aligned examples of a class with no additional labeling. Given a set of unaligned examplars of a class, such as faces, we automatically build an alignment mechanism, without any additional labeling …


Eflux: A Language And Runtime System For Perpetual Systems, Jacob Sorber Jan 2007

Eflux: A Language And Runtime System For Perpetual Systems, Jacob Sorber

Computer Science Department Faculty Publication Series

A key goal of mobile computing is untethering devices from wires, making them truly portable. While mobile devices can make use of wireless communication for network connectivity, they are still dependent on an electrical connection for continued operation. This need for tethering to available electricity significantly limits their range, usefulness, and manageability. Environmental energy harvesting—collecting energy from the sun, wind, heat differentials, and motion—offers the prospect of unprecedented, large-scale deployments of perpetual mobile systems that never need to be recharged. However, programming these systems presents new challenges: perpetual systems must adapt dynamically to available energy, delivering higher service levels when …


Adaptive Control Of Duty Cycling In Energy-Harvesting Wireless Sensor Networks, Christopher M. Vigorito Jan 2007

Adaptive Control Of Duty Cycling In Energy-Harvesting Wireless Sensor Networks, Christopher M. Vigorito

Computer Science Department Faculty Publication Series

Increasingly many wireless sensor network deployments are using harvested environmental energy to extend system lifetime. Because the temporal profiles of such energy sources exhibit great variability due to dynamic weather patterns, an important problem is designing an adaptive duty-cycling mechanism that allows sensor nodes to maintain their power supply at sufficient levels (energy neutral operation) by adapting to changing environmental conditions. Existing techniques to address this problem are minimally adaptive and assume a priori knowledge of the energy profile. While such approaches are reasonable in environments that exhibit low variance, we find that it is highly inefficient in more variable …


Mixtures Of Hierarchical Topics With Pachinko Allocation, David Mimno Jan 2007

Mixtures Of Hierarchical Topics With Pachinko Allocation, David Mimno

Computer Science Department Faculty Publication Series

The four-level pachinko allocation model (PAM) (Li & McCallum, 2006) represents correlations among topics using a DAG structure. It does not, however, represent a nested hierarchy of topics, with some topical word distributions representing the vocabulary that is shared among several more specific topics. This paper presents hierarchical PAM---an enhancement that explicitly represents a topic hierarchy. This model can be seen as combining the advantages of hLDA's topical hierarchy representation with PAM's ability to mix multiple leaves of the topic hierarchy. Experimental results show improvements in likelihood of held-out documents, as well as mutual information between automatically-discovered topics and humangenerated …