Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Successful And Abandoned Sourceforge.Net Projects In The Initiation Stage, Charles Schweik Dec 2009

National Center for Digital Government

[first paragraph] Chapter 6 provided an open source project success and abandonment dependent variable. Chapter 7 described data available in the Sourceforge.net repository and linked these data to various independent variable concepts and hypotheses presented in the theoretical part of this book. Chapter 7 also described the Classification Tree and Random Forest statistical approaches we use in this and the following chapter. This chapter presents the results of the Classification Tree analysis for successful and abandoned projects in the Initiation Stage, which, in Chapter 3 (Figure 3.2), we defined as the period before and up to the time when a …


The Dependent Variable: Defining Open Source "Success" And "Abandonment" Using Sourceforge.Net Data, Charles Schweik Dec 2009

National Center for Digital Government

[first paragraph] From the very beginning of this research project, we understood that we needed to define what success meant in open source so that we could use that definition to create a dependent variable for our empirical studies. Does success mean a project has developed high quality software, or does it mean that the software is widely used? How might extremely valuable software that is used by only a few people, such as software for charting parts of the human genome, fit into this definition? In this chapter, we establish a robust success and abandonment measure that satisfies these …


Short-Wavelength Technology And The Potential For Distributed Networks Of Small Radar Systems, David McLaughlin, David Pepyne, Brenda Philips, James Kurose, Michael Zink, David Westbrook, Eric Lyons, Eric Knapp, Anthony Hopf, Alfred DeFonzo, Robert Contreras, Theodore Djaferis, Edin Insanic, Stephen Frasier, V. Chandrasekar, Francesc Junyent, Nitin Bharadwaj, Yanting Wang, Yuxiang Liu, Brenda Dolan, Kelvin Droegemeier, Jerald Brotzge, Ming Xue, Kevin Kloesel, Keith Brewster, Frederick Carr, Sandra Cruz-Pol, Kurt Hondl, Pavlos Kollias Dec 2009

James Kurose

Dense networks of short-range radars capable of mapping storms and detecting atmospheric hazards are described. Composed of small X-band (9.4 GHz) radars spaced tens of kilometers apart, these networks defeat the Earth curvature blockage that limits today's long-range weather radars and enable observing capabilities fundamentally beyond the operational state-of-the-art radars. These capabilities include multiple Doppler observations for mapping horizontal wind vectors, subkilometer spatial resolution, and rapid-update (tens of seconds) observations extending from the boundary layer up to the tops of storms. The small physical size and low-power design of these radars permit the consideration of commercial electronic manufacturing approaches and …


Web 2.0 In The Process Of E-Participation: The Case Of Organizing For America And The Obama Administration, Aysu Kes-Erkul, R. Erdem Erkul Oct 2009

National Center for Digital Government

The presidential campaign of Barack Obama during the 2008 elections sparked new discussion about public engagement in political processes. The campaign used Web 2.0 tools intensively to reach the general public, seek support, and collect feedback from voters. In this paper, we analyze the major website of this project, “Organizing for America” (OFA), from the perspective of e-participation, a concept that includes all the processes of public involvement via information and communication technologies.


Information & Communication Technologies And Digital Government: The Turkish Case, Turhan Mentes Sep 2009

National Center for Digital Government

The technological innovations of the last decades have opened the doors to a new and different world for businesses and governments. As access to the Internet penetrates more populations each day, ICTs continue to shape societies all over the world. This presentation will explore the development of ICTs and e-government in Turkey. It will include significant figures and statistics about e-government in Turkey and discuss the social consequences of such developments.


Improved Network Consistency And Connectivity In Mobile And Sensor Systems, Nilanjan Banerjee Sep 2009

Open Access Dissertations

Edge networks such as sensor, mobile, and disruption tolerant networks suffer from topological uncertainty and disconnections due to a myriad of factors, including limited battery capacity on client devices and mobility. Hence, providing reliable, always-on consistency for network applications in such mobile and sensor systems is non-trivial and challenging. However, the problem is of paramount importance given the proliferation of mobile phones, PDAs, laptops, and music players. This thesis identifies two fundamental deterrents to addressing the above problem. First, limited energy on client mobile and sensor devices makes high levels of consistency and availability impossible. Second, unreliable support from the network …


The Development Of Hierarchical Knowledge In Robot Systems, Stephen W. Hart Sep 2009

Open Access Dissertations

This dissertation investigates two complementary ideas in the literature on machine learning and robotics--those of embodiment and intrinsic motivation--to address a unified framework for skill learning and knowledge acquisition. "Embodied" systems make use of structure derived directly from sensory and motor configurations for learning behavior. Intrinsically motivated systems learn by searching for native, hedonic value through interaction with the world. Psychological theories of intrinsic motivation suggest that there exist internal drives favoring open-ended cognitive development and exploration. I argue that intrinsically motivated, embodied systems can learn generalizable skills, acquire control knowledge, and form an epistemological understanding of the world …


Sensor Control And Scheduling Strategies For Sensor Networks, Victoria U. Manfredi Sep 2009

Open Access Dissertations

We investigate sensor control and scheduling strategies to most effectively use the limited resources of an ad hoc network or closed-loop sensor network. In this context, we examine the following three problems. Where to focus sensing? Certain types of sensors, such as cameras or radars, are unable to simultaneously collect high fidelity data from all environmental locations, and thus require some sort of sensing strategy. Considering a meteorological radar network, we show that the main benefits of optimizing sensing over expected future states of the environment are when there are multiple small phenomena in the environment. Considering multiple users, we …


Action-Based Representation Discovery In Markov Decision Processes, Sarah Osentoski Sep 2009

Open Access Dissertations

This dissertation investigates the problem of representation discovery in discrete Markov decision processes, namely how agents can simultaneously learn representation and optimal control. Previous work on function approximation techniques for MDPs largely employed hand-engineered basis functions. In this dissertation, we explore approaches to automatically construct these basis functions and demonstrate that automatically constructed basis functions significantly outperform more traditional, hand-engineered approaches. We specifically examine two problems: how to automatically build representations for action-value functions by explicitly incorporating actions into a representation, and how representations can be automatically constructed by exploiting a pre-specified task hierarchy. We first introduce a technique for …


Semantic Methods For Intelligent Distributed Design Environments, Paul W. Witherell Sep 2009

Open Access Dissertations

Continuous advancements in technology have led to increasingly comprehensive and distributed product development processes while in pursuit of improved products at reduced costs. Information associated with these products is ever changing, and structured frameworks have become integral to managing such fluid information. Ontologies and the Semantic Web have emerged as key alternatives for capturing product knowledge in both a human-readable and computable manner. The primary and conclusive focus of this research is to characterize relationships formed within methodically developed distributed design knowledge frameworks to ultimately provide a pervasive real-time awareness in distributed design processes. Utilizing formal logics in the form …


Resource Management In Complex And Dynamic Environments, Mohammad Salimullah Raunak Sep 2009

Open Access Dissertations

Resource management is at the heart of many diverse science and engineering research areas. Although the general notion of what constitutes a resource entity seems similar in different research areas, their types, characteristics, and constraints governing their behavior are vastly different depending on the particular domain of research and the nature of the research itself. Often research related to resource modeling and management focuses on largely homogeneous resources in a relatively simplified model of the real world. The problem becomes much more challenging to deal with when working with a complex real life domain with many heterogeneous resource types and …


Operating System Support For Modern Applications, Ting Yang May 2009

Open Access Dissertations

Computer systems now run drastically different workloads than they did two decades ago. The enormous advances in hardware power, such as processor speed, memory and storage capacity, and network bandwidth, enable them to run new kinds of applications, and many applications simultaneously. Software technologies, such as garbage collection and multi-threading, also reshape applications and their behaviors, introducing more challenges to system resource management. However, existing general-purpose operating systems do not provide adequate support for these modern applications. These operating systems were designed over two decades ago, when garbage-collected applications were not prevalent and users interacted with systems …


Structured Topic Models: Jointly Modeling Words And Their Accompanying Modalities, Xuerui Wang May 2009

Open Access Dissertations

The abundance of data in the information age poses an immense challenge for us: how to perform large-scale inference to understand and utilize this overwhelming amount of information. Such techniques are of tremendous intellectual significance and practical impact. As part of this grand challenge, the goal of my Ph.D. thesis is to develop effective and efficient statistical topic models for massive text collections by incorporating extra information from other modalities in addition to the text itself. Text documents are not just text, and different kinds of additional information are naturally interleaved with text. Most previous work, however, pays attention to …


Minimizing Detection Probability Routing In Ad Hoc Networks Using Directional Antennas, Xiaofeng Lu, Donald Towsley, Pietro Lio, Fletcher Wicker, Zhang Xiong May 2009

Donald F. Towsley

In a hostile environment, it is important for a transmitter to make its wireless transmission invisible to adversaries because an adversary can detect the transmitter if the received power at its antennas is strong enough. This paper defines a detection probability model to compute the likelihood that a transmitter is detected by a detection system at an arbitrary location around the transmitter. Our study proves that the probability of detecting a directional antenna is much lower than that of detecting an omnidirectional antenna if both the directional and omnidirectional antennas provide the same Effective Isotropic Radiated Power (EIRP) in the direction …


Conference Proceedings, Youtube And The 2008 Election Cycle Apr 2009

YouTube and the 2008 Election Cycle in the United States

The YouTube and the 2008 Election Cycle in the United States Conference took place April 16-17, 2009 at the University of Massachusetts Amherst. The conference brought together political and computer scientists to explore the electoral impact of user-created YouTube technologies and to demonstrate new technical and analytic opportunities associated with new media technologies and politics. The conference proceedings include copies of all papers presented at the conference as well as abstracts of all posters and keynote presentations.


A Comparison Of Turbulent Thermal Convection Between Conditions Of Constant Temperature And Constant Flux, Hans Johnston, Charles R. Doering Feb 2009

Hans Johnston

We report the results of high-resolution direct numerical simulations of two-dimensional Rayleigh-Bénard convection for Rayleigh numbers up to $Ra = 10^{10}$ in order to study the influence of temperature boundary conditions on turbulent heat transport. Specifically, we considered the extreme cases of fixed heat flux (where the top and bottom boundaries are poor thermal conductors) and fixed temperature (perfectly conducting boundaries). Both cases display identical heat transport at high Rayleigh numbers fitting a power law $Nu \approx 0.138 \times Ra^{0.285}$ with a scaling exponent indistinguishable from $2/7 = 0.2857\ldots$ above $Ra = 10^{7}$. The overall flow dynamics for both scenarios, in particular, the time averaged temperature profiles, are also indistinguishable …


Agent Interactions In Decentralized Environments, Martin William Allen Feb 2009

Doctoral Dissertations 1896 - February 2014

The decentralized partially observable Markov decision process (Dec-POMDP) is a powerful formal model for studying multiagent problems where cooperative, coordinated action is optimal, but each agent acts based on local data alone. Unfortunately, it is known that Dec-POMDPs are fundamentally intractable: they are NEXP-complete in the worst case, and have been empirically observed to be beyond feasible optimal solution.

To get around these obstacles, researchers have focused on special classes of the general Dec-POMDP problem, restricting the degree to which agent actions can interact with one another. In some cases, it has been proven that these sorts of structured forms of interaction …


The Open Source Software Ecosystem, Charles M. Schweik Jan 2009

National Center for Digital Government

[first paragraph] Open source research in the late 1990s and early 2000s described open source development projects as all-volunteer endeavors without the existence of monetary incentives (Chakravarty, Haruvy and Wu, 2007), and relatively recent empirical studies (Ghosh, 2005; Wolf {{243}}) confirm that a sizable percentage of open source developers are indeed volunteers. Open source development projects involving more than one developer were seen to follow a “hacker ethic” (Himanen, 2000; von Hippel and von Krogh, 2003) where individuals freely give away and exchange software they had written so that it could be modified and built upon, with an expectation of …


Efficient Methods For Topic Model Inference On Streaming Document Collections, Limin Yao, David Mimno, Andrew McCallum Jan 2009

Andrew McCallum

Topic models provide a powerful tool for analyzing large text collections by representing high dimensional data in a low dimensional subspace. Fitting a topic model given a set of training documents requires approximate inference techniques that are computationally expensive. With today's large-scale, constantly expanding document collections, it is useful to be able to infer topic distributions for new documents without retraining the model. In this paper, we empirically evaluate the performance of several methods for topic inference in previously unseen documents, including methods based on Gibbs sampling, variational inference, and a new method inspired by text classification. The classification-based inference …


Towards Theoretical Bounds For Resource-Bounded Information Gathering For Correlation Clustering, Pallika Kanani, Andrew McCallum, Ramesh Sitaraman Jan 2009

Andrew McCallum

Resource-bounded Information Gathering for Correlation Clustering deals with designing efficient methods for obtaining and incorporating information from external sources to improve the accuracy of clustering tasks. In this paper, we formulate the problem and some specific goals, and lay the foundation for a better theoretical understanding of this framework. We address the challenging problem of analytically quantifying the effect of changing a single edge weight on the partitioning of the entire graph, under some simplifying assumptions, hence demonstrating a method to calculate the expected reduction in error. Our analysis of different query selection criteria provides a formal way of comparing different heuristics. …


Bi-Directional Joint Inference For Entity Resolution And Segmentation Using Imperatively-Defined Factor Graphs, Sameer Singh, Karl Schultz, Andrew McCallum Jan 2009

Andrew McCallum

There has been growing interest in using joint inference across multiple subtasks as a mechanism for avoiding the cascading accumulation of errors in traditional pipelines. Several recent papers demonstrate joint inference between the segmentation of entity mentions and their de-duplication; however, they have various weaknesses: inference information flows only in one direction, the number of uncertain hypotheses is severely limited, or the subtasks are only loosely coupled. This paper presents a highly-coupled, bi-directional approach to joint inference based on efficient Markov chain Monte Carlo sampling in a relational conditional random field. The model is specified with our new probabilistic programming …


Semi-Supervised Learning Of Dependency Parsers Using Generalized Expectation Criteria, Gregory Druck, Gideon Mann, Andrew McCallum Jan 2009

Andrew McCallum

In this paper, we propose a novel method for semi-supervised learning of non-projective log-linear dependency parsers using directly expressed linguistic prior knowledge (e.g., a noun's parent is often a verb). Model parameters are estimated using a generalized expectation (GE) objective function that penalizes the mismatch between model predictions and linguistic expectation constraints. In a comparison with two prominent "unsupervised" learning methods that require indirect biasing toward the correct syntactic structure, we show that GE can attain better accuracy with as few as 20 intuitive constraints. We also present positive experimental results on longer sentences in multiple languages.


Inference And Learning In Large Factor Graphs With Adaptive Proposal Distributions And A Rank-Based Objective, Khashayar Rohanimanesh, Michael Wick, Andrew McCallum Jan 2009

Andrew McCallum

Large templated factor graphs with complex structure that changes during inference have been shown to provide state-of-the-art experimental results in tasks such as identity uncertainty and information integration. However, inference and learning in these models are notoriously difficult. This paper formalizes, analyzes and proves convergence for the SampleRank algorithm, which learns extremely efficiently by calculating approximate parameter estimation gradients from each proposed MCMC jump. Next we present a parameterized, adaptive proposal distribution, which greatly increases the number of accepted jumps. We combine these methods in experiments on a real-world information extraction problem and demonstrate that the adaptive proposal distribution requires …


An Entity Based Model For Coreference Resolution, Michael Wick, Aron Culotta, Khashayar Rohanimanesh, Andrew McCallum Jan 2009

Andrew McCallum

Recently, many advanced machine learning approaches have been proposed for coreference resolution; however, all of the discriminatively-trained models reason over mentions, rather than entities. That is, they do not explicitly contain variables indicating the "canonical" values for each attribute of an entity (e.g., name, venue, title, etc.). This canonicalization step is typically implemented as a post-processing routine to coreference resolution prior to adding the extracted entity to a database. In this paper, we propose a discriminatively-trained model that jointly performs coreference resolution and canonicalization, enabling features over hypothesized entities. We validate our approach on two different coreference problems: newswire anaphora …


Active Learning By Labeling Features, Gregory Druck, Burr Settles, Andrew McCallum Jan 2009

Andrew McCallum

Methods that learn from prior information about input features, such as generalized expectation (GE), have been used to train accurate models with very little effort. In this paper, we propose an active learning approach in which the machine solicits "labels" on features rather than instances. In both simulated and real user experiments on two sequence labeling tasks, we show that our active learning method outperforms passive learning with features as well as traditional active learning with instances. Preliminary experiments suggest that novel interfaces which intelligently solicit labels on multiple features facilitate more efficient annotation.


Factorie: Probabilistic Programming Via Imperatively Defined Factor Graphs, Andrew McCallum, Karl Schultz, Sameer Singh Jan 2009

Andrew McCallum

Discriminatively trained undirected graphical models have had wide empirical success, and there has been increasing interest in toolkits that ease their application to complex relational data. The power in relational models is in their repeated structure and tied parameters; at issue is how to define these structures in a powerful and flexible way. Rather than using a declarative language, such as SQL or first-order logic, we advocate using an imperative language to express various aspects of model structure, inference, and learning. By combining the traditional, declarative, statistical semantics of factor graphs with imperative definitions of their construction and operation, we …


Generalized Expectation Criteria For Bootstrapping Extractors Using Record-Text Alignment, Kedar Bellare, Andrew McCallum Jan 2009

Andrew McCallum

Traditionally, machine learning approaches for information extraction require human annotated data that can be costly and time-consuming to produce. However, in many cases, there already exists a database (DB) with schema related to the desired output, and records related to the expected input text. We present a conditional random field (CRF) that aligns tokens of a given DB record and its realization in text. The CRF model is trained using only the available DB and unlabeled text with generalized expectation criteria. An annotation of the text induced from inferred alignments is used to train an information extractor. We evaluate our …


Alternating Projections For Learning With Expectation Constraints, Kedar Bellare, Gregory Druck, Andrew McCallum Jan 2009

Andrew McCallum

We present an objective function for learning with unlabeled data that utilizes auxiliary expectation constraints. We optimize this objective function using a procedure that alternates between information and moment projections. Our method provides an alternate interpretation of the posterior regularization framework (Graca et al., 2008), maintains uncertainty during optimization unlike constraint-driven learning (Chang et al., 2007), and is more efficient than generalized expectation criteria (Mann & McCallum, 2008). Applications of this framework include minimally supervised learning, semisupervised learning, and learning with constraints that are more expressive than the underlying model. In experiments, we demonstrate comparable accuracy to generalized expectation criteria …


Training Factor Graphs With Reinforcement Learning For Efficient MAP Inference, Michael Wick, Khashayar Rohanimanesh, Sameer Singh, Andrew McCallum Jan 2009

Andrew McCallum

Large, relational factor graphs with structure defined by first-order logic or other languages give rise to notoriously difficult inference problems. Because unrolling the structure necessary to represent distributions over all hypotheses has exponential blow-up, solutions are often derived from MCMC. However, because of limitations in the design and parameterization of the jump function, these sampling-based methods suffer from local minima--the system must transition through lower-scoring configurations before arriving at a better MAP solution. This paper presents a new method of explicitly selecting fruitful downward jumps by leveraging reinforcement learning (RL). Rather than setting parameters to maximize the likelihood of the …


Factors Leading To Success Or Abandonment Of Open Source Commons: An Empirical Analysis Of Sourceforge.Net Projects, Charles M. Schweik, Robert English, Sandra Haire Jan 2009

Charles M. Schweik

Open source software is produced cooperatively by groups of people who work together via the Internet. The software produced usually becomes the “common property” of the group and is freely distributed to anyone in the world who wants to use it. Although it may seem unlikely, open source collaborations, or “commons,” have grown phenomenally to become economically and socially important. But what makes open source commons succeed at producing something useful, or alternatively, what makes them become abandoned before achieving success? This paper reviews the theoretical foundations for understanding open source commons and briefly describes our statistical analysis of over …