Open Access. Powered by Scholars. Published by Universities.®

Data Science Commons

1,010 Full-Text Articles 2,169 Authors 177,939 Downloads 153 Institutions

All Articles in Data Science

Divad: A Dynamic And Interactive Visual Analytical Dashboard For Exploring And Analyzing Transport Data, Tin Seong KAM, Ketan BARSHIKAR, Shaun Jun Hua TAN 2012 Singapore Management University

Research Collection School Of Computing and Information Systems

The advances in location-based data collection technologies such as GPS, RFID etc. and the rapid reduction of their costs provide us with a huge and continuously increasing amount of data about movement of vehicles, people and goods in an urban area. This explosive growth of geospatially-referenced data has far outpaced the planner’s ability to utilize and transform the data into insightful information thus creating an adverse impact on the return on the investment made to collect and manage this data. Addressing this pressing need, we designed and developed DIVAD, a dynamic and interactive visual analytics dashboard to allow city planners …


Maximizing Network Lifetime On The Line With Adjustable Sensing Ranges, Amotz Bar-Noy, Ben Baumer 2012 The Doctorate-Granting Institution of the City University of New York

Statistical and Data Sciences: Faculty Publications

Given n sensors on a line, each of which is equipped with a unit battery charge and an adjustable sensing radius, what schedule will maximize the lifetime of a network that covers the entire line? Trivially, any reasonable algorithm is at least a 1/2-approximation, but we prove tighter bounds for several natural algorithms. We focus on developing a linear time algorithm that maximizes the expected lifetime under a random uniform model of sensor distribution. We demonstrate one such algorithm that achieves an average-case approximation ratio of almost 0.9. Most of the algorithms that we consider come from a family based …


Age Composition And Distribution Of Red Drum (Sciaenops Ocellatus) In Offshore Waters Of The North Central Gulf Of Mexico: An Evaluation Of A Stock Under A Federal Harvest Moratorium, Sean P. Powers, Crystal Hightower, J. Marcus Drymon, Matthew W. Johnson 2012 University of South Alabama

University Faculty and Staff Publications

Because of a lack of fishery-dependent data, assessment of the recovery of fish stocks that undergo the most aggressive form of management, namely harvest moratoriums, remains a challenge. Large schools of red drum (Sciaenops ocellatus) were common along the northern Gulf of Mexico until the late 1980s, when increased fishing effort quickly depleted the stock. After 24 years of harvest moratorium on red drum in federal waters, the stock is in need of reassessment; however, fishery-dependent data are not available in federal waters and fishery-independent data are limited. We document the distribution, age composition, growth, and condition of …


Parsing The Relationship Between Baserunning And Batting Abilities Within Lineups, Ben S. Baumer, James Piette, Brad Null 2012 The Doctorate-Granting Institution of the City University of New York

Statistical and Data Sciences: Faculty Publications

A baseball team's offensive prowess is a function of two types of abilities: batting and baserunning. While each has been studied extensively in isolation, the effects of their interaction are not well understood. We model offensive output as a scalar function f of an individual player's batting and baserunning profile z. Each of these profiles is in turn estimated from Retrosheet data using hierarchical Bayesian models. We then use the SimulOutCome simulation engine as a method to generate values of f(z) over a fine grid of points. Finally, for each of several methods of taking the extra base, we graphically …


Generating A Close-To-Reality Synthetic Population Of Ghana, Tyler Frazier, Andreas Alfons 2012 William & Mary

Arts & Sciences Articles

The purpose of this research is to generate a close-to-reality synthetic human population for use in a geosimulation of urban dynamics. Two commonly accepted approaches to generating synthetic human populations are Iterative Proportional Fitting (IPF) and Resampling with Replacement. While these methods are effective at reproducing one instance of the probability model describing the survey, it is an instance with extremely small variability amongst subgroups and is very unlikely to be the real population. IPF and Resampling with Replacement also rely on pure replication of units from the underlying sample which can increase unrealistic model behavior. In this work we …


Mapsnap System To Perform Vector-To-Raster Fusion, Boris Kovalerchuk, Peter Doucette, Gamal Seedahmed, Jerry Tagestad, Sergei Kovalerchuk, Brian Graff 2011 Central Washington University

All Faculty Scholarship for the College of the Sciences

As the availability of geospatial data increases, there is a growing need to match these datasets together. However, since these datasets often vary in their origins and spatial accuracy, they frequently do not correspond well to each other, which creates multiple problems. To accurately align with imagery, analysts currently either: 1) manually move the vectors, 2) perform a labor-intensive spatial registration of vectors to imagery, 3) move imagery to vectors, or 4) redigitize the vectors from scratch and transfer the attributes. All of these are time-consuming and labor-intensive operations. Automated matching and fusing of vector datasets has been a subject …


Semi-Automatic Management Of Knowledge Bases Using Formal Ontologies, Andreas Textor 2011 Department of Computing, Cork Institute of Technology, Cork, Ireland.

Theses

This thesis presents an approach that deals with the ever-growing amount of data in knowledge bases, especially concerning knowledge interoperability and formal representation of domain knowledge. There are multiple issues that must be addressed with current systems. A multitude of different formats, sources and tools exist in a domain, and it is desirable to develop their use further towards a standardised environment. Such an environment should support both the representation and processing of data from this domain, and the connection to other domains, where necessary. In order to manage large amounts of data, it should be possible to perform whatever …


Extreme Data Mining: Inference From Small Datasets, Răzvan Andonie 2010 Central Washington University

All Faculty Scholarship for the College of the Sciences

Neural networks have been applied successfully in many fields. However, satisfactory results can only be obtained under large-sample conditions. With small training sets, performance may be poor, or the learning task cannot be accomplished at all. This deficiency severely limits the applications of neural networks. The main reason why small datasets cannot provide enough information is that there are gaps between samples; even the domain of the samples cannot be ensured. Several computational intelligence techniques have been proposed to overcome the limits of learning from small datasets.

We have the following goals: i. To …

