Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Articles 1 - 4 of 4

Full-Text Articles in Databases and Information Systems

Big Issues For Big Data: Challenges For Critical Spatial Data Analytics, Chris Brunsdon, Alexis Comber Jul 2021

Journal of Spatial Information Science

In this paper we consider some of the issues of working with big data and big spatial data and highlight the need for an open and critical framework. We focus on a set of challenges underlying the collection and analysis of big data. In particular, we consider 1) inference when working with big data that are usually biased, challenging the assumed inferential superiority of data whose number of observations, n, approaches the population size, N (n -> N). We also emphasise 2) the need for analyses that answer questions of practical significance or with greater emphasis on the size of the effect, rather than the …
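The first challenge in the abstract above (a biased sample does not become trustworthy merely because n approaches N) can be illustrated with a small simulation. This is a minimal sketch and not taken from the paper: the standard-normal population, the selection rule (only values above -0.5 are observed), and the function name `simulate` are all assumptions chosen for illustration.

```python
import random


def simulate(pop_size=200_000, small_n=1_000, seed=0):
    """Compare a large biased sample with a small random sample.

    Hypothetical setup: the population is standard normal, and the
    "big data" source only records units whose value exceeds -0.5.
    """
    rng = random.Random(seed)
    population = [rng.gauss(0.0, 1.0) for _ in range(pop_size)]
    true_mean = sum(population) / pop_size

    # Large but biased sample: covers most of the population (n close to N),
    # yet systematically misses the low values.
    big_biased = [v for v in population if v > -0.5]

    # Small simple random sample drawn without replacement.
    small_random = rng.sample(population, small_n)

    biased_mean = sum(big_biased) / len(big_biased)
    random_mean = sum(small_random) / len(small_random)

    return {
        "big_biased_n": len(big_biased),
        "small_random_n": small_n,
        "big_biased_error": abs(biased_mean - true_mean),
        "small_random_error": abs(random_mean - true_mean),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value}")
```

Under these assumptions the biased sample contains well over half the population, yet its mean stays far from the population mean, while the much smaller random sample lands close to it. Collecting more data from the same biased source would not shrink that gap.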


Data Mining And Machine Learning To Improve Northern Florida’s Foster Care System, Daniel Oldham, Nathan Foster, Mihhail Berezovski Jun 2019

Beyond: Undergraduate Research Journal

The purpose of this research project is to use statistical analysis, data mining, and machine learning techniques to identify factors in child welfare service records that could lead to a child entering the foster care system multiple times. This would allow us to predict a case’s outcome accurately based on these factors. We were provided with eight years of data in the form of multiple spreadsheets from Partnership for Strong Families (PSF), a child welfare services organization based in Gainesville, Florida, which is contracted by the Florida Department of Children and Families (DCF). This data contained a …


The Billion Object Platform (BOP): A System To Lower Barriers To Support Big, Streaming, Spatio-Temporal Data Sources, Devika Kakkar, Ben Lewis, David Smiley, Ariel Nunez Sep 2017

Free and Open Source Software for Geospatial (FOSS4G) Conference Proceedings

With funding from the Sloan Foundation and Harvard Dataverse, the Harvard Center for Geographic Analysis (CGA) has developed a big spatio-temporal data visualization platform called the Billion Object Platform or "BOP". The goal of the project is to lower barriers for scholars who wish to access large, streaming, spatio-temporal datasets. Because streaming data, once archived, grows large quickly, and because most GIS systems don't support interactive visualization of millions of objects, a new platform was needed. The BOP is loaded with the latest billion geo-tweets and is fed a real-time stream of about 1 million tweets per day. The CGA …


Optimizing Spatiotemporal Analysis Using Multidimensional Indexing With GeoWave, Richard Fecher, Michael A. Whitby Sep 2017

Free and Open Source Software for Geospatial (FOSS4G) Conference Proceedings

The open source software GeoWave bridges the gap between geographic information systems and distributed computing. It does so by preserving the locality of multidimensional data when indexing it into a single-dimensional key-value store, using space-filling curves. This means that similar values in each dimension are stored physically close together in the datastore. We demonstrate the efficiencies and benefits of the GeoWave indexing algorithm to store and query billions of spatiotemporal data points. We show how this indexing strategy can be used to reduce query and processing times by multiple orders of magnitude using publicly available taxi trip data published by …
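The general idea of mapping multidimensional data onto a single-dimensional key with a space-filling curve can be sketched with a Z-order (Morton) curve. This is an illustration of the technique only and is not GeoWave's API or its actual indexing algorithm: the choice of curve, the two-dimensional longitude/latitude setup, the 16-bit resolution, and the function names `quantize`, `interleave_bits`, and `z_order_key` are all assumptions made for this example.

```python
def quantize(value: float, lo: float, hi: float, bits: int = 16) -> int:
    """Map a value in [lo, hi] to an integer cell index in [0, 2**bits - 1]."""
    max_cell = (1 << bits) - 1
    clamped = min(max(value, lo), hi)
    return int((clamped - lo) / (hi - lo) * max_cell)


def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # x bits occupy even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # y bits occupy odd positions
    return key


def z_order_key(lon: float, lat: float, bits: int = 16) -> int:
    """Single-dimensional sort key for a longitude/latitude pair."""
    x = quantize(lon, -180.0, 180.0, bits)
    y = quantize(lat, -90.0, 90.0, bits)
    return interleave_bits(x, y, bits)


if __name__ == "__main__":
    # Hypothetical points: two close together in New York, one in Sydney.
    points = {
        "nyc_a": (-73.9857, 40.7484),
        "nyc_b": (-73.9851, 40.7589),
        "sydney": (151.2093, -33.8688),
    }
    # Sorting by key is what a sorted key-value store would do on ingest.
    for name, (lon, lat) in sorted(points.items(), key=lambda p: z_order_key(*p[1])):
        print(name, z_order_key(lon, lat))
```

Because the high-order bits of both coordinates lead the key, points that share a coarse grid cell share a key prefix and end up adjacent in a sorted store, so a bounding-box query can be served by scanning a small number of key ranges. A third dimension such as time could be interleaved in the same way.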