Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Fast And Adaptive Indexing Of Multi-Dimensional Observational Data, Sheng Wang, David Maier, Beng Chin Ooi Oct 2016

Computer Science Faculty Publications and Presentations

Sensing devices generate tremendous amounts of data each day, including large quantities of multi-dimensional measurements. These data are expected to be immediately available for real-time analytics as they are streamed into storage. Such scenarios pose challenges to state-of-the-art indexing methods, which must support not only efficient queries but also frequent updates. We propose here a novel indexing method that ingests multi-dimensional observational data in real time. This method primarily guarantees extremely high throughput for data ingestion, while it can be continuously refined in the background to improve query efficiency. Instead of representing collections of points using Minimal Bounding …


S-Store: Streaming Meets Transaction Processing, John Meehan, Nesime Tatbul, Cansu Aslantas, Ugur Cetintemel, Jiang Du, Tim Kraska, Samuel Madden, David Maier, Andrew Pavlo, Michael Stonebraker, Kristin A. Tufte, Hao Wang Jan 2015

Computer Science Faculty Publications and Presentations

Stream processing addresses the needs of real-time applications. Transaction processing addresses the coordination and safety of short atomic computations. Heretofore, these two modes of operation existed in separate, stove-piped systems. In this work, we attempt to fuse the two computational paradigms in a single system called S-Store. In this way, S-Store can simultaneously accommodate OLTP and streaming applications. We present a simple transaction model for streams that integrates seamlessly with a traditional OLTP system. We chose to build S-Store as an extension of H-Store, an open-source, in-memory, distributed OLTP database system. By implementing S-Store in this way, we can make …


A Theory Of Name Resolution, Pierre Néron, Andrew Tolmach, Eelco Visser, Guido Wachsmuth Jan 2015

Computer Science Faculty Publications and Presentations

We describe a language-independent theory for name binding and resolution, suitable for programming languages with complex scoping rules including both lexical scoping and modules. We formulate name resolution as a two-stage problem. First, a language-independent scope graph is constructed using language-specific rules from an abstract syntax tree. Then, references in the scope graph are resolved to corresponding declarations using a language-independent resolution process. We introduce a resolution calculus as a concise, declarative, and language-independent specification of name resolution. We develop a resolution algorithm that is sound and complete with respect to the calculus. Based on the resolution calculus we …
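
The two-stage idea in this abstract can be illustrated with a minimal sketch. It assumes a heavily simplified scope graph that has only parent edges (plain lexical scoping, no modules or imports). The names `Scope`, `declare`, and `resolve` are hypothetical and chosen for illustration; this is not the authors' resolution calculus or algorithm.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Scope:
    """A node in a simplified scope graph (hypothetical structure)."""
    name: str
    parent: Optional["Scope"] = None          # edge to the enclosing scope
    declarations: dict = field(default_factory=dict)

    def declare(self, identifier: str, info: str) -> None:
        """Record a declaration of `identifier` in this scope."""
        self.declarations[identifier] = info


def resolve(identifier: str, scope: Optional[Scope]):
    """Follow parent edges outward until a declaration is found.

    Returns (scope name, declaration info), or None if unresolved.
    The innermost declaration shadows any outer one.
    """
    while scope is not None:
        if identifier in scope.declarations:
            return scope.name, scope.declarations[identifier]
        scope = scope.parent
    return None


# Stage 1: build the scope graph. A real front end would derive this
# from an abstract syntax tree; here it is built by hand.
global_scope = Scope("global")
global_scope.declare("x", "global x")
global_scope.declare("y", "global y")
function_scope = Scope("f", parent=global_scope)
function_scope.declare("x", "local x")

# Stage 2: resolve references against the graph only.
resolved_x = resolve("x", function_scope)   # shadowed by the local x
resolved_y = resolve("y", function_scope)   # found in the enclosing scope
resolved_z = resolve("z", function_scope)   # no declaration anywhere
```

Note that stage 2 never inspects the syntax tree: once the graph exists, resolution depends only on scopes, edges, and declarations, which is what makes that stage language-independent in this sketch.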


Based On Repeated Experience, System For Modification Of Expression And Negating Overload From Media And Optimizing Referential Efficiency, Peter R. Badovinatz, Veronika M. Megler Jun 2013

Computer Science Faculty Publications and Presentations

Content items are revealed to a user based on whether they have been previously reviewed by the user. A number of content items are thus received over time. The content items may be discrete content items, or may be portions of a content stream, and may be received over different media. For each content item, it is determined whether the content item was previously reviewed by a user. Where the content item was not previously reviewed, the item is revealed to the user, such as by being displayed or announced to the user. Where the content item was previously reviewed, …
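
The decision step described above can be sketched as a simple partition. This is an illustration under stated assumptions, not the method claimed in the publication: items are assumed to carry a stable identifier, an item revealed once is assumed to count as reviewed afterwards, and the function name `partition_items` is hypothetical. Because the abstract is truncated before it says what happens to previously reviewed items, the sketch only separates them and leaves their handling open.

```python
def partition_items(items, reviewed_ids):
    """Split incoming items into those to reveal and those already reviewed.

    `items` is an iterable of (item_id, content) pairs in arrival order.
    `reviewed_ids` is a collection of identifiers the user has already seen.
    Both shapes are assumptions made for this sketch.
    """
    seen = set(reviewed_ids)      # copy, so the caller's data is not mutated
    to_reveal = []
    previously_reviewed = []
    for item_id, content in items:
        if item_id in seen:
            previously_reviewed.append((item_id, content))
        else:
            to_reveal.append((item_id, content))
            # Assumption: revealing an item marks it as reviewed from now on.
            seen.add(item_id)
    return to_reveal, previously_reviewed


incoming = [("a", "news 1"), ("b", "news 2"), ("a", "news 1")]
to_reveal, previously_reviewed = partition_items(incoming, {"b"})
```

Here the second copy of item `a` lands in `previously_reviewed` because its first occurrence was revealed earlier in the same stream.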


Data Near Here: Bringing Relevant Data Closer To Scientists, Veronika M. Megler, David Maier May 2013

Computer Science Faculty Publications and Presentations

Large scientific repositories run the risk of losing value as their holdings expand, if it means increased effort for a scientist to locate particular datasets of interest. We discuss the challenges that scientists face in locating relevant data, and present our work in applying Information Retrieval techniques to dataset search, as embodied in the Data Near Here application.


The Problem Of Semantics In The Metadata Mess, Veronika Margaret Megler, David Maier Jan 2013

Computer Science Faculty Publications and Presentations

This presentation addresses problems related to the volume of available scientific data, and its accessibility or inaccessibility to researchers who seek it. Topics addressed include metadata and reducing semantic diversity, especially as they relate to geospatial and other architectures.


Taming The Metadata Mess, Veronika Margaret Megler Jan 2013

Computer Science Faculty Publications and Presentations

The rapid growth of scientific data shows no sign of abating. This growth has led to a new problem: with so much scientific data at hand, stored in thousands of datasets, how can scientists find the datasets most relevant to their research interests? We have addressed this problem by adapting Information Retrieval techniques, developed for searching text documents, into the world of (primarily numeric) scientific data. We propose an approach that uses a blend of automated and “semi-curated” methods to extract metadata from large archives of scientific data, then evaluates ranked searches over this metadata. We describe a challenge identified …


Finding Haystacks With Needles: Ranked Search For Data Using Geospatial And Temporal Characteristics, Veronika Margaret Megler, David Maier Jan 2011

Computer Science Faculty Publications and Presentations

The past decade has seen an explosion in the number and types of environmental sensors deployed, many of which provide a continuous stream of observations. Each individual observation consists of one or more sensor measurements, a geographic location, and a time. With billions of historical observations stored in diverse databases and in thousands of datasets, scientists have difficulty finding relevant observations. We present an approach that creates consistent geospatial-temporal metadata from large repositories of diverse data by blending curated and automated extracts. We describe a novel query method over this metadata that returns ranked search results to a query with …


Concurrency Control, Version Management And Transactions In Advanced Database Systems, Jonathan Walpole, Muntuck Yap Feb 1991

Computer Science Faculty Publications and Presentations

This document constitutes the final deliverable for the research project titled “An Investigation of Selected Issues in Transaction Mechanism Design for Object Oriented Databases.” The document describes our ideas for extending the traditional transaction concept for use in object-oriented databases, and concentrates specifically on providing an underlying model to support the concurrency control and version management aspects of the problem. The ideas presented here are not, however, restricted to the domain of object-oriented databases. They are more generally applicable to database systems that require flexibility in their versioning and concurrency control policies.

In this document we define a …