Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Portland State University

Articles 1 - 22 of 22

Full-Text Articles in Databases and Information Systems

Forest Park Trail Monitoring, Adan Robles, Colton S. Maybee, Erin Dougherty Aug 2021

REU Final Reports

Forest Park is one of the largest public parks in the United States, with over 40 trails to choose from when planning a hiking trip. One of the park's main problems is that there are too many trails to monitor, and many of them extend over 3 miles. Due to these circumstances, trails are not checked frequently, and hikers are forced to hike trails in the area with no warning of potential hazards they may encounter. In this paper I researched how Forest Park currently monitors its trails and then set a goal to solve the problem. We …


Digitally Reporting Trail Obstructions In Forest Park, Colton S. Maybee Aug 2021

REU Final Reports

The inclusion of technology on the trail can lead to better experiences for everyone involved in the hobby. Hikers can play a more prominent role in trail maintenance by providing better reports of obstructions while directly on the trail. This paper describes the project of revamping the obstruction-reporting system used at Forest Park in Portland, Oregon. Most of my contributions to the project focus on mobile app development, with some research into path-planning algorithms related to the continuation of this project.


Client Access Feature Engineering For The Homeless Community Of The City Of Portland, Oswaldo Ceballos Jr Aug 2021

altREU Projects

Given the severity of homelessness in many cities across the country, the project at hand attempts to assist a service-provider organization called Central City Concern (CCC) with its mission of providing services to the community of Portland. These services include housing, recovery, health care, and jobs. With many different types of services available through the work of CCC, there exists an abundance of information and data pertaining to the individuals who interact with the CCC service system. The goal of this project is to perform an exploratory analysis and feature engineering of the existing datasets CCC has collected over the …
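
To make the feature-engineering step concrete, here is a minimal pandas sketch. The table layout and column names (client_id, service_type, visit_date) are hypothetical stand-ins for illustration only; the actual CCC schema is not reproduced in the abstract.

```python
import pandas as pd

# Hypothetical service-visit records; these columns are assumptions.
visits = pd.DataFrame({
    "client_id": [1, 1, 1, 2, 2, 3],
    "service_type": ["housing", "health", "housing", "jobs", "health", "recovery"],
    "visit_date": pd.to_datetime([
        "2020-01-05", "2020-02-10", "2020-06-01",
        "2020-03-15", "2020-03-20", "2020-07-07",
    ]),
})

# Per-client engineered features: visit count, service breadth, and
# tenure (days between first and last contact with the service system).
features = visits.groupby("client_id").agg(
    n_visits=("visit_date", "count"),
    n_service_types=("service_type", "nunique"),
    tenure_days=("visit_date", lambda d: (d.max() - d.min()).days),
)
print(features)
```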


Data Warehousing Class Project Report, Gaya Haciane, Chuan Chieh Lu, Rassaniya Lerdphayakkarat, Rudraxi Mitra Jan 2018

Engineering and Technology Management Student Projects

Data mining is widely described or defined as the discipline of "making sense of the data." In today's day and age, the rising ubiquity of information calls for more advanced and developed techniques to mine the data and come up with insights. Data mining finds applications in many different fields and industries, whether in embryology, crops, elections, or business marketing. It is not a wild assumption that every organization in the world either has some data mining capability of its own, or its main activity necessitates it and a third-party organization does it for them. One …


Exploratory Reconstructability Analysis Of Accident Tbi Data, Martin Zwick, Nancy Ann Carney, Rosemary Nettleton Jan 2018

Systems Science Faculty Publications and Presentations

This paper describes the use of reconstructability analysis to perform a secondary study of traumatic brain injury data from automobile accidents. Neutral searches were done and their results displayed with a hypergraph. Directed searches, using both variable-based and state-based models, were applied to predict performance on two cognitive tests and one neurological test. Very simple state-based models gave large uncertainty reductions for all three DVs and sizeable improvements in percent correct for the two cognitive test DVs which were equally sampled. Conditional probability distributions for these models are easily visualized with simple decision trees. Confounding variables and counter-intuitive findings are …


Fast And Adaptive Indexing Of Multi-Dimensional Observational Data, Sheng Wang, David Maier, Beng Chin Ooi Oct 2016

Computer Science Faculty Publications and Presentations

Sensing devices generate tremendous amounts of data each day, which include large quantities of multi-dimensional measurements. These data are expected to be immediately available for real-time analytics as they are streamed into storage. Such scenarios pose challenges to state-of-the-art indexing methods, as they must not only support efficient queries but also frequent updates. We propose here a novel indexing method that ingests multi-dimensional observational data in real time. This method primarily guarantees extremely high throughput for data ingestion, while it can be continuously refined in the background to improve query efficiency. Instead of representing collections of points using Minimal Bounding …
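
The ingest-now, refine-later idea can be sketched in miniature. The following toy one-dimensional index is an illustration under assumptions (a single sorted run plus an append-only buffer), not the paper's actual multi-dimensional structure.

```python
import bisect

class AdaptiveIndex:
    """Write-fast, refine-later index sketch: appends are O(1); a
    background refinement step merges the buffer into a sorted run
    so later range queries can use binary search."""

    def __init__(self):
        self.sorted_run = []   # refined, query-efficient portion
        self.buffer = []       # raw, recently ingested portion

    def ingest(self, key):
        self.buffer.append(key)          # cheap ingestion path

    def refine(self):
        # Background refinement: fold the buffer into the sorted run.
        self.sorted_run = sorted(self.sorted_run + self.buffer)
        self.buffer = []

    def range_query(self, lo, hi):
        # Binary search over the refined part ...
        i = bisect.bisect_left(self.sorted_run, lo)
        j = bisect.bisect_right(self.sorted_run, hi)
        hits = self.sorted_run[i:j]
        # ... plus a linear scan of whatever is not yet refined.
        hits += [k for k in self.buffer if lo <= k <= hi]
        return hits

idx = AdaptiveIndex()
for k in [42, 7, 19, 88, 3]:
    idx.ingest(k)
idx.refine()
print(idx.range_query(5, 50))  # [7, 19, 42]
```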


Search Tool That Utilizes Scientific Metadata Matched Against User-Entered Parameters, Veronika Margaret Megler, David Maier Oct 2013

Computer Science Faculty Publications and Presentations

A method for providing proximate dataset recommendations can begin with the creation of metadata records corresponding to datasets that represent scientific data by a scientific dataset search tool. The metadata records can conform to a standardized structural definition, and may be hierarchical. Values for the data elements of the metadata records can be contained within the datasets. Metadata records with a value that is proximate to a user-entered search parameter can be identified. A proximity score can be calculated for each identified metadata record. The proximity score can express a relevance of the corresponding dataset to the user-entered search parameters. …
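
As a rough illustration of proximity scoring (not the patented method itself), one might score each metadata record by how close its summary values fall to the user-entered parameters. The field names and the linear-decay formula below are assumptions for the sketch.

```python
def proximity_score(record, query, tolerances):
    """Sum of per-field closeness; each field decays linearly to zero
    at its tolerance, so exact matches score highest."""
    score = 0.0
    for field, wanted in query.items():
        value = record.get(field)
        if value is None:
            continue                      # missing metadata contributes 0
        score += max(0.0, 1.0 - abs(value - wanted) / tolerances[field])
    return score

records = [
    {"name": "cruise_17", "mean_temp_c": 11.2, "mean_depth_m": 40.0},
    {"name": "cruise_23", "mean_temp_c": 18.9, "mean_depth_m": 5.0},
]
query = {"mean_temp_c": 12.0, "mean_depth_m": 35.0}
tolerances = {"mean_temp_c": 5.0, "mean_depth_m": 50.0}

ranked = sorted(records, key=lambda r: proximity_score(r, query, tolerances),
                reverse=True)
print([r["name"] for r in ranked])        # cruise_17 ranks first
```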


Based On Repeated Experience, System For Modification Of Expression And Negating Overload From Media And Optimizing Referential Efficiency, Peter R. Badovinatz, Veronika M. Megler Jun 2013

Computer Science Faculty Publications and Presentations

Content items are revealed to a user based on whether they have been previously reviewed by the user. A number of content items are thus received over time. The content items may be discrete content items, or may be portions of a content stream, and may be received over different media. For each content item, it is determined whether the content item was previously reviewed by a user. Where the content item was not previously reviewed, the item is revealed to the user, such as by being displayed or announced to the user. Where the content item was previously reviewed, …
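
A minimal sketch of the core mechanism, assuming content items are plain strings and that "previously reviewed" can be tracked with a set of content hashes (the real system also covers streams and multiple media types):

```python
import hashlib

class NoveltyFilter:
    """Reveal a content item only if the user has not reviewed it before."""

    def __init__(self):
        self.seen = set()

    def reveal(self, item: str) -> bool:
        digest = hashlib.sha256(item.encode()).hexdigest()
        if digest in self.seen:
            return False          # previously reviewed: suppress
        self.seen.add(digest)
        return True               # new to the user: display or announce

f = NoveltyFilter()
for msg in ["storm warning", "lunch?", "storm warning"]:
    if f.reveal(msg):
        print(msg)   # prints "storm warning" and "lunch?" once each
```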


Data Near Here: Bringing Relevant Data Closer To Scientists, Veronika M. Megler, David Maier May 2013

Computer Science Faculty Publications and Presentations

Large scientific repositories run the risk of losing value as their holdings expand, if it means increased effort for a scientist to locate particular datasets of interest. We discuss the challenges that scientists face in locating relevant data, and present our work in applying Information Retrieval techniques to dataset search, as embodied in the Data Near Here application.


The Problem Of Semantics In The Metadata Mess, Veronika Margaret Megler, David Maier Jan 2013

Computer Science Faculty Publications and Presentations

This presentation addresses problems related to the volume of available scientific data and its accessibility or inaccessibility to the researchers who seek it. Topics addressed include metadata and reducing semantic diversity, especially as they relate to geospatial and other architectures.


Taming The Metadata Mess, Veronika Margaret Megler Jan 2013

Computer Science Faculty Publications and Presentations

The rapid growth of scientific data shows no sign of abating. This growth has led to a new problem: with so much scientific data at hand, stored in thousands of datasets, how can scientists find the datasets most relevant to their research interests? We have addressed this problem by adapting Information Retrieval techniques, developed for searching text documents, into the world of (primarily numeric) scientific data. We propose an approach that uses a blend of automated and “semi-curated” methods to extract metadata from large archives of scientific data, then evaluates ranked searches over this metadata. We describe a challenge identified …


Finding Haystacks With Needles: Ranked Search For Data Using Geospatial And Temporal Characteristics, Veronika Margaret Megler, David Maier Jan 2011

Computer Science Faculty Publications and Presentations

The past decade has seen an explosion in the number and types of environmental sensors deployed, many of which provide a continuous stream of observations. Each individual observation consists of one or more sensor measurements, a geographic location, and a time. With billions of historical observations stored in diverse databases and in thousands of datasets, scientists have difficulty finding relevant observations. We present an approach that creates consistent geospatial-temporal metadata from large repositories of diverse data by blending curated and automated extracts. We describe a novel query method over this metadata that returns ranked search results to a query with …
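
One way to picture the ranking idea, purely as an illustrative sketch: score each dataset by the overlap between its geospatial-temporal footprint and the query's bounding box. The interval representation and multiplicative scoring below are assumptions, not the paper's actual ranking function.

```python
def interval_overlap(a_lo, a_hi, b_lo, b_hi):
    """Fraction of interval b (the query) covered by interval a (the dataset)."""
    inter = max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))
    span = b_hi - b_lo
    return inter / span if span else 0.0

def score(dataset, query):
    # Multiply spatial and temporal overlaps so a dataset disjoint in any
    # dimension drops to zero.
    return (interval_overlap(*dataset["lat"], *query["lat"])
            * interval_overlap(*dataset["lon"], *query["lon"])
            * interval_overlap(*dataset["time"], *query["time"]))

query = {"lat": (45.0, 46.0), "lon": (-123.0, -122.0), "time": (2008.0, 2009.0)}
datasets = [
    {"name": "buoy_A", "lat": (45.2, 45.8), "lon": (-122.9, -122.4),
     "time": (2007.5, 2008.6)},
    {"name": "cruise_B", "lat": (44.0, 44.5), "lon": (-125.0, -124.0),
     "time": (2008.0, 2009.0)},
]
ranked = sorted(datasets, key=lambda d: score(d, query), reverse=True)
print([d["name"] for d in ranked])   # buoy_A first; cruise_B overlaps nowhere
```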


Relativistic Red-Black Trees, Philip William Howard, Jonathan Walpole Jan 2011

Computer Science Faculty Publications and Presentations

Operating system performance and scalability on shared-memory many-core systems depend critically on efficient access to shared data structures. Scalability has proven difficult to achieve for many data structures. In this paper we present a novel and highly scalable concurrent red-black tree. Red-black trees are widely used in operating systems, but typically exhibit poor scalability. Our red-black tree has linear read scalability, uncontended read performance that is at least 25% faster than other known approaches, and deterministic lookup times for a given tree size, making it suitable for real-time applications.


Can Infopipes Facilitate Reuse In A Traffic Application?, Emerson Murphy-Hill, Chuan-Kai Lin, Andrew P. Black, Jonathan Walpole Oct 2005

Computer Science Faculty Publications and Presentations

Infopipes are presented as reusable building blocks for streaming applications. To evaluate this claim, we have built a significant traffic application in Smalltalk using Infopipes. This poster presents a traffic problem and solution, a short introduction to Infopipes, and the types of reuse Infopipes facilitate in our implementation.
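
The flavor of Infopipe composition can be suggested with Python generators standing in for the paper's Smalltalk components; the stage names and traffic readings here are invented for illustration only.

```python
# Generator stages compose like pipe sections: source -> smooth -> alerts.
def source(readings):
    for r in readings:
        yield r

def smooth(pipe, window=3):
    buf = []
    for speed in pipe:
        buf.append(speed)
        buf = buf[-window:]
        yield sum(buf) / len(buf)        # moving-average filter stage

def congestion_alerts(pipe, threshold=25.0):
    for speed in pipe:
        if speed < threshold:
            yield f"congestion: smoothed speed {speed:.1f} mph"

readings = [55, 52, 30, 22, 18, 20, 45]  # invented loop-detector speeds
for alert in congestion_alerts(smooth(source(readings))):
    print(alert)
```

Because each stage only consumes and produces a stream, stages can be reused or reordered without touching their internals, which is the kind of reuse the poster evaluates.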


Directed Extended Dependency Analysis For Data Mining, Thaddeus T. Shannon, Martin Zwick Jan 2004

Systems Science Faculty Publications and Presentations

Extended dependency analysis (EDA) is a heuristic search technique for finding significant relationships between nominal variables in large data sets. The directed version of EDA searches for maximally predictive sets of independent variables with respect to a target dependent variable. The original implementation of EDA was an extension of reconstructability analysis. Our new implementation adds a variety of statistical significance tests at each decision point that allow the user to tailor the algorithm to a particular objective. It also utilizes data structures appropriate for the sparse data sets customary in contemporary data mining problems. Two examples that illustrate different approaches …
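
In the same spirit (though not the paper's implementation), a directed search can be sketched as a greedy loop that grows the IV set and keeps a candidate only when a chi-square test of the composite contingency table is significant and improves on the current model.

```python
import pandas as pd
from scipy.stats import chi2_contingency

def model_pvalue(df, ivs, dv):
    """Chi-square p-value between a composite IV set and the DV."""
    composite = df[ivs].astype(str).agg("|".join, axis=1)
    return chi2_contingency(pd.crosstab(composite, df[dv]))[1]

def directed_search(df, dv, candidates, alpha=0.05):
    """Greedily grow the most predictive IV set, gated by significance."""
    chosen, current_p = [], 1.0
    while candidates:
        scores = {iv: model_pvalue(df, chosen + [iv], dv) for iv in candidates}
        best = min(scores, key=scores.get)
        # Keep the variable only if it is significant and strictly improves
        # on the current model; otherwise stop searching.
        if not (scores[best] < alpha and scores[best] < current_p):
            break
        chosen.append(best)
        candidates = [c for c in candidates if c != best]
        current_p = scores[best]
    return chosen

# "A" determines "Y"; "B" is noise, so the search stops after one pick.
df = pd.DataFrame({
    "A": ["a", "a", "b", "b"] * 10,
    "B": ["x", "y"] * 20,
    "Y": ["hi", "hi", "lo", "lo"] * 10,
})
print(directed_search(df, dv="Y", candidates=["A", "B"]))  # ['A']
```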


An Overview Of Reconstructability Analysis, Martin Zwick Jan 2004

Systems Science Faculty Publications and Presentations

This paper is an overview of reconstructability analysis (RA), a discrete multivariate modeling methodology developed in the systems literature; an earlier version of this tutorial is Zwick (2001). RA was derived from Ashby (1964), and was developed by Broekstra, Cavallo, Cellier, Conant, Jones, Klir, Krippendorff, and others (Klir, 1986, 1996). RA resembles and partially overlaps log-linear (LL) statistical methods used in the social sciences (Bishop et al., 1978; Knoke and Burke, 1980). RA also resembles and overlaps methods used in logic design and machine learning (LDL) in electrical and computer engineering (e.g. Perkowski et al., 1997). Applications of RA, like …


Reconstructability Analysis With Fourier Transforms, Martin Zwick Jan 2004

Systems Science Faculty Publications and Presentations

Fourier methods used in two‐ and three‐dimensional image reconstruction can be used also in reconstructability analysis (RA). These methods maximize a variance‐type measure instead of information‐theoretic uncertainty, but the two measures are roughly collinear and the Fourier approach yields results close to that of standard RA. The Fourier method, however, does not require iterative calculations for models with loops. Moreover, the error in Fourier RA models can be assessed without actually generating the full probability distributions of the models; calculations scale with the size of the data rather than the state space. State‐based modeling using the Fourier approach is also …


Infosphere Project: An Overview, Calton Pu, Jonathan Walpole Mar 2001

Computer Science Faculty Publications and Presentations

We describe the Infosphere project, which is building the systems software support for information-driven applications such as digital libraries and electronic commerce. The main technical contribution is the Infopipe abstraction to support information flow with quality of service. Using building blocks such as program specialization, software feedback, domain-specific languages, and personalized information filtering, the Infopipe software generates code and manages resources to provide the specified quality of service, with support for composition and restructuring.


Prestructuring Neural Networks Via Extended Dependency Analysis With Application To Pattern Classification, George G. Lendaris, Thaddeus T. Shannon, Martin Zwick Mar 1999

Systems Science Faculty Publications and Presentations

We consider the problem of matching domain-specific statistical structure to neural-network (NN) architecture. In past work we have considered this problem in the function approximation context; here we consider the pattern classification context. General Systems Methodology tools for finding problem-domain structure suffer exponential scaling of computation with respect to the number of variables considered. Therefore we introduce the use of Extended Dependency Analysis (EDA), which scales only polynomially in the number of variables, for the desired analysis. Based on EDA, we demonstrate a number of NN pre-structuring techniques applicable for building neural classifiers. An example is provided in which EDA …


Quality Of Service Specification For Multimedia Presentations, Richard Staehli, Jonathan Walpole, David Maier Nov 1995

Computer Science Faculty Publications and Presentations

The bandwidth limitations of multimedia systems force tradeoffs between presentation data fidelity and real-time performance. For example, digital video is commonly encoded with lossy compression to reduce bandwidth and frames may be skipped during playback to maintain synchronization. These tradeoffs depend on device performance and physical data representations that are hidden by a database system. If a multimedia database is to support digital video and other continuous media data types, we argue that the database should provide a Quality of Service (QOS) interface to allow application control of presentation timing and information loss tradeoffs.

This paper proposes a data model …
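
To make the timing/information-loss tradeoff concrete, here is a toy sketch of a playback loop that drops late frames within an application-supplied tolerance. The QOS fields and the drop policy are illustrative assumptions, not the paper's actual data model.

```python
from dataclasses import dataclass

@dataclass
class QosSpec:
    frame_period_ms: float    # desired presentation timing
    max_drop_fraction: float  # information loss the application will accept

def playback(frame_ready_times, qos):
    """Skip late frames to hold synchronization, within the drop budget."""
    budget = qos.max_drop_fraction * len(frame_ready_times)
    shown = dropped = 0
    for i, ready in enumerate(frame_ready_times):
        deadline = i * qos.frame_period_ms
        if ready > deadline and dropped < budget:
            dropped += 1      # frame missed its slot: skip to stay in sync
        else:
            shown += 1        # on time, or drop budget exhausted: display
    return shown, dropped

# 30 fps target; the application tolerates dropping up to 10% of frames.
qos = QosSpec(frame_period_ms=33.3, max_drop_fraction=0.10)
ready = [i * 33.3 + (50 if i == 4 else 0) for i in range(10)]  # frame 4 late
print(playback(ready, qos))   # (9, 1)
```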


Constrained-Latency Storage Access: A Survey Of Application Requirements And Storage System Design Approaches, Richard Staehli, Jonathan Walpole Oct 1991

Computer Science Faculty Publications and Presentations

Applications with Constrained Latency Storage Access (CLSA) are those that have large storage needs and hard constraints on the amount of latency they can tolerate. Such applications present a problem when the storage technology that is cost effective and large enough cannot meet their latency constraints for demand fetching. Examples are found in the developing field of multimedia computing and, to a lesser extent, in real-time database literature. This paper examines the nature of timing constraints at the application-storage interface and defines a classification for both the synchronization constraints of the application and the latency characteristics of the storage system. …


Concurrency Control, Version Management And Transactions In Advanced Database Systems, Jonathan Walpole, Muntuck Yap Feb 1991

Computer Science Faculty Publications and Presentations

This document constitutes the final deliverable for the research project titled “An Investigation of Selected Issues in Transaction Mechanism Design for Object Oriented Databases.” The document describes our ideas for extending the traditional transaction concept for use in object oriented databases, and concentrates specifically on providing an underlying model to support the concurrency control and version management aspects of the problem. The ideas presented here are not however restricted to the domain of object oriented databases. They are more generally applicable to database systems that require flexibility in their versioning and concurrency control policies.

In this document we define a …