Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Theses/Dissertations

Data management

Articles 1 - 16 of 16

Full-Text Articles in Physical Sciences and Mathematics

Robust Algorithms For Clustering With Applications To Data Integration, Sainyam Galhotra Oct 2021

Doctoral Dissertations

A growing number of data-based applications are used for decision-making that has far-reaching consequences and significant societal impact. Entity resolution, community detection and taxonomy construction are some of the building blocks of these applications, and for these methods clustering is the fundamental underlying concept. Therefore, the importance of accurate, robust and scalable methods for clustering cannot be overstated. We tackle the various facets of clustering with a multi-pronged approach described below. 1. While identification of clusters that refer to different entities is challenging for automated strategies, it is relatively easy for humans. We study the robustness of clustering methods that …


Data-Driven Learning For Robot Physical Intelligence, Leidi Zhao Aug 2021

Dissertations

Physical intelligence, which emphasizes physical capabilities such as dexterous manipulation and dynamic mobility, is essential for robots to physically coexist with humans. Much research on robot physical intelligence has achieved success on hyper robot motor capabilities, but mostly through heavily case-specific engineering. Meanwhile, in terms of robots acquiring skills in a ubiquitous manner, robot learning from human demonstration (LfD) has achieved great progress, but still has limitations in handling dynamic skills and compound actions. In this dissertation, a composite learning scheme which goes beyond LfD and integrates robot learning from human definition, demonstration, and evaluation is proposed. This method tackles …


Audubon Data Project Final Report, Askhat Beygenov, Valinur Kutlambetov, Shrikant Patel, Phoebe Roberts, Ulfat Sayyed, Shriram Sivaraman Jan 2018

School of Professional Studies

The Audubon Data Project was initiated as a Clark University Capstone project. The project’s client, Mass Audubon’s Shaping the Future of Your Community program, had identified a need to improve their data management methods and make better use of their data. The Capstone team, composed of Clark University graduate students, met with the client regularly to review the current state of the data and potential improvements to be made. The process began with a data review. During the review we worked with the client to explicitly define the purposes and requirements of the data, the current process for updating and …


Recommendation Support For Multi-Attribute Databases, Jilian Zhang Jun 2014

Dissertations and Theses Collection (Open Access)

This dissertation studies the subject of providing recommendation support for multi-attribute databases. Recommendation is an important and very useful information evaluation mechanism that explores a database of huge volume, and retrieves from it the interesting data items (tuples) for users based on their preferences.


A Farm Management Information System With Task-Specific, Collaborative Mobile Apps And Cloud Storage Services, Jonathan Tyler Welte Apr 2014

Open Access Theses

Modern production agriculture is beginning to advance beyond deterministic, scheduled operations between relatively few people to larger scale, information-driven efficiency in order to respond to the challenges of field variability and meet the needs of a growing population. Since no two farms are the same with respect to information and management structure, a specialized farm management information system (FMIS) which is tailored to the realities on the ground of individual farms is likely to be more effective than the generalized FMIS available today.

This thesis presents the design of an FMIS using proven user-centered design principles. This approach resulted in the …


High-Performance Processing Of Continuous Uncertain Data, Thanh Thi Lac Tran May 2013

Open Access Dissertations

Uncertain data has arisen in a growing number of applications such as sensor networks, RFID systems, weather radar networks, and digital sky surveys. The fact that the raw data in these applications is often incomplete, imprecise and even misleading has two implications: (i) the raw data is not suitable for direct querying, and (ii) feeding the uncertain data into existing systems produces results of unknown quality.

This thesis presents a system for uncertain data processing that has two key functionalities: (i) capturing and transforming raw noisy data to rich queryable tuples that carry attributes needed for query processing with quantified uncertainty, …


Metocean Data Management And Modeling To Support U.S. Offshore Wind Power Development In The Mid-Atlantic, Whitney Amanda West May 2012

Masters Theses, 2010-2019

Efforts to encourage more conservative electricity consumption, through public awareness campaigns and government-mandated energy efficiency standards, have consistently been overshadowed by population increase and increased standards of living, leading to higher electricity demand, year after year. Sufficient resources and technology exist to support the development of a robust offshore wind industry to help meet this rising demand, but a number of barriers unique to the U.S. have hindered progress. Addressing many of these obstacles involves resolving uncertainty issues related to development. Not only is there a general lack of data to provide stakeholders, developers, and governing authorities with sufficient information …


Effecting Data Quality Through Data Governance: A Case Study In The Financial Services Industry, Patrick Egan Jan 2011

Regis University Student Publications (comprehensive collection)

One of the most significant challenges faced by senior management today is implementing a data governance program to ensure that data is an asset to an organization's mission. New legislation aligned with continual regulatory oversight, increasing data volume growth, and the desire to improve data quality for decision making are driving forces behind data governance initiatives. Data governance involves reshaping existing processes and the way people view data along with the information technology required to create consistent, secure and defined processes for handling the quality of an organization's data. In examining attempts to move towards making data an asset …


Data Management And Wireless Transport For Large Scale Sensor Networks, Ming Li Sep 2010

Open Access Dissertations

Today many large scale sensor networks have emerged, which span many different sensing applications. Each of these sensor networks often consists of millions of sensors collecting data and supports thousands of users with diverse data needs. Between users and wireless sensors there is often a group of powerful servers that collect and process data from sensors and answer users' requests. To build such a large scale sensor network, we have to answer two fundamental research problems: i) what data to transmit from sensors to servers? ii) how to transmit the data over wireless links? Wireless sensors often cannot transmit …


Energy-Efficient Data Management In Wireless Sensor Networks, Chunyu Ai Jul 2010

Computer Science Dissertations

Wireless Sensor Networks (WSNs) are deployed widely for various applications. A variety of useful data are generated by these deployments. Since WSNs have limited resources and unreliable communication links, traditional data management techniques are not suitable. Therefore, designing effective data management techniques for WSNs becomes important. In this dissertation, we address three key issues of data management in WSNs. For data collection, a scheme of making some nodes sleep and estimating their values according to the other active nodes’ readings has been proved energy-efficient. For the purpose of improving the precision of estimation, we propose two powerful estimation models, Data …


Bay Audio Repair Website & Data Management Application, Michael Shelley Mar 2010

Computer Science and Software Engineering

The goal of this senior project was to build a website and software application to receive and manage audio equipment repair requests for a small startup company called Bay Audio Repair (BAR). Furthermore, it allowed me to gain experience in web development and software engineering practices, specifically requirements gathering, design and implementation. The website provides an online interface for BAR’s customers to request repairs and the application allows BAR employees to update the progress of a repair. Several technologies were used in the system’s construction: HTML, XML, PHP, and C#.


A Logistic Regression Analysis Of Utah Colleges Exit Poll Response Rates Using SAS Software, Clint W. Stevenson Oct 2006

Theses and Dissertations

In this study I examine voter response at an interview level using a dataset of 7562 voter contacts (including responses and nonresponses) in the 2004 Utah Colleges Exit Poll. In 2004, 4908 of the 7562 voters approached responded to the exit poll for an overall response rate of 65 percent. Logistic regression is used to estimate factors that contribute to a success or failure of each interview attempt. This logistic regression model uses interviewer characteristics, voter characteristics (both respondents and nonrespondents), and exogenous factors as independent variables. Voter characteristics such as race, gender, and age are strongly associated with response. …
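
The response-rate arithmetic quoted in this abstract can be checked with a short sketch. Only the two counts (7562 contacts, 4908 responses) come from the abstract; the variable names and the `logistic` helper are illustrative assumptions, and the study itself used SAS with interviewer, voter, and exogenous covariates that are not reproduced here.

```python
import math

# Counts quoted in the abstract above.
contacts = 7562    # voters approached
responses = 4908   # voters who responded

nonresponses = contacts - responses
response_rate = responses / contacts

# Log-odds (logit) of response. An intercept-only logistic regression
# fit to these counts would estimate this value as its intercept;
# the study's actual model adds covariates on top of it.
log_odds = math.log(responses / nonresponses)


def logistic(z: float) -> float:
    """Map a log-odds value back to a probability (hypothetical helper)."""
    return 1.0 / (1.0 + math.exp(-z))


print(f"response rate: {response_rate:.1%}")
print(f"log-odds of response: {log_odds:.3f}")
print(f"probability recovered from log-odds: {logistic(log_odds):.3f}")
```

Rounding the computed rate to the nearest whole percent gives the 65 percent stated in the abstract.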


Methodology For Integrating The Scenario Databases Of Simulation Systems, Emilia M. Colonese Jun 1999

Theses and Dissertations

The use of many different simulation systems by the United States Department of Defense has resulted in many different scenario data representations contained in heterogeneous databases. These heterogeneous databases all represent the same data concept, but have different semantics due to intrinsic variations among the data models. In this research, I describe a unified scenario database to allow interoperability and reuse of the scenario data components while avoiding the problems of data redundancy. Using the object oriented approach, the data and schema of the scenario databases, represented in an object oriented model, are integrated into a global database also represented …


Methodology For The Analysis And Design Of Internet Software Components Providing Relational Database Access Through The World Wide Web, Daniel L. Dipiro Mar 1998

Theses and Dissertations

This work examines the application of Internet software technologies to provide access to remote relational databases via the World Wide Web. The research applies these software technologies to assist the Air Force Institute of Technology Civilian Institute Program in improving operations and student to staff communication. An analysis of the existing Internet software technologies revealed several competing technologies capable of performing the same database access functions. The analysis further revealed weaknesses and inconsistencies in the existing AFIT/CI database. A methodology is proposed to assist in analyzing an existing development environment and in selecting among the competing technologies to provide the …


An Advanced Visualization Method For An Operations Research Analysis, Steven C. Oimoen Mar 1998

Theses and Dissertations

Visualizing multidimensional data using only two dimensions and conventional visualization techniques limits the understanding of the data set. Underlying structures or patterns within the data can easily go unnoticed. In order to gain additional insight into an analysis, visualization and multidimensional graphics should be incorporated into the analysis results. The results must ensure that the information portrayed is not misleading or misunderstood. The integrity of the data must be preserved throughout the transformation. The primary objective of this research effort is to identify techniques to visualize multidimensional data and then develop a software tool to display the multidimensional …


An Examination Of Multi-Tier Designs For Legacy Data Access, Michael L. Acker Dec 1997

Theses and Dissertations

This work examines the application of Java and the Common Object Request Broker Architecture (CORBA) to support access to remote databases via the Internet. The research applies these software technologies to assist an Air Force distance learning provider in improving the capabilities of its World Wide Web-based correspondence system. An analysis of the distance learning provider's operation revealed a strong dependency on a non-collocated legacy relational database. This dependency limits the distance learning provider's future web-based capabilities. A recommendation to improve operation by data replication is proposed, and the implementation details are provided for two alternative test systems that support …