Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Data Science

Theses/Dissertations

Data management

Articles 1 - 2 of 2

Full-Text Articles in Physical Sciences and Mathematics

Robust Algorithms For Clustering With Applications To Data Integration, Sainyam Galhotra Oct 2021

Doctoral Dissertations

A growing number of data-based applications are used to make decisions that have far-reaching consequences and significant societal impact. Entity resolution, community detection and taxonomy construction are some of the building blocks of these applications, and clustering is the fundamental concept underlying these methods. Therefore, the importance of accurate, robust and scalable clustering methods cannot be overstated. We tackle the various facets of clustering with a multi-pronged approach described below. 1. While identification of clusters that refer to different entities is challenging for automated strategies, it is relatively easy for humans. We study the robustness of clustering methods that …


An Examination Of Multi-Tier Designs For Legacy Data Access, Michael L. Acker Dec 1997

Theses and Dissertations

This work examines the application of Java and the Common Object Request Broker Architecture (CORBA) to support access to remote databases via the Internet. The research applies these software technologies to assist an Air Force distance learning provider in improving the capabilities of its World Wide Web-based correspondence system. An analysis of the distance learning provider's operation revealed a strong dependency on a non-collocated legacy relational database. This dependency limits the distance learning provider's future web-based capabilities. A recommendation to improve operation by data replication is proposed, and the implementation details are provided for two alternative test systems that support …