Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

4,511 Full-Text Articles 4,963 Authors 1,167,439 Downloads 161 Institutions

All Articles in Databases and Information Systems

4,511 full-text articles. Page 2 of 161.

Discovery Of Topological Constraints On Spatial Object Classes Using A Refined Topological Model, Ivan Majic, Elham Naghizade, Stephan Winter, Martin Tomko 2019 The University of Melbourne

Journal of Spatial Information Science

In a typical data collection process, a surveyed spatial object is annotated upon creation, and is classified based on its attributes. This annotation can also be guided by textual definitions of objects. However, interpretations of such definitions may differ among people, and thus result in subjective and inconsistent classification of objects. This problem becomes even more pronounced if cultural and linguistic differences are considered. As a solution, this paper investigates the role of topology as the defining characteristic of a class of spatial objects. We propose a data mining approach based on frequent itemset mining to learn patterns in ...


Data Mining And Machine Learning To Improve Northern Florida’s Foster Care System, Daniel Oldham, Nathan Foster, Mihhail Berezovski 2019 Embry-Riddle Aeronautical University, Daytona Beach

Beyond: Undergraduate Research Journal

The purpose of this research project is to use statistical analysis, data mining, and machine learning techniques to determine identifiable factors in child welfare service records that could lead to a child entering the foster care system multiple times. This would allow us to accurately predict a case’s outcome based on these factors. We were provided with eight years of data in the form of multiple spreadsheets from Partnership for Strong Families (PSF), a child welfare services organization based in Gainesville, Florida, that is contracted by the Florida Department of Children and Families (DCF). This data contained ...


Encoding Invariances In Deep Generative Models, Viraj Shah, Ameya Joshi, Sambuddha Ghosal, Balaji Pokuri, Soumik Sarkar, Baskar Ganapathysubramanian, Chinmay Hegde 2019 Iowa State University

Baskar Ganapathysubramanian

Reliable training of generative adversarial networks (GANs) typically requires massive datasets in order to model complicated distributions. However, in several applications, training samples obey invariances that are known a priori; for example, in complex physics simulations, the training data obey universal laws encoded as well-defined mathematical equations. In this paper, we propose a new generative modeling approach, InvNet, that can efficiently model data spaces with known invariances. We devise an adversarial training algorithm to encode them into the data distribution. We validate our framework in three experimental settings: generating images with fixed motifs; solving nonlinear partial differential equations (PDEs); and ...


Parallel Streaming Random Sampling, Kanat Tangwongsan, Srikanta Tirthapura 2019 Mahidol University International College

Electrical and Computer Engineering Publications

This paper investigates parallel random sampling from a potentially unending data stream whose elements are revealed in a series of element sequences (minibatches). While sampling from a stream has been studied extensively in the sequential setting, not much has been explored in the parallel context, with prior parallel random-sampling algorithms focusing on the static batch model. We present parallel algorithms for minibatch-stream sampling in two settings: (1) sliding window, which draws samples from a prespecified number of most recently observed elements, and (2) infinite window, which draws samples from all the elements received. Our algorithms are computationally and memory efficient: their work matches the fastest sequential ...
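
For orientation, the infinite-window setting can be illustrated with classic sequential reservoir sampling applied one minibatch at a time. This is only a sketch of the sequential baseline the abstract refers to, not the paper's parallel algorithm; the function name `reservoir_update`, the sample size, and the example minibatches are assumptions made for illustration.

```python
import random

def reservoir_update(reservoir, seen, minibatch, k, rng):
    """Fold one minibatch into a uniform size-k sample of everything seen so far.

    Sequential reservoir sampling (hypothetical helper, not the paper's method).
    `reservoir` is modified in place; the updated element count is returned.
    """
    for item in minibatch:
        seen += 1
        if len(reservoir) < k:
            # The first k elements always enter the sample.
            reservoir.append(item)
        else:
            # Keep the new element with probability k / seen.
            j = rng.randrange(seen)
            if j < k:
                reservoir[j] = item
    return seen

rng = random.Random(0)  # fixed seed so the run is repeatable
sample, seen = [], 0
for batch in ([1, 2, 3], [4, 5], [6, 7, 8, 9]):  # made-up minibatches
    seen = reservoir_update(sample, seen, batch, k=3, rng=rng)
```

After the loop, `sample` holds three distinct elements drawn from the nine received so far. A sliding-window variant would additionally need to expire elements that fall out of the window, which this sketch does not attempt.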


Healthcare IT In Skilled Nursing And Post-Acute Care Facilities: Reducing Hospital Admissions And Re-Admissions, Improving Reimbursement And Improving Clinical Operations, Scott L. Hopes 2019 University of South Florida

Scott Hopes

Health information technology (HIT), which includes electronic health record (EHR) systems and clinical data analytics, has become a major component of all health care delivery and care management. The adoption of HIT by physicians, hospitals, post-acute care organizations, pharmacies and other health care providers has been accepted as a necessary (and recently, a government-required) step toward improved quality, care coordination and reduced costs: “Better coordination of care provides a path to improving communication, improving quality of care, and reducing unnecessary emergency room use and hospital readmissions. LTPAC providers play a critical role in achieving these goals” (HealthIT.gov, 2013 ...


Examining MEDLINE Search Query Reproducibility And Resulting Variation In Search Results, C. Sean Burns, Robert M. Shapiro II, Tyler Nix, Jeffrey T. Huber 2019 University of Kentucky

C. Sean Burns

The MEDLINE database is publicly available through the National Library of Medicine’s PubMed, but the data file itself is also licensed to a number of vendors, who may offer their versions to institutional and other parties as part of a database platform. These vendors provide their own interface to the MEDLINE file and offer other technologies that attempt to make their version useful to subscribers. However, little is known about how vendor platforms ingest and interact with MEDLINE data files, nor how these changes influence the construction of search queries and the results they produce. This poster presents a ...


Encoding Invariances In Deep Generative Models, Viraj Shah, Ameya Joshi, Sambuddha Ghosal, Balaji Pokuri, Soumik Sarkar, Baskar Ganapathysubramanian, Chinmay Hegde 2019 Iowa State University

Mechanical Engineering Publications

Reliable training of generative adversarial networks (GANs) typically requires massive datasets in order to model complicated distributions. However, in several applications, training samples obey invariances that are known a priori; for example, in complex physics simulations, the training data obey universal laws encoded as well-defined mathematical equations. In this paper, we propose a new generative modeling approach, InvNet, that can efficiently model data spaces with known invariances. We devise an adversarial training algorithm to encode them into the data distribution. We validate our framework in three experimental settings: generating images with fixed motifs; solving nonlinear partial differential equations (PDEs); and ...


Reach - A Community Service Application, Samuel Noel Magana 2019 California Polytechnic State University, San Luis Obispo

Computer Engineering

Communities are familiar threads that unite people through several shared attributes and interests. These commonalities are the core elements that link and bond us together. Many of us are part of multiple communities, moving in and out of them depending on our needs. These common threads allow us to support and advocate for each other when facing a common threat or difficult situation. Healthy and vibrant communities are fundamental to the operation of our society. These interactions within our communities define the way we as individuals interact with each other, and society at large. Being part of a community helps ...


Radish: A Cross Platform Meal Prepping App For Beginner Weightlifters, Spoorthy S. Vemula, Tanay Gottigundala, Cory Baxes 2019 California Polytechnic State University, San Luis Obispo

Computer Science and Software Engineering

With the increasing ease of access and decreasing price of most food, obesity rates in the developing world have risen dramatically in recent years. As of March 23rd, 2019, obesity rates had reached 39.6%, a 6% increase in just 8 years. Research has shown that people with obesity have a significantly increased risk of heart disease, stroke, type 2 diabetes, and certain cancers, among other life-threatening diseases. In addition, 42% of people who begin weightlifting quit because it’s too difficult to follow a diet or workout regimen.

We created Radish in an attempt to tackle these problems. Radish ...


Geometric Top-K Processing: Updates Since MDM'16 [Advanced Seminar], Kyriakos Mouratidis 2019 Singapore Management University

Research Collection School Of Information Systems

The top-k query has been studied extensively, and is considered the norm for multi-criteria decision making in large databases. In recent years, research has considered several complementary operators to the traditional top-k query, drawing inspiration (both in terms of problem formulation and solution design) from the geometric nature of the top-k processing model. In this seminar, we will present advances in that stream of work, focusing on updates since the preliminary seminar on the same topic in MDM'16.
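
As a point of reference for the abstract above, a conventional top-k query with a linear scoring function can be sketched as follows. This is a minimal illustration, not material from the seminar: the function name `top_k`, the `hotels` records, and the weight vector are all hypothetical. Each record is treated as a point and scored by its dot product with the weight vector, which is one common way to read the "geometric nature" of the model.

```python
import heapq

def top_k(records, weights, k):
    """Return the k records with the highest weighted-sum score.

    Hypothetical helper: each record is a tuple of numeric attributes,
    and `weights` expresses the user's preference for each attribute.
    """
    def score(record):
        # Linear scoring: dot product of the record and the weight vector.
        return sum(w * x for w, x in zip(weights, record))
    return heapq.nlargest(k, records, key=score)

# Made-up records with two criteria, both normalized to [0, 1].
hotels = [(0.9, 0.2), (0.5, 0.5), (0.1, 0.95), (0.7, 0.6)]
best = top_k(hotels, weights=(0.5, 0.5), k=2)
```

With equal weights, `best` contains `(0.7, 0.6)` followed by `(0.9, 0.2)`; changing the weight vector changes the direction along which the points are ranked.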


A Study Of Machine Learning And Deep Learning Models For Solving Medical Imaging Problems, Fadi G. Farhat 2019 New Jersey Institute of Technology

Theses

Application of machine learning and deep learning methods to medical imaging aims to create systems that can help in the diagnosis of disease and the automation of analyzing medical images in order to facilitate treatment planning. Deep learning methods do well in image recognition, but medical images present unique challenges. The lack of large amounts of data, the image size, and the high class imbalance in most datasets make training a machine learning model to recognize a particular pattern that is typically present only in case images a formidable task.

Experiments are conducted to classify breast cancer images as healthy or ...


Schema Migration From Relational Databases To NoSQL Databases With Graph Transformation And Selective Denormalization, Krishna Chaitanya Mullapudi 2019 San Jose State University

Master's Projects

We have witnessed a dramatic increase in the volume, variety, and velocity of data, leading to the era of big data. The structure of data has become highly flexible, leading to the development of many storage systems that differ from traditional structured relational databases, where data is stored in “tables,” with columns representing the lowest granularity of data. Although relational databases are still predominant in the industry, there has been a major shift towards alternative database systems that support unstructured data with better scalability, leading to the popularity of “Not Only SQL.”

Migration from relational databases to NoSQL databases ...


Image Retrieval Using Image Captioning, Nivetha Vijayaraju 2019 San Jose State University

Master's Projects

The rapid growth in the availability of the Internet and smartphones has resulted in increased usage of social media in recent years. This increased usage has in turn resulted in exponential growth in the number of digital images available. Image retrieval systems therefore play a major role in fetching images relevant to the query provided by the users. These systems should also be able to handle the massive growth of data and take advantage of emerging technologies, like deep learning and image captioning. This report aims at understanding the purpose of image retrieval and various research held in ...


Predictive Analysis For Cloud Infrastructure Metrics, Paridhi Agrawal 2019 San Jose State University

Master's Projects

In a cloud computing environment, enterprises have the flexibility to request resources according to their application demands. This elasticity makes the cloud an attractive option for enterprises to host their applications. Cloud providers usually exploit this elasticity by auto-scaling the application resources for quality assurance. However, there is a setup delay, which may take minutes, between the demand for a new resource and the resource being ready for use. This causes static resource-provisioning techniques, which request allocation of a new resource only when the application breaches a specific threshold, to be slow and ...
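
The effect of setup delay on threshold-based provisioning can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the project's method: the function `simulate_reactive`, the load trace, the capacity of 100 per unit, and the 80% threshold are all invented for illustration.

```python
def simulate_reactive(load, capacity_per_unit, threshold, setup_delay):
    """Simulate a reactive rule that requests one unit when load breaches a threshold.

    Hypothetical model: a requested unit becomes ready `setup_delay` steps later,
    and at most one request is outstanding at a time.
    Returns (ready_units_at_end, number_of_overloaded_steps).
    """
    ready = 1
    pending = []  # time steps at which requested units become ready
    overloaded_steps = 0
    for t, demand in enumerate(load):
        # Bring online any unit whose setup has finished.
        ready += sum(1 for due in pending if due == t)
        pending = [due for due in pending if due > t]
        capacity = ready * capacity_per_unit
        if demand > capacity:
            overloaded_steps += 1
        # Static rule: react only after the threshold has been breached.
        if demand > threshold * capacity and not pending:
            pending.append(t + setup_delay)
    return ready, overloaded_steps

load = [50, 90, 120, 120, 120, 120]  # made-up demand per time step
slow = simulate_reactive(load, capacity_per_unit=100, threshold=0.8, setup_delay=3)
fast = simulate_reactive(load, capacity_per_unit=100, threshold=0.8, setup_delay=1)
```

In this trace both runs end with two units, but the run with a three-step setup delay spends two steps overloaded while the one-step run spends none. A predictive approach would aim to issue the request before the breach rather than after it.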


Mapping In The Humanities: GIS Lessons For Poets, Historians, And Scientists, Emily W. Fairey 2019 CUNY Brooklyn College

Open Educational Resources

User-friendly Geographic Information Systems (GIS) software is the common thread of this collection of presentations and activities with full lesson plans. The first section of the site contains an overview of cartography, the art of creating maps, and then looks at historical mapping platforms like Hypercities and Donald Rumsey Historical Mapping Project. In the next section Google Earth Desktop Pro is introduced, with lessons and activities on the basics of GE such as pins, paths, and KML files, as well as a more complex activity on "georeferencing" a historic map over Google Earth imagery. The final section deals with ArcGIS Online ...


Mapping Manuscript Migrations: Digging Into Data For The History And Provenance Of Medieval And Renaissance Manuscripts, Toby Burrows, Eero Hyvönen, Lynn Ransom, Hanno Wijsman 2019 University of Western Australia

Manuscript Studies

Mapping Manuscript Migrations is a new two-year project funded by the Trans-Atlantic Platform in the fourth round of its Digging into Data Challenge. The project is a collaboration between four international partners: the University of Oxford, the University of Pennsylvania, the Institut de recherche et d’histoire des textes (IRHT) in Paris, and Aalto University in Helsinki.

The project aims to combine data from various sources to enable the large-scale analysis of the history and provenance of medieval and Renaissance manuscripts.


The Galen Palimpsest And The Modest Ambitions Of The Digital Data Set, Doug Emery 2019 University of Pennsylvania

Manuscript Studies

The digital Syriac Galen Palimpsest (SGP) data set is an archive built on the model of the digital Archimedes Palimpsest. As with Archimedes, the SGP data set is meant to promote the long-term preservation of and access to the digitized palimpsest. The SGP data set follows archiving best practices and uses the Archimedes Palimpsest Metadata Standard for spectral imaging metadata. The data is released under a Creative Commons Attribution 3.0 Unported license (CC BY 3.0). The SGP project used custom software to manage its data and metadata from the time of capture to final data set publication. In ...


Visualization And Machine Learning Techniques For NASA’s EM-1 Big Data Problem, Antonio P. Garza III, Jose Quinonez, Misael Santana, Nibhrat Lohia 2019 Southern Methodist University

SMU Data Science Review

In this paper, we help NASA solve three Exploration Mission-1 (EM-1) challenges: data storage, computation time, and visualization of complex data. NASA is studying one year of trajectory data (about 90 TB) to determine available launch opportunities. We improve data storage by introducing a cloud-based solution that provides elasticity and server upgrades. This migration will save $120k in infrastructure costs every four years, and potentially avoid schedule slips. Additionally, it increases computational efficiency by 125%. We further enhance computation via machine learning techniques that use the classic orbital elements to predict valid trajectories. Our machine learning model decreases trajectory ...


Intrusion-Tolerant Order-Preserving Encryption, John Huson 2019 James Madison University

Masters Theses

Traditional encryption schemes such as AES and RSA aim to achieve the highest level of security, often indistinguishability under adaptive chosen-ciphertext attack. Ciphertexts generated by such encryption schemes do not leak useful information. As a result, such ciphertexts support neither efficient search nor range queries.

Order-preserving encryption is a relatively new encryption paradigm that allows for efficient queries on ciphertexts. In order-preserving encryption, the data-encrypting key is a long-term symmetric key that needs to stay online for insertion, query and deletion operations, making it an attractive target for attacks.

In this thesis, an intrusion-tolerant order-preserving encryption system ...


The Assessment Of Technology Adoption Interventions And Outcome Achievement Related To The Use Of A Clinical Research Data Warehouse, Katie A. McCarthy 2019 University of Wisconsin-Milwaukee

Theses and Dissertations

Introduction: While funding for research has declined since 2004, the need for rapid, innovative, and lifesaving clinical and translational research has never been greater due to the rise in chronic health conditions, which have resulted in lower life expectancy and higher rates of mortality and adverse outcomes. Finding effective diagnostic and treatment methods to address the complex challenges in individual and population health will require a team science approach, creating the need for multidisciplinary collaboration among practitioners and researchers.

To address this need, the National Institutes of Health (NIH) created the Clinical and Translational Science Awards (CTSA) program. The CTSA ...

