Open Access. Powered by Scholars. Published by Universities.®

Data Science Commons

534 Full-Text Articles; 1,183 Authors; 54,725 Downloads; 117 Institutions

All Articles in Data Science

OSM-GAN: Using Generative Adversarial Networks For Detecting Change In High-Resolution Spatial Images, James Carswell 2022 Technological University Dublin

Articles

Detecting changes to built environment objects such as buildings/roads/etc. in aerial/satellite (spatial) imagery is necessary to keep online maps and various value-added LBS applications up-to-date. However, recognising such changes automatically is not a trivial task, and there are many different approaches to this problem in the literature. This paper proposes an automated end-to-end workflow to address this problem by combining OpenStreetMap (OSM) vectors of building footprints with a machine learning Generative Adversarial Network (GAN) model - where two neural networks compete to become more accurate at predicting changes to building objects in spatial imagery. Notably, our proposed OSM-GAN ...


Diving Deep Into Dissertations: Analyzing Graduate Students’ Methodological And Data Practices To Inform Research Data Services And Subject Liaison Librarian Support, Mandy Swygart-Hobaugh M.L.S., Ph.D., Raeda Anderson, Denise Dimsdale, Joel Glogowski 2021 Georgia State University

University Library Faculty Publications

We present findings from an exploratory quantitative content analysis case study of 156 doctoral dissertations from Georgia State University that investigates doctoral student researchers’ methodology practices (used quantitative, qualitative, or mixed methods) and data practices (used primary data, secondary data, or both). We discuss the implications of our findings for the data support services provided by the Georgia State University Library’s Research Data Services (RDS) Team and subject liaison librarians in the areas of instructional services, data software support and licensing advocacy, collection development, marketing/outreach, and professional development/expansion.


Human Mobility Monitoring Using WiFi: Analysis, Modeling, And Applications, Amee Trivedi 2021 University of Massachusetts Amherst

Doctoral Dissertations

Understanding and modeling human and device mobility is of fundamental importance in mobile computing, with implications ranging from network design and location-aware technologies to urban infrastructure planning. Today's users carry a plethora of devices such as smartphones, laptops, tablets, and smartwatches. Each device offers a different set of services, resulting in different usage and mobility patterns and leading to the research question of understanding and modeling multiple user device trajectories. Additionally, prior research on mobility focuses on outdoor mobility, even though users are known to spend 80% of their time indoors, resulting in wide gaps in knowledge in the area of ...


Social Measurement And Causal Inference With Text, Katherine A. Keith 2021 University of Massachusetts Amherst

Doctoral Dissertations

The digital age has dramatically increased access to large-scale collections of digitized text documents. These corpora include, for example, digital traces from social media, decades of archived news reports, and transcripts of spoken interactions in political, legal, and economic spheres. For social scientists, this new widespread data availability has potential for improved quantitative analysis of relationships between language use and human thought, actions, and societal structure. However, the large-scale nature of these collections means that traditional manual approaches to analyzing content are extremely costly and do not scale. Furthermore, incorporating unstructured text data into quantitative analysis is difficult due to ...


High-Dimensional Feature Selection And Multi-Level Causal Mediation Analysis With Applications To Human Aging And Cluster-Based Intervention Studies, Hachem Saddiki 2021 University of Massachusetts Amherst

Doctoral Dissertations

Many questions in public health and medicine are fundamentally causal in that our objective is to learn the effect of some exposure, randomized or not, on an outcome of interest. As a result, causal inference frameworks and methodologies have gained interest as a promising tool to reliably answer scientific questions. However, the tasks of identifying and efficiently estimating causal effects from observed data still pose significant challenges under complex data generating scenarios. We focus on (1) high-dimensional settings where the number of variables is orders of magnitude higher than the number of observations; and (2) multi-level settings, where study participants ...


Robust Algorithms For Clustering With Applications To Data Integration, Sainyam Galhotra 2021 University of Massachusetts Amherst

Doctoral Dissertations

A growing number of data-based applications are used for decision-making that has far-reaching consequences and significant societal impact. Entity resolution, community detection and taxonomy construction are some of the building blocks of these applications, and for these methods, clustering is the fundamental underlying concept. Therefore, the importance of accurate, robust and scalable methods for clustering cannot be overstated. We tackle the various facets of clustering with a multi-pronged approach described below.

1. While identification of clusters that refer to different entities is challenging for automated strategies, it is relatively easy for humans. We study the robustness of clustering methods that ...


Monitoring Mammals At Multiple Scales: Case Studies From Carnivore Communities, Kadambari Devarajan 2021 University of Massachusetts Amherst

Doctoral Dissertations

Carnivores are distributed widely and threatened by habitat loss, poaching, climate change, and disease. They are considered integral to ecosystem function through their direct and indirect interactions with species at different trophic levels. Given the importance of carnivores, it is of high conservation priority to understand the processes driving carnivore assemblages in different systems. It is thus essential to determine the abiotic and biotic drivers of carnivore community composition at different spatial scales and address the following questions: (i) What factors influence carnivore community composition and diversity? (ii) How do the factors influencing carnivore communities vary across spatial and temporal ...


Supporting “Big Data” Research At Georgia State University (GSU), Kelsey Jordan, Bryan Sinclair, Mandy Swygart-Hobaugh, Jeremy Walker 2021 Georgia State University

University Library Faculty Publications

From Summer 2020 to Summer 2021, a team of Georgia State University (GSU) University Library faculty took part in a multi-institutional research study coordinated by the Ithaka S+R research and consulting organization to examine the research support needs of faculty doing “big data” research. Drawing from semi-structured interviews with eight GSU researchers representing a diverse cross-section of academic fields, this report (1) identifies the key research support needs and associated challenges faced by GSU faculty who engage in “big data” research, and (2) offers possible paths toward improved support of ...


Crest Or Trough? How Research Libraries Used Emerging Technologies To Survive The Pandemic, So Far, Scout Calvert 2021 University of Nebraska-Lincoln

Faculty Publications, UNL Libraries

Introduction

In the first months of the COVID-19 pandemic, it was impossible to tell if we were at the crest of a wave of new transmissions, or a trough of a much larger wave, still yet to peak. As of this writing, as colleges and universities prepare for mostly in-person fall 2021 semesters, case counts in the United States are increasing again after a decline that coincided with easier access to the COVID vaccine. Plans for a return to campus made with confidence this spring may be in doubt, as we climb the curve of what is already the second ...


Data Analysis Of The “2021 COVID, Equity And Social Justice Showcase”, Cristo Leon, James Lipuma 2021 New Jersey Institute of Technology

STEM Month

During the “2021 STEM for All Video Showcase” (NSF, 2021) funded by the National Science Foundation, 287 short videos showcasing federally funded projects aimed at improving STEM and CS education were presented.

The videos highlight strategies to engage students during COVID-19 and address educational inequities.


Deep Fakes: The Algorithms That Create And Detect Them And The National Security Risks They Pose, Nick Dunard 2021 James Madison University

James Madison Undergraduate Research Journal (JMURJ)

The dissemination of deep fakes for nefarious purposes poses significant national security risks to the United States, requiring the urgent development of technologies to detect their use and strategies to mitigate their effects. Deep fakes are images and videos created by or with the assistance of AI algorithms in which a person’s likeness, actions, or words have been replaced by someone else’s to deceive an audience. Often created with the help of generative adversarial networks, deep fakes can be used to blackmail, harass, exploit, and intimidate individuals and businesses; in large-scale disinformation campaigns, they can incite political tensions ...


Fostering Data Reusability: Increasing Impact And Ease In Sharing And Reusing Research Data, Sarah M. Nusser, Gizem Korkmaz, Alyssa Mikytuck 2021 Iowa State University

CSAFE Publications

For centuries, researchers have shared and discussed their findings in articles and convenings, seeking to advance scientific progress and enable scrutiny of their ideas. These practices are rooted in the principle of transparency in research, which promotes rigor and trust in the scientific process, increases the potential for equitable access and collaboration, and accelerates the pace of discovery and societal impact. Today, the equivalent practice of open science (or scholarship) recognizes that a more complete suite of research outputs should be made publicly accessible whenever possible. Indeed, our global challenges and the drive to increase equity in access to knowledge ...


Detecting Stance On Covid-19 Vaccine In A Polarized Media, Rodica Ceslov 2021 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

The growing polarization in the United States has been widely reported. Media coverage plays an important role in shaping public opinion and influences public debates on complex and unfamiliar topics. There are some benefits to individuals and society from political polarization and conflict between opposing viewpoints. However, recent research has primarily highlighted the negative consequences of polarization, which has reached an all-time high. One such complex topic is the Covid-19 vaccine, which was developed in record time; the public learned about its safety and possible risks through media coverage.

In this capstone, we examine U.S. news media coverage on ...


Piecewise Linear Manifold Clustering, Artyom Diky 2021 The Graduate Center, City University of New York

Dissertations, Theses, and Capstone Projects

This work studies the application of topological analysis to non-linear manifold clustering. A novel method that exploits the data clustering structure generates a topological representation of the point dataset. An analysis of topological construction under different simulated conditions is performed to explore the capabilities and limitations of the method, and it demonstrates statistically significant improvements in performance. Furthermore, we introduce a new information-theoretical validation measure for clustering that exploits geometrical properties of clusters to estimate clustering compressibility, for evaluation of the clustering goodness-of-fit without any prior information about true class assignments. We show how the new validation measure, when ...


Anti-Vaxxers: Parents Fighting Science, Katie West 2021 Kennesaw State University

Symposium of Student Scholars

Immunizing children helps protect the health of our community, especially those people who cannot be immunized. Yet, since 1996 after a study was released that linked autism to vaccinations, there has been a trend of parents refusing to vaccinate their children. What are the demographics of the parents who believe their children are better off without vaccines? By knowing where these parents live and what decisions they make for their children’s education, counties and medical professionals can provide education and address their concerns.

My research involves data on 116,141 kindergarten classes from 2000-2015 in California. The two vaccine ...


Opioid Abuse: Are Doctors Creating The Problem?, Nguyen Tran 2021 Kennesaw State University

Symposium of Student Scholars

Opioid abuse and overdose are serious health problems in the United States. Current research has concentrated on the treatment and prevention of opioid abuse. Using data from the Controlled Substance Utilization Review and Evaluation System (CURES) for California zip codes, my research focuses on the causes of opioid overdose by considering the relationships between the following variables within each zip code: population size, average number of prescriptions per doctor, percentage of people who receive opioid prescriptions, percentage of people receiving the same prescription drug from 3 or more doctors, average number of opioid pills per prescription and number of people ...


Market Research: How To Keep And Gain Customers, Chris McCall 2021 Kennesaw State University

Symposium of Student Scholars

Customer-centered market research is essential to the creation and management of successful marketing campaigns. A company that understands its customers will be able to provide those customers with products and services that fit their needs better than the competition, and ultimately increase profits. My research focuses on a database containing customer information for a telecommunications company called Telco. Within this research, I will focus on a number of customer attributes including demographics, services provided, payment methods, contract lengths, monthly charges, and tenure with the company. Considering how these attributes relate to one another will give me a better understanding of ...


Food Deserts: Hungry For Answers, Lawren Cumberbatch 2021 Kennesaw State University

Symposium of Student Scholars

In 2010, the United States Department of Agriculture (USDA) reported that 23.5 million people in the United States live in food deserts. As defined by the USDA, a “food desert” is a neighborhood that lacks healthy food sources. This can be measured by distance to a store, number of stores in an area, individual-level resources such as family income or vehicle availability, and neighborhood-level resources such as availability of public transportation. Past research provides evidence that food deserts are especially likely to occur in communities heavily populated by minorities. As a Black Indian pre-med student aiming to join the ...


Determining Malignancy: Can Mammogram Results Help Predict The Diagnosis Of Breast Tumors?, Taylor Behrens 2021 Kennesaw State University

Symposium of Student Scholars

Even with advancements in treatment and preventative care, breast cancer remains an epidemic claiming more than 40,000 American male and female lives each year. The mammogram dataset that I am analyzing was initially compiled in the early 1990s by a team from the University of Wisconsin-Madison. Past research diagnoses breast cancer from fine-needle aspirates. My research focuses on whether we can determine breast cancer diagnoses without the use of invasive procedures and, in particular, whether we can predict breast cancer based on mammogram data. Do measures of gray-scale texture, radius, concavity, perimeter, compactness, area, and smoothness of ...


Accidental Overdoses: Insights To Aid In Prevention, Annabel Nganga 2021 Kennesaw State University

Symposium of Student Scholars

Having lost a friend six years ago to an accidental cocaine overdose, I am very passionate about spreading awareness of accidental drug overdoses that have affected thousands of families countrywide. According to past research, deaths resulting from opiates specifically have been on the rise, and a significant number of deaths in the United States among those under fifty years of age are caused by drug overdoses. Data exist indicating which states have more overdoses. The data set I will be using includes variables on race, sex, age, the drug on which the person overdosed, location of the overdose, ultimate cause of death and year ...

