Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Data

Articles 1 - 30 of 485

Full-Text Articles in Physical Sciences and Mathematics

Data Engineering: Building Software Efficiency In Medium To Large Organizations, Alessandro De La Torre Apr 2024

Whittier Scholars Program

The introduction of PoetHQ, a mobile application, offers an economical strategy for colleges, potentially ushering in significant cost savings. These savings could be redirected towards enhancing academic programs and services, enriching the educational landscape for students. PoetHQ aims to democratize access to crucial software, effectively removing financial barriers and facilitating a richer educational experience. By providing an efficient software solution that reduces organizational overhead while maximizing accessibility for students, the project highlights the essential role of equitable education and resource optimization within academic institutions.


Demographic Data Analysis For Measuring Economic Impact Of The Branch Of Nashville, Tessa Pendleton, Annie Wardroup, Nicole Speyrer, Kimberly Amaya Hernandez Apr 2024

Belmont University Research Symposium (BURS)

As part of the Global Honors Scholars Collaborative, researchers aggregated data from The Belmont Data Collaborative to analyze the three primary ZIP codes (37211, 37013, 37217) served by The Branch of Nashville. These communities include immigrant and refugee populations, whom The Branch supports through its food bank, English classes, and further comprehensive care. Future program development will rely on the analysis of the current client base and eventual assessment of The Branch’s economic impact on the surrounding community. The goal of this research for The Branch of Nashville is twofold: (1) analyze the existing demographics within the above ZIP codes …


Identifying Rural Health Clinics Within The Transformed Medicaid Statistical Information System (T-MSIS) Analytic Files, Katherine Ahrens MPH, PhD, Zachariah Croll, Yvonne Jonk PhD, John Gale MS, Heidi O'Connor MS Mar 2024

Rural Health Clinics

Researchers at the Maine Rural Health Research Center describe a methodology for identifying Rural Health Clinic encounters within the Medicaid claims data using Transformed Medicaid Statistical Information System (T-MSIS) Analytic Files.

Background: There is limited information on the extent to which Rural Health Clinics (RHC) provide pediatric and pregnancy-related services to individuals enrolled in state Medicaid/CHIP programs. In part this is because methods to identify RHC encounters within Medicaid claims data are outdated.

Methods: We used a 100% sample of the 2018 Medicaid Demographic and Eligibility and Other Services Transformed Medicaid Statistical Information System (T-MSIS) Analytic Files for 20 states …


Infusing Machine Learning And Computational Linguistics Into Clinical Notes, Funke V. Alabi, Onyeka Omose, Omotomilola Jegede Jan 2024

Mathematics & Statistics Faculty Publications

Entering free-form text notes into Electronic Health Records (EHR) systems takes a lot of time from clinicians. A large portion of this paperwork is viewed as a burden, which cuts into the amount of time doctors spend with patients and increases the risk of burnout. We show how machine learning and computational linguistics can be infused into the process of taking clinical notes. We present a new language modeling task that predicts the content of notes conditioned on historical data from a patient's medical record, such as patient demographics, lab results, medications, and previous notes, with the …


Ethical Data Considerations For Engaging In Reparative Archival Practice, Jamie Rogers, Rhia Rae Nov 2023

Works of the FIU Libraries

Archival textually-rich materials--such as warranty deeds, mortgages, legal documents, and letter correspondence--can provide valuable historical insights, and if transcribed and analyzed, can produce data points in the form of unstructured text, tabular data, and geospatial assets. This presentation will provide an overview of the process Florida International University librarians went through to turn the papers of Dana A. Dorsey, Miami's first Black Millionaire, into data. Their work is guided by the concept of "collections as data" as a form of reparative archival practice, enabling the elevation of marginalized individuals' histories. The goal of reparative archival practice is to create a …


Delivering Healthcare To The Underserved, Edward Booty Nov 2023

Asian Management Insights

Non-profits, governments, and businesses need to come together and use a data-driven approach to improve local basic healthcare access.


Data Ethics And Privacy For Researchers, Kelley F. Rowan Sep 2023

Works of the FIU Libraries

This workshop addresses specific data privacy and anonymization standards and techniques for researchers who are collecting personally identifiable information as well as sensitive information. The workshop covers federal, state, and international laws and regulations governing data privacy, as well as the development of an impact assessment and privacy policy. The second half of the workshop focuses on ethical workflows, anonymization techniques, and related resources.


Using Geographic Information To Explore Player-Specific Movement And Its Effects On Play Success In The NFL, Hayley Horn, Eric Laigaie, Alexander Lopez, Shravan Reddy Aug 2023

SMU Data Science Review

American Football is a billion-dollar industry in the United States. The analytical aspect of the sport is an ever-growing domain, with open-source competitions like the NFL Big Data Bowl accelerating this growth. With the amount of player movement during each play, tracking data can prove valuable in many areas of football analytics. While concussion detection, catch recognition, and completion percentage prediction are all existing use cases for this data, player-specific movement attributes, such as speed and agility, may be helpful in predicting play success. This research calculates player-specific speed and agility attributes from tracking data and supplements them with descriptive …


Cybersecurity Safeguards: What Cybersecurity Safeguards Could Have Prevented The Intelligence/Data Breach By A Member Of The Air National Guard, Christopher Curtis Royal Aug 2023

Cyber Operations and Resilience Program Graduate Projects

Jack Teixeira, a 21-year-old IT specialist in the Air National Guard, found himself on the wrong side of US law after sharing what is considered classified and extremely sensitive information about the USA's operations and role in the war between Ukraine and Russia. Like previous cases of leaked classified intelligence, the case of Teixeira raises concerns about the weaknesses and vulnerability of federal agencies' IT systems and the security protocols governing access to classified documents. Internal leaks of such classified documents hurt national security and can harm the country, especially when such secret intelligence finds its way into the hands of enemies. Unauthorized …


Fintech Data Infrastructure For ESG Disclosure Compliance, Randall E. Duran, Peter Tierney Aug 2023

Research Collection School Of Computing and Information Systems

Regulations related to the disclosure of environmental, social, and governance (ESG) factors are evolving rapidly and are a major concern for financial compliance worldwide. Information technology has the potential to reduce the effort and cost of ESG disclosure compliance. However, comprehensive and accurate ESG data are necessary for disclosures. Currently, the availability and quality of underlying data for ESG disclosures vary widely and are often deficient. The process involved with obtaining ESG data is also often inefficient and prone to error. This paper compares the models used and the evolution of Fintech data infrastructure developed to support financial services with …


A Bayesian Spatial Scan Statistic For Normal Data, Laasya Velamakanni Jul 2023

Theses and Dissertations

Scan statistics are useful methods for detecting spatial clustering. While they were initially developed to detect regions with an excess of binomial or Poisson events, spatial scan statistics have been extended to detect hotspots in other types of data including continuous data. They have many applications in different fields such as epidemiology (e.g. detecting disease outbreaks), sociology (e.g. detecting crime hotspots), and environmental health (e.g. detecting high-pollution areas). Spatial scan statistics identify a ‘most likely cluster’ and then use a likelihood ratio test to determine if this cluster is statistically significant. Spatial scan statistics have been extended to the Bayesian …


Phantom Shootings, Allan Ambris Jun 2023

Dissertations, Theses, and Capstone Projects

This capstone is a website designed to critique NYC Open Data reporting with respect to shootings through a series of visualizations and discoveries. The NYPD Shooting Incidents datasets (Historic and Year to Date) introduce themselves to the user by claiming to be a “list of every shooting incident that occurred in NYC.” The supplied documentation reveals that this is not the case.

After understanding the supporting materials, there are still undisclosed truths. My exploration of the data revealed that a single victim may be represented across multiple entries. Additionally, multiple victims may be represented by a single entry. It is …


Data-Optimized Spatial Field Predictions For Robotic Adaptive Sampling: A Gaussian Process Approach, Zachary Nathan May 2023

Computer Science Senior Theses

We introduce a framework that combines Gaussian Process models, robotic sensor measurements, and sampling data to predict spatial fields. In this context, a spatial field refers to the distribution of a variable throughout a specific area, such as temperature or pH variations over the surface of a lake. Whereas existing methods tend to analyze only the particular field(s) of interest, our approach optimizes predictions through the effective use of all available data. We validated our framework on several datasets, showing that errors can decline by up to two-thirds through the inclusion of additional colocated measurements. In support of adaptive sampling, …


Baseball’s Evolution In The 21st Century, And How It Exemplifies Human Response To Change, Jonathan Sharpe May 2023

Honors Projects

The game of baseball has changed a lot in the past twenty years. This change can be attributed primarily to the explosion in data analytics and how they are used to evaluate baseball players. This led to different player profiles being preferred and eventually changed how players are developed. As a result, the strategies employed have also evolved, turning baseball into a different game than the one seen only a couple of decades ago. This paper will explore the changes that the game has seen. On the other hand, Major League Baseball has also implemented its own changes to try and …


Social Impacts Of Robotics On The Labor And Employment Market, Kelvin Espinal Feb 2023

Dissertations, Theses, and Capstone Projects

Robotics have been introduced into the workplace to perform tasks that human beings have traditionally fulfilled. Complementing or substituting human labor with robotics eliminates human involvement in functions that involve hazardous environments, heavy lifting, toxic substances, and repetitive low-level tasks. Robots are also meant to be more efficient and cost-effective, saving money, time, and labor. However, since the introduction of robotics in the workforce, there has been societal opposition to this branch of technology out of fear of losing employment, wages, and purpose.

Previous studies have reported an overarching societal fear that adopting robotics in the workplace and industry …


Development Of A Data Science Curriculum For An Engineering Technology Program, Salih Sarp, Murat Kuzlu, Otilia Popescu, Vukica M. Jovanovic, Zafer Acar Jan 2023

Engineering Technology Faculty Publications

Data science has gained the attention of various industries, educators, parents, and students thinking about their future careers. Statistics departments have traditionally offered data science courses for a long time. The main objective of these courses is to examine the fundamental concepts and theories. However, teaching data science courses has also expanded to other disciplines due to the vast amount of data being collected by numerous modern applications. Also, someone needs to learn how to collect and process data, especially from industrial devices, because of the recent development of Internet of Things (IoT) technologies. Hence, integrating data science into the …


Development Of Sensing And Programming Activities For Engineering Technology Pathways Using A Virtual Arduino Simulation Platform, Murat Kuzlu, Vukica Jovanovic, Otilia Popescu, Salih Sarp Jan 2023

Engineering Technology Faculty Publications

The Arduino platform has long been an efficient tool in teaching electrical engineering technology, electrical engineering, and computer science concepts in schools and universities and introducing new learners to programming and microcontrollers. Numerous Arduino projects are widely available through the open-source community, and they can help students to have hands-on experience in building circuits and programming electronics with a wide variety of topics that can make learning electrical prototyping fun. The educational fields of electrical engineering and electrical engineering technology need continuous updating to keep up with the continuous evolution of the computer system. Although the traditional Arduino platform has …


Data Curation For Modeling Tall Fescue Biomass Dynamics With DSSAT-CSM, M. B. Hanson, P. D. Alderman, T. J. Butler, A. Caldeira Rocateli Jan 2023

IGC Proceedings (1997-2023)

While models for predicting forage production are available to aid management decisions for some forage crops, there is limited research on a yield model designed specifically for tall fescue (Schedonorus arundinaceus). Therefore, our objective was to adapt an existing perennial forage model, the Decision Support System for Agrotechnology Transfer Cropping Systems Model (DSSAT-CSM), for predicting forage biomass of tall fescue in the southern Great Plains. To evaluate model performance, there must first be a high level of data manipulation and cleaning. In this project, a cohesive dataset combining biomass, weather, soil, and management data was structured into DSSAT …


Big Data Analytics Of Medical Data, Ashwin Rajasankar Dec 2022

Culminating Experience Projects

Data has become a huge part of modern decision making. With the improvements in computing performance and storage in the past two decades, storing large amounts of data has become much easier. Analyzing large amounts of data and creating data models with them can help organizations obtain insights and information which helps their decision making. Big data analytics has become an integral part of many fields such as retail, real estate, education, and medicine. In the project, the goal is to understand the working of Apache Spark and its different storage methods and create a data warehouse to analyze data. …


Safe Sharing For Sensitive Data, Kristi Thompson Dec 2022

Western Libraries Presentations

This workshop focused on the question of when and how human subjects' data can be safely shared. It introduced the basics of data anonymization and discussed how to tell if a dataset has been de-identified. Case studies of successful anonymization and some spectacular failures were shared.


Getting Started Analyzing Data In SPSS, Kristi Thompson Nov 2022

Western Libraries Presentations

SPSS is a popular package for analyzing data. This session will discuss how to get started on a simple quantitative analysis project using SPSS. Topics covered will include getting summary statistics, creating and modifying variables, creating graphs, running simple analyses, and interpreting SPSS output.


Supplementary Information For "Understanding Mid-To Large Underground Leaks From Buried Pipelines As Affected By Soil And Atmospheric Conditions – Field Scale Experimental Study", Navodi J.R.R. Jayarathne, Kathleen M. Smits, Stuart N. Riddick, Daniel J. Zimmerle, Younki Cho, Michelle Schwartz, Fancy Cheptonui, Kevan Cameron, Peter Ronney Aug 2022

Earth & Environmental Sciences Datasets

Reducing the amount of leaked natural gas (NG) from pipelines from production to use has become a high priority in efforts to cut anthropogenic emissions of methane and ensure public safety. However, tracking and evaluating NG pipeline leaks, especially at moderate to high flow rates, requires a better understanding of the leak from the source to the detector as well as more robust quantification methods. To better understand fugitive emissions from NG pipelines, we developed a field scale testbed that simulates mid and high-pressure gas leaks from belowground natural gas infrastructure. The system is equipped with subsurface, surface and atmospheric …


Data Vu: Why Breaches Involve The Same Stories Again And Again, Woodrow Hartzog, Daniel Solove Jul 2022

Shorter Faculty Works

In the classic comedy Groundhog Day, protagonist Phil, played by Bill Murray, asks “What would you do if you were stuck in one place and every day was exactly the same, and nothing that you did mattered?” In this movie, Phil is stuck reliving the same day over and over, where the events repeat in a continual loop, and nothing he does can stop them. Phil’s predicament sounds a lot like our cruel cycle with data breaches.

Every year, organizations suffer more data spills and attacks, with personal information being exposed and abused at alarming rates. While Phil …


Mapping The COVID-19 Pandemic In Staten Island, Vincenzo Mezzio May 2022

Student Theses

COVID-19 has had diverging effects in New York City. Out of the five boroughs, Staten Island has one of the largest percentages of COVID-19 cases relative to population. This research examines key social and spatial factors that contribute to the increase in COVID-19 cases in Staten Island. It asks: Which parts of Staten Island have higher rates of transmission of COVID-19? Which parts of the borough have larger populations who are more vulnerable to COVID-19? What is the relationship between the location of vaccination centers and the rates of COVID-19 cases? Using Geographic Information Systems (GIS), this research examines the …


How Blockchain Solutions Enable Better Decision Making Through Blockchain Analytics, Sammy Ter Haar May 2022

Information Systems Undergraduate Honors Theses

Since the founding of computers, data scientists have been able to engineer devices that increase individuals’ opportunities to communicate with each other. In the 1990s, the internet took over, with many people not understanding its utility. Flash forward 30 years, and we cannot live without our connection to the internet. The early internet, in which individuals posted blogs for others to read, was an internet of information known as Web 1.0. As it progressed, platforms became social, allowing individuals in different areas to communicate and engage with each other; this was known as Web 2.0. As Dr. …


MEAs: Exploring Links Between Implementation And Standards Mastery, Noah Silver Apr 2022

Honors Projects

To effectively develop their mathematical understanding, students need to engage in problem solving. Model eliciting activities, or MEAs, provide students with tasks that promote higher level thinking and the ability to utilize mathematics outside of the classroom; they also align with and promote the utilization of the Common Core State Standards and Standards for Mathematical Practice. Research suggests that the language and motivation promoted by MEAs enrich engagement and increase student ability and performance in traditional and real-world mathematics. Use of technology further supports these goals. Through the analysis of checkpoint …


Performance Improvements In Inner Product Encryption, Serena Riback Apr 2022

Honors Scholar Theses

Consider a database that contains thousands of entries of the iris biometric. Each entry identifies an individual, so it is especially important that it remains secure. However, searching for entries in an encrypted database proves to be a security problem: how should one search encrypted data without leaking any information to a potential attacker? The proximity searchable encryption scheme, as discussed in the work by Cachet et al., uses the notions of inner product encryption developed by Kim et al. In this paper, we will focus on the efficiency of these schemes. Specifically, how the symmetry of the bilinear …


Building Capacity For Data-Driven Scholarship, Jamie Rogers Mar 2022

Works of the FIU Libraries

This talk provides an overview of "dLOC as Data: A Thematic Approach to Caribbean Newspapers," an initiative developed to increase access to digitized Caribbean newspaper text for bulk download, facilitating computational analysis. Capacity building for future research in Caribbean Studies being a crucial aspect of this initiative, a thematic toolkit was developed to facilitate use of the project data as well as provide replicable processes. The toolkit includes sample text analysis projects, as well as tutorials and detailed project documentation. While the toolkit focuses on the history of hurricanes and tropical cyclones of the region, the methodologies and tools used …


The Mathematics Of Risk: An Introduction To Guaranteed Data De-Identification, Kristi Thompson Mar 2022

Western Libraries Presentations

This webinar is devoted to the mathematical and theoretical underpinnings of guaranteed data anonymization. Topics covered include an overview of identifiers and quasi-identifiers, an introduction to k-anonymity, a look at some cases where k-anonymity breaks down, and anonymization hierarchies. The presenter will describe a method to assess a survey dataset for anonymization using standard statistical software and consider the question of "anonymization overkill". Much of the academic material looking at data anonymization is quite abstract and aimed at computer scientists, while material aimed at data curators does not always consider recent developments. This webinar is intended to help bridge the …


Outvoice: Bringing Transparency To Healthcare, Autumn Clark Feb 2022

Undergraduate Honors Theses

Industries are not incentivized to price reasonably and spend responsibly if consumers do not have the ability to shop around within that industry, and shopping around is not possible without pricing transparency (knowing how much a good or service costs before purchasing it). But in the healthcare industry, we typically default to whichever clinic or hospital is closest, with no prior knowledge of what costs we can expect to incur at that particular institution. According to a poll published by Harvard University, nine out of ten Americans feel the healthcare industry is too opaque and greater transparency is needed.

We …