Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Computer Sciences

Data

Articles 1 - 30 of 97

Full-Text Articles in Entire DC Network

Data Engineering: Building Software Efficiency In Medium To Large Organizations, Alessandro De La Torre Apr 2024

Whittier Scholars Program

The introduction of PoetHQ, a mobile application, offers an economical strategy for colleges, potentially ushering in significant cost savings. These savings could be redirected towards enhancing academic programs and services, enriching the educational landscape for students. PoetHQ aims to democratize access to crucial software, effectively removing financial barriers and facilitating a richer educational experience. By providing an efficient software solution that reduces organizational overhead while maximizing accessibility for students, the project highlights the essential role of equitable education and resource optimization within academic institutions.


Infusing Machine Learning And Computational Linguistics Into Clinical Notes, Funke V. Alabi, Onyeka Omose, Omotomilola Jegede Jan 2024

Mathematics & Statistics Faculty Publications

Entering free-form text notes into Electronic Health Records (EHR) systems takes a lot of clinicians' time. A large portion of this paperwork is viewed as a burden, which cuts into the amount of time doctors spend with patients and increases the risk of burnout. We examine how machine learning and computational linguistics can be infused into the process of taking clinical notes. We present a new language modeling task that predicts the content of notes conditioned on historical data from a patient's medical record, such as patient demographics, lab results, medications, and previous notes, with the …


Delivering Healthcare To The Underserved, Edward Booty Nov 2023

Asian Management Insights

Non-profits, governments, and businesses need to come together and use a data-driven approach to improve local basic healthcare access.


Enhancing Relation Database Security With Shuffling, Tieming Geng Oct 2023

Theses and Dissertations

Database security holds paramount importance as it safeguards an organization's most valuable assets: its data. In an age marked by escalating cyber threats, protecting sensitive information stored in databases is essential to preserve trust, prevent financial losses, and maintain legal compliance. This dissertation explores relational database security. The research introduces a cryptographically secure shuffling algorithm designed to fortify database security. Additionally, the dissertation presents a series of solutions aimed at bolstering both the security and efficiency of the shuffling algorithm. Encryption algorithms have long served as a means of safeguarding sensitive …


Cybersecurity Safeguards: What Cybersecurity Safeguards Could Have Prevented The Intelligence/Data Breach By A Member Of The Air National Guard, Christopher Curtis Royal Aug 2023

Cyber Operations and Resilience Program Graduate Projects

Jack Teixeira, a 21-year-old IT specialist in the Air National Guard, found himself on the wrong side of US law after sharing classified and extremely sensitive information about the USA's operations and role in the war between Ukraine and Russia. Like previous leaks of classified intelligence, the Teixeira case raises concerns about the weaknesses and vulnerability of federal agencies' IT systems and the security protocols governing access to classified documents. Internal leaks of such classified documents hurt national security and can harm the country, especially when secret intelligence finds its way into the hands of enemies. Unauthorized …


Fintech Data Infrastructure For Esg Disclosure Compliance, Randall E. Duran, Peter Tierney Aug 2023

Research Collection School Of Computing and Information Systems

Regulations related to the disclosure of environmental, social, and governance (ESG) factors are evolving rapidly and are a major concern for financial compliance worldwide. Information technology has the potential to reduce the effort and cost of ESG disclosure compliance. However, comprehensive and accurate ESG data are necessary for disclosures. Currently, the availability and quality of underlying data for ESG disclosures vary widely and are often deficient. The process of obtaining ESG data is also often inefficient and prone to error. This paper compares the models used and the evolution of Fintech data infrastructure developed to support financial services with …


Data-Optimized Spatial Field Predictions For Robotic Adaptive Sampling: A Gaussian Process Approach, Zachary Nathan May 2023

Computer Science Senior Theses

We introduce a framework that combines Gaussian Process models, robotic sensor measurements, and sampling data to predict spatial fields. In this context, a spatial field refers to the distribution of a variable throughout a specific area, such as temperature or pH variations over the surface of a lake. Whereas existing methods tend to analyze only the particular field(s) of interest, our approach optimizes predictions through the effective use of all available data. We validated our framework on several datasets, showing that errors can decline by up to two-thirds through the inclusion of additional colocated measurements. In support of adaptive sampling, …


Development Of Sensing And Programming Activities For Engineering Technology Pathways Using A Virtual Arduino Simulation Platform, Murat Kuzlu, Vukica Jovanovic, Otilia Popescu, Salih Sarp Jan 2023

Engineering Technology Faculty Publications

The Arduino platform has long been an efficient tool in teaching electrical engineering technology, electrical engineering, and computer science concepts in schools and universities and introducing new learners to programming and microcontrollers. Numerous Arduino projects are widely available through the open-source community, and they can help students to have hands-on experience in building circuits and programming electronics with a wide variety of topics that can make learning electrical prototyping fun. The educational fields of electrical engineering and electrical engineering technology need continuous updating to keep up with the continuous evolution of the computer system. Although the traditional Arduino platform has …


Big Data Analytics Of Medical Data, Ashwin Rajasankar Dec 2022

Culminating Experience Projects

Data has become a huge part of modern decision making. With the improvements in computing performance and storage in the past two decades, storing large amounts of data has become much easier. Analyzing large amounts of data and creating data models with them can help organizations obtain insights and information that help their decision making. Big data analytics has become an integral part of many fields such as retail, real estate, education, and medicine. In this project, the goal is to understand the working of Apache Spark and its different storage methods and create a data warehouse to analyze data. …


Data Vu: Why Breaches Involve The Same Stories Again And Again, Woodrow Hartzog, Daniel Solove Jul 2022

Shorter Faculty Works

In the classic comedy Groundhog Day, protagonist Phil, played by Bill Murray, asks “What would you do if you were stuck in one place and every day was exactly the same, and nothing that you did mattered?” In this movie, Phil is stuck reliving the same day over and over, where the events repeat in a continual loop, and nothing he does can stop them. Phil’s predicament sounds a lot like our cruel cycle with data breaches.

Every year, organizations suffer more data spills and attacks, with personal information being exposed and abused at alarming rates. While Phil …


Mapping The Covid-19 Pandemic In Staten Island, Vincenzo Mezzio May 2022

Student Theses

COVID-19 has had diverging effects in New York City. Out of the five boroughs, Staten Island has one of the largest percentages of COVID-19 cases relative to population. This research examines key social and spatial factors that contribute to the increase in COVID-19 cases in Staten Island. It asks: Which parts of Staten Island have higher rates of transmission of COVID-19? Which parts of the borough have larger populations that are more vulnerable to COVID-19? What is the relationship between the location of vaccination centers and the rates of COVID-19 cases? Using Geographic Information Systems (GIS), this research examines the …


Performance Improvements In Inner Product Encryption, Serena Riback Apr 2022

Honors Scholar Theses

Consider a database that contains thousands of entries of the iris biometric. Each entry identifies an individual, so it is especially important that it remains secure. However, searching for entries in an encrypted database poses a security problem: how should one search encrypted data without leaking any information to a potential attacker? The proximity searchable encryption scheme, as discussed in the work by Cachet et al., uses the notions of inner product encryption developed by Kim et al. In this paper, we will focus on the efficiency of these schemes. Specifically, how the symmetry of the bilinear …


Building Capacity For Data-Driven Scholarship, Jamie Rogers Mar 2022

Works of the FIU Libraries

This talk provides an overview of "dLOC as Data: A Thematic Approach to Caribbean Newspapers," an initiative developed to increase access to digitized Caribbean newspaper text for bulk download, facilitating computational analysis. Capacity building for future research in Caribbean Studies being a crucial aspect of this initiative, a thematic toolkit was developed to facilitate use of the project data as well as provide replicable processes. The toolkit includes sample text analysis projects, as well as tutorials and detailed project documentation. While the toolkit focuses on the history of hurricanes and tropical cyclones of the region, the methodologies and tools used …


The Mathematics Of Risk: An Introduction To Guaranteed Data De-Identification, Kristi Thompson Mar 2022

Western Libraries Presentations

This webinar is devoted to the mathematical and theoretical underpinnings of guaranteed data anonymization. Topics covered include an overview of identifiers and quasi-identifiers, an introduction to k-anonymity, a look at some cases where k-anonymity breaks down, and anonymization hierarchies. The presenter will describe a method to assess a survey dataset for anonymization using standard statistical software and consider the question of "anonymization overkill". Much of the academic material looking at data anonymization is quite abstract and aimed at computer scientists, while material aimed at data curators does not always consider recent developments. This webinar is intended to help bridge the …
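
As a rough illustration of the k-anonymity idea mentioned in this abstract (not the presenter's method, which is not given here), the sketch below computes k for a small table: the size of the smallest group of records sharing the same quasi-identifier values. The function name `k_anonymity`, the column names, and the sample records are all hypothetical.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that share
    identical values on every listed quasi-identifier (0 if no records)."""
    if not records:
        return 0
    # Group records by their tuple of quasi-identifier values.
    classes = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(classes.values())


# Hypothetical survey rows; all values are invented for illustration.
survey = [
    {"age_band": "30-39", "postcode": "N6A", "answer": "yes"},
    {"age_band": "30-39", "postcode": "N6A", "answer": "no"},
    {"age_band": "40-49", "postcode": "N6A", "answer": "yes"},
    {"age_band": "40-49", "postcode": "N6A", "answer": "yes"},
    {"age_band": "40-49", "postcode": "N6B", "answer": "no"},
]

# With both quasi-identifiers, one respondent is unique, so k is 1.
k_full = k_anonymity(survey, ["age_band", "postcode"])

# Dropping postcode (one crude form of generalization) raises k to 2.
k_coarse = k_anonymity(survey, ["age_band"])
```

Comparing `k_full` with `k_coarse` shows the usual trade-off: coarsening or dropping quasi-identifiers raises k at the cost of detail in the released data.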


Outvoice: Bringing Transparency To Healthcare, Autumn Clark Feb 2022

Undergraduate Honors Theses

Industries are not incentivized to price reasonably and spend responsibly if consumers do not have the ability to shop around within that industry, and shopping around is not possible without pricing transparency (knowing how much a good or service costs before purchasing it). But in the healthcare industry, we typically default to whichever clinic or hospital is closest, with no prior knowledge of what costs we can expect to incur at that particular institution. According to a poll published by Harvard University, nine out of ten Americans feel the healthcare industry is too opaque and greater transparency is needed.

We …


Membership Application Subscription Based, Vlera Zhubi, Medina Shamolli Oct 2021

UBT International Conference

It's no secret: organizations ranging from small businesses, both public and private, to large businesses are facing new challenges in hiring and managing members and payments. The membership application serves businesses, associations, clubs and functional organizations looking to manage their members. A membership application is an online subscription-based business where people pay for regular access to exclusive content. Online memberships and courses have continued to grow as people are willing to pay for convenience, exclusivity, knowledge and community. They are structured in such a way as to meet the special needs of the members at …


Public Interest Technology – Exploring Covid-19 Health Data, Sarah Zelikovitz Jan 2021

Open Educational Resources

This module is part of an Introduction to Data Science course that covers the different parts of the data science process: data acquisition, cleaning, exploratory data analysis, and modeling. The COVID-19 pandemic has created much interest in public health data, as well as interest in visualization of all types of data. Public health data has a set of challenges that is unique to health data, with HIPAA laws, and real time collection of data. With COVID-19, the challenges are particularly amplified, as data collection and statistics collected are constantly changing in response to feedback from labs, hospitals, drug companies, and …


Law Library Blog (January 2021): Legal Beagle's Blog Archive, Roger Williams University School Of Law Jan 2021

Law Library Newsletters/Blog

No abstract provided.


A Data-Based Guiding Framework For Digital Transformation, Zakaria Maamar, Saoussen Cheikhrouhou, Said Elnaffar Jan 2021

All Works

This paper presents a framework for guiding organizations as they initiate and sustain digital transformation initiatives. Digital transformation is a long-term journey that an organization embarks on when it decides to question its practices in light of management, operation, and technology challenges. The guiding framework stresses the importance of data in any digital transformation initiative by suggesting four stages referred to as collection, processing, storage, and dissemination. Because digital transformation could impact different areas of an organization, for instance business processes and business models, each stage suggests techniques to expose data. Two case studies are adopted in the paper to illustrate …


Security Against Data Falsification Attacks In Smart City Applications, Venkata Praveen Kumar Madhavarapu Jan 2021

Doctoral Dissertations

Smart city applications like smart grid, smart transportation, and healthcare deal with very important data collected from IoT devices. False reporting of data consumption from device failures or by organized adversaries may have drastic consequences on the quality of operations. To deal with this, we propose a coarse-grained and a fine-grained anomaly-based security event detection technique that uses indicators such as deviation and directional change in the time series of the proposed anomaly detection metrics to detect different attacks. We also built a trust scoring metric to filter out the malicious devices. Another challenging problem is injection of …


Data, Stats, Go: Navigating The Intersections Of Cataloging, E-Resource, And Web Analytics Reporting, Rachel S. Evans, Wendy Moore, Jessica Pasquale, Andre Davison Jul 2020

Presentations

Do you trudge through gathering statistics at fiscal or calendar year-end? Do you wonder why you track certain things, thinking many seem outdated or irrelevant? Many places seem to keep counting certain statistics because "that's what they've always done." For e-resources, how do you integrate those with physical counts and reconcile the variations (updated e-resources versus re-cataloged physical items)? What about repository downloads and other web traffic? The quantity of stats that libraries track is staggering and keeps growing. This program will encourage attendees to stop and evaluate what and why they're gathering data and help identify possible alternatives to …


Context Aware Data Generation Through Domain Specific Language, Caleb Druckemiller Apr 2020

Other Student Works

No abstract provided.


Data Governance And The Emerging University, Michael J. Madison Jan 2020

Book Chapters

Knowledge and information governance questions are tractable primarily in institutional terms, rather than in terms of abstractions such as knowledge itself or individual or social interests. This chapter offers the modern research university as an example. Practices of data-intensive research by university-based researchers, sometimes reduced to the popular phrase “Big Data,” pose governance challenges for the university. The chapter situates those challenges in the traditional understanding of the university as an institution for understanding forms and flows of knowledge. At a broad level, the chapter argues that the new salience of data exposes emerging shifts in the social, cultural, and …


A Machine Learning Approach To The Perception Of Phrase Boundaries In Music, Evan Matthew Petratos Jan 2020

Senior Projects Fall 2020

Segmentation is a well-studied area of research for speech, but the segmentation of music has typically been treated as a separate domain, even though the same acoustic cues that constitute information in speech (e.g., intensity, timbre, and rhythm) are present in music. This study aims to close the gap between research on speech segmentation and research on music segmentation. Musicians can discern where musical phrases are segmented. In this study, these boundaries are predicted using an algorithmic, machine learning approach to audio processing of acoustic features. The acoustic features of musical sounds have localized patterns within sections of the music that create aurally …


Complex Systems Analysis In Selected Domains: Animal Biosecurity & Genetic Expression, Luke Trinity Jan 2020

Graduate College Dissertations and Theses

I first broadly define the study of complex systems, identifying language to describe and characterize mechanisms of such systems that is applicable across disciplines. An overview of methods is provided, including the description of a software development methodology which defines how a combination of computer science, statistics, and mathematics is applied to specified domains. This work describes strategies to facilitate timely completion of robust and adaptable projects which vary in complexity and scope. A biosecurity informatics pipeline is outlined, which is an abstraction useful in organizing the analysis of biological data from cells. This is followed by specific applications of …


A Data Analysis Of The World Happiness Index And Its Relation To The North-South Divide, Charles Alba Dec 2019

Undergraduate Economic Review

In this document, we perform a detailed data analysis of the World Happiness Report and its relation to the socio-economic North-South Divide. To do so, we perform extensive data cleaning and analysis before querying the World Happiness Report. Our results, based on hypothesis testing, determine that the happiness of the Global North is greater than that of the Global South. Furthermore, our queries show that the mean happiness score for the Global North significantly outweighs that of the South. Likewise, the 10 'Happiest' nations all belong to the Global North whereas the 10 'least happy' nations belong …


Using Big Data Analytics To Improve Hiv Medical Care Utilisation In South Carolina: A Study Protocol, Bankole Olatosi, Jiajia Zhang, Sharon Weissman, Jianjun Hu, Mohammad Rifat Haider, Xiaoming Li Jun 2019

Faculty Publications

Introduction: Linkage and retention in HIV medical care remain problematic in the USA. Extensive health utilisation data collection through electronic health records (EHR) and claims data represents new opportunities for scientific discovery. Big data science (BDS) is a powerful tool for investigating HIV care utilisation patterns. The South Carolina (SC) office of Revenue and Fiscal Affairs (RFA) data warehouse captures individual-level longitudinal health utilisation data for persons living with HIV (PLWH). The data warehouse includes EHR, claims and data from private institutions, housing, prisons, mental health, Medicare, Medicaid, State Health Plan and the department of health and human services. The …


Big Data And The Consumer, Seema Chokshi Apr 2019

MITB Thought Leadership Series

What is big data? The intuitive meaning of the phrase ‘big data’ might be “data that is huge in quantity”. But is that interpretation enough? Data of this type has existed for as long as humans have made records of their work. Some of the earliest writings, such as cuneiform, contain vast amounts of data covering areas as diverse as law, mapping and mathematical equations.


Representation And Reconstruction Of Linear, Time-Invariant Networks, Nathan Scott Woodbury Apr 2019

Theses and Dissertations

Network reconstruction is the process of recovering a unique structured representation of some dynamic system using input-output data and some additional knowledge about the structure of the system. Many network reconstruction algorithms have been proposed in recent years, most dealing with the reconstruction of strictly proper networks (i.e., networks that require delays in all dynamics between measured variables). However, no reconstruction technique presently exists capable of recovering both the structure and dynamics of networks where links are proper (delays in dynamics are not required) and not necessarily strictly proper. The ultimate objective of this dissertation is to develop algorithms capable of …


A Bottom-Up Modeling Methodology Using Knowledge Graphs For Composite Metric Development Applied To Traffic Crashes In The State Of Texas, Daniel Michael Mejia Jan 2019

Open Access Theses & Dissertations

Data is a key factor for understanding real-world phenomena. Data can be discovered and integrated from multiple sources and has the potential to be interpreted in a multitude of ways. Traffic crashes, for example, are common events that occur in cities and provide a significant amount of data that has potential to be analyzed and disseminated in a way that can improve mobility of people, and ultimately improve the quality of life. Improving the quality of life of city residents through the use of data and technology is at the core of Smart Cities solutions. Measuring the improvement that Smart …