Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Computer Sciences

Data

Articles 1 - 30 of 44

Full-Text Articles in Entire DC Network

Data Engineering: Building Software Efficiency In Medium To Large Organizations, Alessandro De La Torre Apr 2024

Whittier Scholars Program

The introduction of PoetHQ, a mobile application, offers an economical strategy for colleges, potentially ushering in significant cost savings. These savings could be redirected towards enhancing academic programs and services, enriching the educational landscape for students. PoetHQ aims to democratize access to crucial software, effectively removing financial barriers and facilitating a richer educational experience. By providing an efficient software solution that reduces organizational overhead while maximizing accessibility for students, the project highlights the essential role of equitable education and resource optimization within academic institutions.


Infusing Machine Learning And Computational Linguistics Into Clinical Notes, Funke V. Alabi, Onyeka Omose, Omotomilola Jegede Jan 2024

Mathematics & Statistics Faculty Publications

Entering free-form text notes into Electronic Health Records (EHR) systems takes a lot of clinicians' time. A large portion of this paperwork is viewed as a burden, which cuts into the amount of time doctors spend with patients and increases the risk of burnout. We examine how machine learning and computational linguistics can be infused into the process of taking clinical notes. We present a new language modeling task that predicts the content of notes conditioned on historical data from a patient's medical record, such as patient demographics, lab results, medications, and previous notes, with the …


Delivering Healthcare To The Underserved, Edward Booty Nov 2023

Asian Management Insights

Non-profits, governments, and businesses need to come together and use a data-driven approach to improve local basic healthcare access.


Cybersecurity Safeguards: What Cybersecurity Safeguards Could Have Prevented The Intelligence/Data Breach By A Member Of The Air National Guard, Christopher Curtis Royal Aug 2023

Cyber Operations and Resilience Program Graduate Projects

Jack Teixeira, a 21-year-old IT specialist in the Air National Guard, found himself on the wrong side of US law after sharing what is considered classified and extremely sensitive information about the USA's operations and role in the Ukraine-Russia war. Like previous cases of leaked classified intelligence, the Teixeira case raises concerns about the weaknesses and vulnerabilities of federal agencies' IT systems and the security protocols governing access to classified documents. Internal leaks of such classified documents hurt national security and can harm the country, especially when such secret intelligence finds its way into the hands of enemies. Unauthorized …


Fintech Data Infrastructure For ESG Disclosure Compliance, Randall E. Duran, Peter Tierney Aug 2023

Research Collection School Of Computing and Information Systems

Regulations related to the disclosure of environmental, social, and governance (ESG) factors are evolving rapidly and are a major concern for financial compliance worldwide. Information technology has the potential to reduce the effort and cost of ESG disclosure compliance. However, comprehensive and accurate ESG data are necessary for disclosures. Currently, the availability and quality of underlying data for ESG disclosures vary widely and are often deficient. The process of obtaining ESG data is also often inefficient and prone to error. This paper compares the models used and the evolution of Fintech data infrastructure developed to support financial services with …


Development Of Sensing And Programming Activities For Engineering Technology Pathways Using A Virtual Arduino Simulation Platform, Murat Kuzlu, Vukica Jovanovic, Otilia Popescu, Salih Sarp Jan 2023

Engineering Technology Faculty Publications

The Arduino platform has long been an efficient tool in teaching electrical engineering technology, electrical engineering, and computer science concepts in schools and universities and introducing new learners to programming and microcontrollers. Numerous Arduino projects are widely available through the open-source community, and they can help students to have hands-on experience in building circuits and programming electronics with a wide variety of topics that can make learning electrical prototyping fun. The educational fields of electrical engineering and electrical engineering technology need continuous updating to keep up with the continuous evolution of the computer system. Although the traditional Arduino platform has …


Data Vu: Why Breaches Involve The Same Stories Again And Again, Woodrow Hartzog, Daniel Solove Jul 2022

Shorter Faculty Works

In the classic comedy Groundhog Day, protagonist Phil, played by Bill Murray, asks “What would you do if you were stuck in one place and every day was exactly the same, and nothing that you did mattered?” In this movie, Phil is stuck reliving the same day over and over, where the events repeat in a continual loop, and nothing he does can stop them. Phil’s predicament sounds a lot like our cruel cycle with data breaches.

Every year, organizations suffer more data spills and attacks, with personal information being exposed and abused at alarming rates. While Phil …


Performance Improvements In Inner Product Encryption, Serena Riback Apr 2022

Honors Scholar Theses

Consider a database that contains thousands of entries of the iris biometric. Each entry identifies an individual, so it is especially important that it remains secure. However, searching for entries in an encrypted database poses a security problem: how should one search encrypted data without leaking any information to a potential attacker? The proximity searchable encryption scheme, as discussed in the work by Cachet et al., uses the notions of inner product encryption developed by Kim et al. In this paper, we will focus on the efficiency of these schemes. Specifically, how the symmetry of the bilinear …


Building Capacity For Data-Driven Scholarship, Jamie Rogers Mar 2022

Works of the FIU Libraries

This talk provides an overview of "dLOC as Data: A Thematic Approach to Caribbean Newspapers," an initiative developed to increase access to digitized Caribbean newspaper text for bulk download, facilitating computational analysis. Capacity building for future research in Caribbean Studies being a crucial aspect of this initiative, a thematic toolkit was developed to facilitate use of the project data as well as provide replicable processes. The toolkit includes sample text analysis projects, as well as tutorials and detailed project documentation. While the toolkit focuses on the history of hurricanes and tropical cyclones of the region, the methodologies and tools used …


The Mathematics Of Risk: An Introduction To Guaranteed Data De-Identification, Kristi Thompson Mar 2022

Western Libraries Presentations

This webinar is devoted to the mathematical and theoretical underpinnings of guaranteed data anonymization. Topics covered include an overview of identifiers and quasi-identifiers, an introduction to k-anonymity, a look at some cases where k-anonymity breaks down, and anonymization hierarchies. The presenter will describe a method to assess a survey dataset for anonymization using standard statistical software and consider the question of "anonymization overkill". Much of the academic material looking at data anonymization is quite abstract and aimed at computer scientists, while material aimed at data curators does not always consider recent developments. This webinar is intended to help bridge the …
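As a rough illustration of the k-anonymity idea mentioned in this abstract, the following Python sketch counts how many records share each combination of quasi-identifier values and reports the smallest group size as k. It is a minimal sketch and not the presenter's method: the function name `k_anonymity`, the column names, and the survey rows are all hypothetical, and generalizing the postcode to its first two characters is just one assumed step in an anonymization hierarchy.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that share
    the same values on every quasi-identifier (0 for an empty dataset)."""
    if not records:
        return 0
    classes = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(classes.values())


# Hypothetical survey rows; column names and values are made up.
survey = [
    {"age_band": "30-39", "postcode": "N6A", "sex": "F", "income": 52000},
    {"age_band": "30-39", "postcode": "N6A", "sex": "F", "income": 61000},
    {"age_band": "40-49", "postcode": "N6B", "sex": "M", "income": 48000},
    {"age_band": "40-49", "postcode": "N6B", "sex": "M", "income": 75000},
    {"age_band": "40-49", "postcode": "N6C", "sex": "M", "income": 58000},
]
quasi = ["age_band", "postcode", "sex"]

# With full postcodes, one respondent is unique on the quasi-identifiers.
k_full = k_anonymity(survey, quasi)

# Assumed generalization step: keep only the first two postcode characters.
generalized = [{**row, "postcode": row["postcode"][:2]} for row in survey]
k_general = k_anonymity(generalized, quasi)

print(k_full, k_general)
```

In this toy data the coarser postcode merges the unique respondent into a larger group, which is the kind of trade-off between detail and re-identification risk that an anonymization hierarchy is meant to manage.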


Public Interest Technology – Exploring Covid-19 Health Data, Sarah Zelikovitz Jan 2021

Open Educational Resources

This module is part of an Introduction to Data Science course that covers the different parts of the data science process: data acquisition, cleaning, exploratory data analysis, and modeling. The COVID-19 pandemic has created much interest in public health data, as well as interest in visualization of all types of data. Public health data has a set of challenges that is unique to health data, including HIPAA laws and real-time collection of data. With COVID-19, the challenges are particularly amplified, as data collection and statistics collected are constantly changing in response to feedback from labs, hospitals, drug companies, and …


Law Library Blog (January 2021): Legal Beagle's Blog Archive, Roger Williams University School Of Law Jan 2021

Law Library Newsletters/Blog

No abstract provided.


A Data-Based Guiding Framework For Digital Transformation, Zakaria Maamar, Saoussen Cheikhrouhou, Said Elnaffar Jan 2021

All Works

This paper presents a framework for guiding organizations to initiate and sustain digital transformation initiatives. Digital transformation is a long-term journey that an organization embarks on when it decides to question its practices in light of management, operation, and technology challenges. The guiding framework stresses the importance of data in any digital transformation initiative by suggesting four stages referred to as collection, processing, storage, and dissemination. Because digital transformation could impact different areas of an organization, for instance business processes and business models, each stage suggests techniques to expose data. Two case studies are adopted in the paper to illustrate …


Data, Stats, Go: Navigating The Intersections Of Cataloging, E-Resource, And Web Analytics Reporting, Rachel S. Evans, Wendy Moore, Jessica Pasquale, Andre Davison Jul 2020

Presentations

Do you trudge through gathering statistics at fiscal or calendar year-end? Do you wonder why you track certain things, thinking many seem outdated or irrelevant? Many places seem to keep counting certain statistics because "that's what they've always done." For e-resources, how do you integrate those with physical counts and reconcile the variations (updated e-resources versus re-cataloged physical items)? What about repository downloads and other web traffic? The quantity of stats that libraries track is staggering and keeps growing. This program will encourage attendees to stop and evaluate what and why they're gathering data and help identify possible alternatives to …


Context Aware Data Generation Through Domain Specific Language, Caleb Druckemiller Apr 2020

Other Student Works

No abstract provided.


Data Governance And The Emerging University, Michael J. Madison Jan 2020

Book Chapters

Knowledge and information governance questions are tractable primarily in institutional terms, rather than in terms of abstractions such as knowledge itself or individual or social interests. This chapter offers the modern research university as an example. Practices of data-intensive research by university-based researchers, sometimes reduced to the popular phrase “Big Data,” pose governance challenges for the university. The chapter situates those challenges in the traditional understanding of the university as an institution for understanding forms and flows of knowledge. At a broad level, the chapter argues that the new salience of data exposes emerging shifts in the social, cultural, and …


Using Big Data Analytics To Improve HIV Medical Care Utilisation In South Carolina: A Study Protocol, Bankole Olatosi, Jiajia Zhang, Sharon Weissman, Jianjun Hu, Mohammad Rifat Haider, Xiaoming Li Jun 2019

Faculty Publications

Introduction Linkage and retention in HIV medical care remains problematic in the USA. Extensive health utilisation data collection through electronic health records (EHR) and claims data represent new opportunities for scientific discovery. Big data science (BDS) is a powerful tool for investigating HIV care utilisation patterns. The South Carolina (SC) office of Revenue and Fiscal Affairs (RFA) data warehouse captures individual-level longitudinal health utilisation data for persons living with HIV (PLWH). The data warehouse includes EHR, claims and data from private institutions, housing, prisons, mental health, Medicare, Medicaid, State Health Plan and the department of health and human services. The …


Big Data And The Consumer, Seema Chokshi Apr 2019

MITB Thought Leadership Series

What is big data? The intuitive meaning of the phrase ‘big data’ might be “data that is huge in quantity”. But is that interpretation enough? Data of this type has existed for as long as humans have made records of their work. Some of the earliest writings, such as cuneiform, contain vast amounts of data covering areas as diverse as law, mapping and mathematical equations.


Data Insertion In Bitcoin's Blockchain, Andrew Sward, Vecna Op_0, Forrest Stonedahl Jul 2017

Computer Science: Faculty Scholarship & Creative Works

This paper provides the first comprehensive survey of methods for inserting arbitrary data into Bitcoin's blockchain. Historical methods of data insertion are described, along with lesser-known techniques that are optimized for efficiency. Insertion methods are compared on the basis of efficiency, cost, convenience of data reconstruction, permanence, and potentially negative impact on the Bitcoin ecosystem.


Databrarianship: The Academic Data Librarian In Theory And Practice, Darren Sweeper Dec 2016

Sprague Library Scholarship and Creative Works

No abstract provided.


Data Visualizations And Infographics, Darren Sweeper Sep 2016

Sprague Library Scholarship and Creative Works

No abstract provided.


Alignment For Comprehensive Two-Dimensional Gas Chromatography (GCxGC) With Global, Low-Order Polynomial Transformations, Davis Rempe, Stephen Reichenbach, Stephen Scott Apr 2016

UCARE Research Products

As columns age and differ between systems, retention times for GC x GC may vary between runs. In order to properly analyze chromatograms, it is often desirable to align chromatographic features between chromatograms. This alignment can be characterized by a mapping of retention times from one chromatogram to the retention times of another chromatogram. Alignment methods can be classified as global or local, i.e., whether the geometric differences between chromatograms are characterized by a single function for the entire chromatogram or by a combination of many functions for different regions of the chromatogram. Previous work has shown that global, low-degree …


Forecasting Internal Temperature In A Home With A Sensor Network, Bruce Spencer, Omar Alfandi Jan 2016

All Works

© 2016 The Authors. We forecast internal temperature in a home with sensors, modeled as a linear function of recent sensor values. The Smart* Project provides publicly available data from an inhabited home over a three-month period, reporting on 38 sensors including environmental readings, circuit loads, motion detectors, and switches controlling lights and fans. We select 13 of these sensors that have some influence on the internal temperature, and create forecasts that are accurate to within about 1.6°F (0.9°C) over the next six hours. Temperature prediction is important for saving energy while maintaining comfortable conditions in the home.
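To make "modeled as a linear function of recent sensor values" concrete, here is a minimal Python sketch of an ordinary least-squares fit with a single predictor. It is a deliberate simplification under stated assumptions, not the authors' model: the paper uses 13 sensors, whereas this sketch uses one hypothetical outdoor reading, and the helper `fit_line` and all numbers are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up training pairs (degrees F): an outdoor sensor reading now, and the
# indoor temperature observed some hours later. Not data from the paper.
outdoor_now = [50.0, 55.0, 60.0, 65.0, 70.0]
indoor_later = [66.0, 67.0, 68.0, 69.0, 70.0]

intercept, slope = fit_line(outdoor_now, indoor_later)

# Forecast the later indoor temperature from a new outdoor reading.
forecast = intercept + slope * 62.0
print(round(intercept, 2), round(slope, 2), round(forecast, 2))
```

Extending this to several sensors would mean fitting one coefficient per sensor, for example by solving the normal equations or using a linear-algebra library.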


Key-Aggregate Cryptosystem For Scalable Data Sharing In Cloud Storage, Cheng-Kang Chu, Sherman S. M. Chow, Wen-Guey Tzeng, Jiangying Zhou, Robert H. Deng Feb 2014

Research Collection School Of Computing and Information Systems

Data sharing is an important functionality in cloud storage. In this article, we show how to securely, efficiently, and flexibly share data with others in cloud storage. We describe new public-key cryptosystems which produce constant-size ciphertexts such that efficient delegation of decryption rights for any set of ciphertexts is possible. The novelty is that one can aggregate any set of secret keys and make them as compact as a single key, but encompassing the power of all the keys being aggregated. In other words, the secret key holder can release a constant-size aggregate key for flexible choices of ciphertext set …


Big Data: Immediate Opportunities And Longer Term Challenges, Jens Pohl, Kym Jason Pohl Jul 2013

Collaborative Agent Design (CAD) Research Center

The transformation of words, locations, and human interactions into digital data forms the basis of trend detection and information extraction opportunities that can be automated with the increasing availability of relatively inexpensive computer storage and processing technology. Trend detection, which focuses on what, is facilitated by the ability to apply analytics to an entire corpus of data instead of a random sample. Since the corpus essentially includes all data within a population there is no need to apply any of the precautions that are in order to ensure the representativeness of a sample in traditional statistical analysis. Several examples are …


RestFS: Resources And Services Are Filesystems, Too, Joseph P. Kaylor, Konstantin Läufer, George K. Thiruvathukal Mar 2011

Computer Science: Faculty Publications and Other Works

We have designed and implemented RestFS, a software framework that provides a uniform, configurable connector layer for mapping remote web-based resources to local filesystem-based resources, recognizing the similarity between these two types of resources. Such mappings enable programmatic access to a resource, as well as composition of two or more resources, through the local operating system’s standard filesystem application programming interface (API), scriptable file-based command-line utilities, and inter-process communication (IPC) mechanisms. The framework supports automatic and manual authentication. We include several examples intended to show the utility and practicality of our framework.


Solving The Data Deluge Problem, Jens G. Pohl Aug 2010

Collaborative Agent Design (CAD) Research Center

The paper postulates that the information technology revolution that is commonly referred to as the Information Age is currently in a transition stage between data-processing and knowledge management that should be more aptly referred to as the Data Age. Symptoms of this transition stage are a data deluge problem that is evidenced by the inability of human computer-users to effectively analyze and draw useful conclusions from the overwhelming volume of data that is being collected, the increasing complexity of networked systems, and the acknowledged vulnerability of virtually all existing digital systems to cyber security threats.

The author suggests that the …


The Representation Of Context In Computer Software, Hisham Assal, Kym Pohl, Jens G. Pohl Aug 2009

Collaborative Agent Design (CAD) Research Center

Computers do not have the equivalent of a human cognitive system and therefore store data simply as the numbers and words that are entered into the computer. For a computer to interpret data it requires an information structure that provides at least some level of context. This can be accomplished utilizing an ontology of objects with characteristics, semantic behavior, and a rich set of relationships to create a virtual version of real world situations and provide the context within which intelligent logic (e.g., agents) can automatically operate.

This paper discusses the process of developing ontologies that serve to …


Data For Cybersecurity Research: Process And ‘Wish List’, Jean Camp, Lorrie Cranor, Nick Feamster, Joan Feigenbaum, Stephanie Forrest, David Kotz, Wenke Lee, Patrick Lincoln, Vern Paxson, Mike Reiter, Ron Rivest, William Sanders, Stefan Savage, Sean Smith, Eugene Spafford, Sal Stolfo Jun 2009

Other Faculty Materials

This document identifies data needs of the security research community. This document is in response to a request for a “data wish list”. Because specific data needs will evolve in conjunction with evolving threats and research problems, we augment the wish list with commentary about some of the broader issues for data usage.


Alternative Paths To Intelligent Systems, Jens G. Pohl Jul 2007

Collaborative Agent Design (CAD) Research Center

This paper examines the three prevalent approaches to Artificial Intelligence (AI), namely symbolic reasoning systems, connectionist systems, and emergent systems based on the principles of the subsumption theory. Distinguished by their top-down and bottom-up mechanisms all three approaches have strengths and weaknesses. While the logical reasoning approach is precise and well supported by mathematical theories and procedures, it is constrained by a largely predefined representational model. Connectionist systems, on the other hand, are able to recognize patterns even if these patterns are only similar and not identical to the patterns that they have been trained to recognize, but they have …