Open Access. Powered by Scholars. Published by Universities.®

Data Storage Systems Commons

1,465 Full-Text Articles 3,568 Authors 520,855 Downloads 119 Institutions

All Articles in Data Storage Systems

Mitigating Popularity Bias In Recommendation With Unbalanced Interactions: A Gradient Perspective, Weijieying REN, Lei WANG, Kunpeng LIU, Ruocheng GUO, Ee-peng LIM, Yanjie FU 2022 University of Central Florida

Mitigating Popularity Bias In Recommendation With Unbalanced Interactions: A Gradient Perspective, Weijieying Ren, Lei Wang, Kunpeng Liu, Ruocheng Guo, Ee-Peng Lim, Yanjie Fu

Research Collection School Of Computing and Information Systems

Recommender systems learn from historical user-item interactions to identify preferred items for target users. These observed interactions are usually unbalanced, following a long-tailed distribution. Such long-tailed data lead to popularity bias: models recommend popular but not personalized items to users. We present a gradient perspective to understand two negative impacts of popularity bias in recommendation model optimization: (i) the gradient direction of popular item embeddings is closer to that of positive interactions, and (ii) the magnitude of the positive gradient for popular items is much greater than that for unpopular items. To address these issues, we propose a simple yet efficient …


Scalable Data-Driven Predictive Modeling And Analytics For CHO Process Development Optimization, Sarah Mbiki 2022 Clemson University

Scalable Data-Driven Predictive Modeling And Analytics For CHO Process Development Optimization, Sarah Mbiki

All Dissertations

In 1982, the FDA approved the first recombinant therapeutic protein, and since then, the biopharmaceutical industry has continued to develop innovative and highly effective biological drugs for various illnesses [1]. These drugs are produced using host organisms that are modified to hold the genetic encoding of the targeted protein [1]. Of the many host organisms, Chinese hamster ovary (CHO) cells are often used due to their capability to perform posttranslational modification (PTM), which allows human-like synthesis of proteins unlikely to invoke immunogenicity in humans [1,2].

Despite all the positive attributes, many challenges are associated with CHO cell cultures, …


Redefining Research In Nanotechnology Simulations: A New Approach To Data Caching And Analysis, Darin Tsai, Alan Zhang, Aloysius Rebeiro 2022 Purdue University

Redefining Research In Nanotechnology Simulations: A New Approach To Data Caching And Analysis, Darin Tsai, Alan Zhang, Aloysius Rebeiro

The Journal of Purdue Undergraduate Research

No abstract provided.


Data Scarcity In Event Analysis And Abusive Language Detection, Sheikh Muhammad Sarwar 2022 University of Massachusetts Amherst

Data Scarcity In Event Analysis And Abusive Language Detection, Sheikh Muhammad Sarwar

Doctoral Dissertations

Lack of data is almost always the cause of the suboptimal performance of neural networks. Even though data-scarce scenarios can be simulated for any task by assuming limited access to training data, we study two problem areas where data scarcity is a practical challenge: event analysis and abusive content detection. Journalists, social scientists and political scientists need to retrieve and analyze event mentions in unstructured text to compute useful statistical information to understand society. We claim that it is hard to specify an information need about events using a keyword-based representation and propose a Query by Example (QBE) setting for event …


Towards QoS-Based Embedded Machine Learning, Tom Springer, Erik Linstead, Peiyi Zhao, Chelsea Parlett-Pelleriti 2022 Chapman University

Towards QoS-Based Embedded Machine Learning, Tom Springer, Erik Linstead, Peiyi Zhao, Chelsea Parlett-Pelleriti

Engineering Faculty Articles and Research

Due to various breakthroughs and advancements in machine learning and computer architectures, machine learning models are beginning to proliferate through embedded platforms. Some of these machine learning models cover a range of applications including computer vision, speech recognition, healthcare efficiency, industrial IoT, robotics and many more. However, there is a critical limitation in implementing ML algorithms efficiently on embedded platforms: the computational and memory expense of many machine learning models can make them unsuitable in resource-constrained environments. Therefore, to efficiently implement these memory-intensive and computationally expensive algorithms in an embedded computing environment, innovative resource management techniques are required at the …


Bitrdf: Extending RDF For Bitemporal Data, Di Wu 2022 The Graduate Center, City University of New York

Bitrdf: Extending RDF For Bitemporal Data, Di Wu

Dissertations, Theses, and Capstone Projects

The Internet is not only a platform for communication, transactions, and cloud storage, but it is also a large knowledge store where people as well as machines can create, manipulate, infer, and make use of data and knowledge. The Semantic Web was developed for this purpose. It aims to help machines understand the meaning of data and knowledge so that machines can use the data and knowledge in decision making. The Resource Description Framework (RDF) forms the foundation of the Semantic Web which is organized as the Semantic Web Layer Cake. RDF is limited and can only express a binary …


Artificial Justice: The Quandary Of AI In The Courtroom, Paul W. Grimm, Maura R. Grossman, Sabine Gless, Mireille Hildebrandt 2022 Duke Law

Artificial Justice: The Quandary Of AI In The Courtroom, Paul W. Grimm, Maura R. Grossman, Sabine Gless, Mireille Hildebrandt

Judicature International

No abstract provided.


An Attribute-Aware Attentive GCN Model For Attribute Missing In Recommendation, Fan LIU, Zhiyong CHENG, Lei ZHU, Chenghao LIU, Liqiang NIE 2022 Singapore Management University

An Attribute-Aware Attentive GCN Model For Attribute Missing In Recommendation, Fan Liu, Zhiyong Cheng, Lei Zhu, Chenghao Liu, Liqiang Nie

Research Collection School Of Computing and Information Systems

As important side information, attributes have been widely exploited in existing recommender systems for better performance. However, in real-world scenarios, it is common that some attributes of items or users are missing (e.g., some movies lack genre data). Prior studies usually use a default value (i.e., "other") to represent the missing attribute, resulting in sub-optimal performance. To address this problem, we present an attribute-aware attentive graph convolution network (A²-GCN). In particular, we first construct a graph, where users, items, and attributes are three types of nodes and their associations are edges. Thereafter, we leverage the graph …


Towards Public Vaccination Data Resilience During Natural Disaster Using Blockchain-Based Decentralized Application, Riri Fitri Sari, Mokhammad Rizqi Herdiawan, Muhammad Hamzah, Muhammad Aljundi, Jauzak Hussaini Windiatmaja 2022 University of Indonesia

Towards Public Vaccination Data Resilience During Natural Disaster Using Blockchain-Based Decentralized Application, Riri Fitri Sari, Mokhammad Rizqi Herdiawan, Muhammad Hamzah, Muhammad Aljundi, Jauzak Hussaini Windiatmaja

Smart City

This paper presents a platform development plan for a decentralized public vaccination database based on Blockchain technology. Vaccination is a medical practice that allows a human body to produce a specific antibody as a means of developing a preventive measure and reducing the risk of contracting a particular disease. Systematic vaccination data recording is of utmost importance, particularly during the Covid-19 pandemic. The recording mechanism should utilize state-of-the-art information and communication technology (ICT). For faster vaccination data recording, it is common to use digital technology, making the health care process faster, thus decreasing time …


Credit Card Fraud Detection Using Machine Learning Techniques, Nermin Samy Elhusseny, Shimaa Mohamed Ouf, Amira M. Idrees AMI 2022 BIS Helwan University

Credit Card Fraud Detection Using Machine Learning Techniques, Nermin Samy Elhusseny, Shimaa Mohamed Ouf, Amira M. Idrees Ami

Future Computing and Informatics Journal

This is a systematic literature review to reflect the previous studies that dealt with credit card fraud detection and highlight the different machine learning techniques to deal with this problem. Credit cards are now widely utilized daily. The globe has just begun to shift toward financial inclusion, with marginalized people being introduced to the financial sector. As a result of the high volume of e-commerce, there has been a significant increase in credit card fraud. One of the most important parts of today's banking sector is fraud detection. Fraud is one of the most serious concerns in terms of monetary …


COBOL Cripples The Mind!: Academia And The Alienation Of Data Processing, Neel Shah 2022 Northwestern University

COBOL Cripples The Mind!: Academia And The Alienation Of Data Processing, Neel Shah

Swarthmore Undergraduate History Journal

This paper writes a social history of the programming language COBOL that focuses on its reception in academia. Through this focus, the paper seeks to understand the contentious relationship between data processing and the academy. In historicizing COBOL, the paper also illuminates the changing nature of the academy-industry-military triangle that was a mainstay of early computing.


Measuring The ROI Of Digital Engineering: It's A Journey, Not A Number, Tom McDermott, Kaitlin Henderson, Eileen Van Aken, Alejandro Salado, Joseph Bradley 2022 Old Dominion University

Measuring The ROI Of Digital Engineering: It's A Journey, Not A Number, Tom McDermott, Kaitlin Henderson, Eileen Van Aken, Alejandro Salado, Joseph Bradley

Engineering Management & Systems Engineering Faculty Publications

Systems engineering as a discipline has long had difficulty providing quantifiable evidence of its value (Honour 2004); digital engineering (DE) transformation provides an opportunity to better measure that value. Transitioning from a document-based to a model-based approach is expensive, and organizations want to know whether the effort and cost of adopting model-based systems engineering (MBSE) are worth it.


Finding Top-M Leading Records In Temporal Data, Yiyi WANG 2022 Singapore Management University

Finding Top-M Leading Records In Temporal Data, Yiyi Wang

Dissertations and Theses Collection (Open Access)

A traditional top-k query retrieves the records that stand out at a certain point in time. On the other hand, a durable top-k query considers how long the records retain their supremacy, i.e., it reports those records that are consistently among the top-k in a given time interval. In this thesis, we introduce a new query to the family of durable top-k formulations. It finds the top-m leading records, i.e., those that rank among the top-k for the longest duration within the query interval. Practically, this query assesses the records based on how long …
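The query described in this abstract can be illustrated with a brute-force sketch. This is not the thesis's algorithm: it assumes discrete, equally spaced time steps, and the function name `top_m_leading`, the `series` layout, and the tie-breaking by record id are hypothetical choices made only for illustration.

```python
from collections import Counter


def top_m_leading(series, k, m, start, end):
    """Return the m record ids that spend the most time steps in the top-k.

    series: dict mapping record id -> list of scores, one per time step
            (hypothetical layout; higher score ranks higher).
    start, end: inclusive bounds of the query interval, as time-step indices.
    """
    durations = Counter()
    for t in range(start, end + 1):
        # Rank all records at time t; ties are broken by record id.
        ranked = sorted(series, key=lambda r: (-series[r][t], r))
        for record in ranked[:k]:
            durations[record] += 1
    # Order by time spent in the top-k (descending), then by record id.
    leaders = sorted(durations, key=lambda r: (-durations[r], r))
    return leaders[:m]


scores = {
    "a": [9, 8, 7, 1],
    "b": [5, 9, 8, 9],
    "c": [7, 1, 9, 8],
}
print(top_m_leading(scores, k=2, m=2, start=0, end=3))  # ['b', 'c']
```

Here "b" and "c" each sit in the top-2 for three of the four time steps, while "a" does so for only two, so the query with m=2 reports "b" and "c". The sketch re-sorts every record at every time step, which is only practical for small inputs.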


Spotlight Report #6: Proffering Machine-Readable Personal Privacy Research Agreements: Pilot Project Findings For IEEE P7012 WG, Noreen Y. Whysel, Lisa LeVasseur 2022 Internet Safety Labs

Spotlight Report #6: Proffering Machine-Readable Personal Privacy Research Agreements: Pilot Project Findings For IEEE P7012 WG, Noreen Y. Whysel, Lisa LeVasseur

Publications and Research

What if people had the ability to assert their own legally binding permissions for data collection, use, sharing, and retention by the technologies they use? The IEEE P7012 working group has been developing an interoperability specification for machine-readable personal privacy terms to support this ability since 2018. The premise behind the work of IEEE P7012 is that people need technology that works on their behalf, i.e., software agents that assert the individual’s permissions and preferences in a machine-readable format.

Thanks to a grant from the IEEE Technical Activities Board Committee on Standards (TAB CoS), we were able to explore the attitudes of …


Total Sky Imager Project, Ryan D. Maier, Benjamin Jack Forest, Kyle X. McGrath 2022 California Polytechnic State University, San Luis Obispo

Total Sky Imager Project, Ryan D. Maier, Benjamin Jack Forest, Kyle X. McGrath

Mechanical Engineering

Solar farms like the Gold Tree Solar Farm at Cal Poly San Luis Obispo have difficulty delivering a consistent level of power output. Cloudy days can trigger a significant drop in the utility of a farm’s solar panels, and an unexpected loss of power from the farm could potentially unbalance the electrical grid. Being able to predict these power output drops in advance could provide valuable time to prepare a grid and keep it stable. Furthermore, with modern data analysis methods such as machine learning, these predictions are becoming more and more accurate – given a sufficient data set. The …


Who Is Missing? Characterizing The Participation Of Different Demographic Groups In A Korean Nationwide Daily Conversation Corpus, Haewoon KWAK, Jisun AN, Kunwoo PARK 2022 Singapore Management University

Who Is Missing? Characterizing The Participation Of Different Demographic Groups In A Korean Nationwide Daily Conversation Corpus, Haewoon Kwak, Jisun An, Kunwoo Park

Research Collection School Of Computing and Information Systems

A conversation corpus is essential to build interactive AI applications. However, the demographic information of the participants in such corpora is largely underexplored mainly due to the lack of individual data in many corpora. In this work, we analyze a Korean nationwide daily conversation corpus constructed by the National Institute of Korean Language (NIKL) to characterize the participation of different demographic (age and sex) groups in the corpus.


Blockchain Storage – Drive Configurations And Performance Analysis, Jesse Garner, Aditya A. Syal, Ronald C. Jones 2022 Harrisburg University of Science and Technology

Blockchain Storage – Drive Configurations And Performance Analysis, Jesse Garner, Aditya A. Syal, Ronald C. Jones

Other Student Works

This project will analyze the results of trials implementing various storage methods on Geth nodes to synchronize and maintain a full-archive state of the Ethereum blockchain. The purpose of these trials is to gain deeper insight into the process of lowering the cost and increasing the efficiency of blockchain storage using available technologies, analyzing the results of various storage drives under similar conditions. It provides performance analysis and describes the performance of each trial in relation to the others.


Data Management In Web Applications To Balance Performance And Security, Caleb Marcoux 2022 University of Nebraska - Lincoln

Data Management In Web Applications To Balance Performance And Security, Caleb Marcoux

Honors Theses

As web applications become increasingly popular and many run calculations and data processing on the client machine, it is important to consider data management practices on the front end of these web applications. Typically, some data from the server is stored in the client's memory or hard disk. How much data should be stored and for how long, as well as many other considerations, influence the time and space performance of the web application, as well as its security. In this thesis, we explore several challenges, solutions, and design patterns in web application data management through the lens of a senior …


How Blockchain Solutions Enable Better Decision Making Through Blockchain Analytics, Sammy Ter Haar 2022 University of Arkansas, Fayetteville

How Blockchain Solutions Enable Better Decision Making Through Blockchain Analytics, Sammy Ter Haar

Information Systems Undergraduate Honors Theses

Since the founding of computers, data scientists have been able to engineer devices that increase individuals’ opportunities to communicate with each other. In the 1990s, the internet took over, with many people not understanding its utility. Fast forward 30 years, and we cannot live without our connection to the internet. Early adopters knew it as the internet of information, with individuals posting blogs for others to read; this was known as Web 1.0. As we progressed, platforms became social, allowing individuals in different areas to communicate and engage with each other; this was known as Web 2.0. As Dr. …


Modeling Damage Spread, Assessment, And Recovery Of Critical Systems, Justin Burns 2022 University of Arkansas, Fayetteville

Modeling Damage Spread, Assessment, And Recovery Of Critical Systems, Justin Burns

Graduate Theses and Dissertations

Critical infrastructure systems have recently become more vulnerable to attacks on their data systems through internet connectivity. If an attacker is successful in breaching a system’s defenses, it is imperative that operations are restored to the system as quickly as possible. This thesis focuses on damage assessment and recovery following an attack. A literature review is first conducted on work done in both database protection and critical infrastructure protection, then the thesis defines how damage affects the relationships between data and software. Then, the thesis proposes a model using a graph construction to show the cascading effects within a system …

