Open Access. Powered by Scholars. Published by Universities.®

Data Science Commons

1,483 Full-Text Articles 2,962 Authors 435,013 Downloads 189 Institutions

All Articles in Data Science

1,483 full-text articles. Page 19 of 73.

Developing A Data-Driven Statistical Model For Accurately Predicting The Superconducting Critical Temperature Of Materials Using Multiple Regression And Gradient-Boosted Methods, Emil Agbemade 2023 University of Central Florida

Data Science and Data Mining

This study focuses on developing a statistical model for estimating the superconducting critical temperature (Tc) of materials using a data-driven strategy. The study analyzed 21,263 superconductors and used a combination of multiple regression and gradient-boosted models to make predictions. The analysis included a descriptive analysis of the distribution of Tc, feature selection using the backward selection method, and model diagnostics. The results showed that the gradient-boosted method outperformed the multiple linear regression method, with an RMSE of 12.01 and an R² value of 88.23 after fine-tuning its hyperparameters. The study concludes that the gradient-boosted method is an effective approach …
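For readers unfamiliar with the two metrics quoted in this abstract, the following is a minimal sketch of how RMSE and R² are conventionally computed. It is not the author's code: the function names and the sample values are hypothetical, invented only for illustration. The listing does not say whether the reported R² of 88.23 is a percentage; the sketch returns R² as a fraction.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical critical temperatures in kelvin (not taken from the study).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

print(rmse(observed, predicted))
print(r_squared(observed, predicted))
```

A lower RMSE and an R² closer to 1 both indicate a closer fit, which is the sense in which the abstract compares the two models.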


Analyzing The Impact Of Health, Economic, And Demographic Factors On Life Expectancy: A Comparative Study Of Developed And Developing Countries, Mahyar Alinejad 2023 University of Central Florida

Data Science and Data Mining

This study presents a comprehensive analysis of three prominent machine learning regression models—Random Forest, XGBoost, and Support Vector Machine (SVM)—in the context of predictive analysis. Leveraging a carefully curated dataset, we explore the impact of various hyperparameters on model performance through an exhaustive tuning process. The Random Forest and XGBoost models exhibit robust predictive capabilities, with the former revealing notable insights through feature importance visualization. Additionally, SVM, optimized via GridSearchCV, demonstrates competitive performance. Evaluation metrics, including Mean Squared Error and R-squared, facilitate a thorough comparison of model efficacy. Results highlight nuanced strengths and weaknesses, informing practitioners on the suitability of …


Should Academia Thrive For Research Citation In Policy? A Case Study On Five Universities In Illinois., Minhaz Suleman Ibrahim Patel 2023 Northern Illinois University

CURE Proceedings

Academics and policymakers are seen as operating separately, which limits the potential impact of research on society. The influence of university research on policy documents is frequently underestimated, given that cutting-edge research is being conducted at universities. Therefore, it is crucial to unveil the role of academic research in fostering evidence-driven policymaking across various public service domains. In this study, we conducted an in-depth exploratory data analysis and statistical summarization to comprehensively understand the level of academic research present in policy documents. We chose five public universities from the state of Illinois and collected research and policy citation data for …


Time Series Forecasting For Stock Market Prices, Albert Zhou 2023 John Carroll University

Senior Honors Projects

No abstract provided.


Warehouses In The Inland Empire: Displacing Land And Life, Katherine Gelsey 2023 Claremont Colleges

Pomona Senior Theses

The Inland Empire in Southern California embodies unique spatial and social configurations as a consequence of how settler colonialism has manifested locally in the region since the Spanish Mission Period. This work uses GIS software to estimate patterns of land conversion for residential, agricultural, and warehouse land from 2012 to 2022. Preliminary analysis suggests that thousands of people have been displaced by warehouse expansion over the ten-year period. In the twenty-first century, the Southern California logistics industry continues processes of land dispossession and racialized labor exploitation through displacing agricultural and residential land, exposing disproportionately low-income Black and Latine communities living …


Teaching Analytics Online: A Self-Study Of Professional Practice, Andrew J. Collins, Brandon Butler, James F. Leathrum Jr., Christopher J. Lynch 2023 Old Dominion University

Engineering Management & Systems Engineering Faculty Publications

As the COVID-19 pandemic caused severe disruption to education enterprises throughout the world, the main response by educational institutions was to move to online learning environments. The purpose of this study was to understand better how instructors could improve online learning for a professional-level week-long short course in a highly technical area (data analytics), which had, pre-COVID, been a hands-on computer, laboratory-based learning experience. The authors used self-study of professional practice to elicit and understand the major issues and concerns of the transition to an online learning environment. Under the guidance of a colleague in teacher education, three course instructors …


Readiness For Transfer: A Mixed-Methods Study On Icu Transfers Of Care, Soo-Hoon Lee, Clarice Wee, Phillip Phan, Yanika Kowitlawakul, Chee-Kiat Tan, Amartya Mukhopadhyay 2023 Old Dominion University

Management Faculty Publications

Objective: Past studies on intensive care unit (ICU) patient transfers compare the efficacy of using standardised checklists against unstructured communications. Less studied are the experiences of clinicians in enacting bidirectional (send/receive) transfers. This study reports on the differences in protocols and data elements between receiving and sending transfers in the ICU, and the elements constituting readiness for transfer.

Methods: Mixed-methods study of a 574-bed general hospital in Singapore with a 74-bed ICU for surgical and medical patients. Six focus group discussions (FGDs) with 34 clinicians comprising 15 residents and 19 nurses, followed by a structured questionnaire survey of 140 clinicians …


Application Of Sentiment Analysis And Machine Learning Techniques To Predict Daily Cryptocurrency Price Returns, Edward Wu 2023 Claremont Colleges

CMC Senior Theses

This paper examines the effects of social media sentiment relating to Bitcoin on the daily price returns of Bitcoin and other popular cryptocurrencies by utilizing sentiment analysis and machine learning techniques to predict daily price returns. Many investors think that social media sentiment affects cryptocurrency prices. However, the results of this paper find that social media sentiment relating to Bitcoin does not add significant predictive value to forecasting daily price returns for each of the six cryptocurrencies used for analysis and that machine learning models that do not assume linearity between the current day price return and previous daily price …


Health Care Equity Through Intelligent Edge Computing And Augmented Reality/Virtual Reality: A Systematic Review, Vishal Lakshminarayanan, Aswathy Ravikumar, Harini Sriraman, Sujatha Alla, Vijay Kumar Chattu 2023 Vellore Institute of Technology

Engineering Management & Systems Engineering Faculty Publications

Intellectual capital is a scarce resource in the healthcare industry. Making the most of this resource is the first step toward achieving a completely intelligent healthcare system. However, most existing centralized and deep learning-based systems are unable to adapt to the growing volume of global health records and face application issues. To balance the scarcity of healthcare resources, the emerging trend of IoMT (Internet of Medical Things) and edge computing will be very practical and cost-effective. A full examination of the transformational role of intelligent edge computing in the IoMT era to attain health care equity is offered in this …


Utilizing Machine Learning In Healthcare In An Ethical Fashion, Nishka Ayyar 2023 Claremont Colleges

CMC Senior Theses

This thesis paper explores the ethical considerations surrounding the use of machine learning (ML) solutions in healthcare. The background section discusses the basics of machine learning techniques and algorithms, and the increasing interest in their utilization in the healthcare sector. The paper then reviews and critically analyzes four studies that highlight concerns related to using ML in healthcare, including issues of bias, privacy, accountability, and transparency. Based on the analysis of these studies, the paper presents several recommendations for addressing these concerns. The paper concludes with a discussion on the potential benefits of using machine learning technology in healthcare. Ultimately, …


Invasive Buckthorn Mapping: A Uav-Based Approach Utilizing Machine Learning, Gis, And Remote Sensing Techniques In The Upper Peninsula Of Michigan, Vikranth Madeppa 2023 Michigan Technological University

Dissertations, Master's Theses and Master's Reports

An invasive species is a species that is alien or non-native to the ecosystem and that causes economic or environmental harm or harm to human health (E.O. 13112 of Feb 3, 1999). Invasive species pose a serious threat to ecosystems across the globe and affect the biodiversity and productivity of invaded forests. Remotely sensed data is a valuable resource for understanding and addressing issues related to invasive species. This study presents a novel approach for mapping the distribution of two invasive plant species, Common and Glossy Buckthorn, using unmanned aerial vehicles (UAVs), machine learning algorithms, geographic information systems …


Thinking Local With Original Data In Ai And Machine Learning Research, David G. Taylor, Robert McCloud 2023 Sacred Heart University

WCBT Working Papers

Sacred Heart University spent significant funds to establish an AI lab. Initially, there was no ongoing research and no real plan for a research agenda. This paper details how the Jack Welch College of Business and Technology created and implemented an active, meaningful research plan. It involves two key elements: thinking local and using business connections to foster active, impactful research. Surrounding communities, business connections, area environment, and other Sacred Heart University departments all played a part. The research plan also identifies a specific issue in working with local and business contact sources: the AI researcher almost never gets data …


A Study On Global Reef Deterioration: Exploring Coral Bleaching, Emily Fernandez 2023 Claremont Colleges

CMC Senior Theses

This thesis is a study on coral bleaching and coral mortality, examining how variables such as depth, exposure, distance to shore, and temperature relate to percent bleaching. All of the analyses were made using two different data sets that contain information about bleaching events in specific regions and on specific dates, and that provide information on factors such as depth, temperature, and exposure. Models were created for different relationships of variables for eco-regions, recent data, and countries. I attempted to find relationships between variables such as depth, temperature, exposure, and distance to shore, and how they affect coral bleaching. Unfortunately, I did not …


Maximizing Productivity And Quality In Senior Thesis Writing With Artificial Intelligence And Natural Language Processing Driven Tools, Lauren Leadbetter 2023 Claremont Colleges

CMC Senior Theses

This project is a Python program designed to generate a senior thesis on a user-inputted topic using natural language processing techniques. The program takes in a topic from the user and then uses the OpenAI API to deploy text models for text generation and evaluation, such as GPT-3 and Davinci-003. The resulting output is in .tex format and includes a first-draft outline and paper, followed by self-generated assessment, with scoring, revisions, and feedback comments instructing manual revisions.

This submission is a sample using one available model of the project, meant to demonstrate its functionality and limitations. Further model versions …


Blockchain And Puf-Based Secure Key Establishment Protocol For Cross-Domain Digital Twins In Industrial Internet Of Things Architecture, Khalid Mahmood, Salman Shamshad, Muhammad Asad Saleem, Rupak Kharel, Ashok Kumar Das, Sachin Shetty, Joel J. P. C. Rodrigues 2023 University of Central Lancashire

VMASC Publications

Introduction: The Industrial Internet of Things (IIoT) is a technology that connects devices to collect data and conduct in-depth analysis to provide value-added services to industries. The integration of the physical and digital domains is crucial for unlocking the full potential of the IIoT, and digital twins can facilitate this integration by providing a virtual representation of real-world entities.

Objectives: By combining digital twins with the IIoT, industries can simulate, predict, and control physical behaviors, enabling them to achieve broader value and support industry 4.0 and 5.0. Constituents of cooperative IIoT domains tend to interact and collaborate during their complicated …


Assessing Univariate And Multivariate Normality In Pls-Sem, Kathy Qing Ma, Weiyong Zhang 2023 Texas A&M International University

Information Technology & Decision Sciences Faculty Publications

Partial least squares structural equation modeling (PLS-SEM) has gained popularity among researchers in part due to its relaxed requirement for multivariate normality. One important step in performing structural equation modeling (SEM) is to test the normality assumption. In this paper, we illustrate how to assess univariate and multivariate normality in PLS-SEM using WarpPLS.


Assessing Spurious Correlations In Big Search Data, Jesse T. Richman, Ryan J. Roberts 2023 Old Dominion University

Political Science & Geography Faculty Publications

Big search data offers the opportunity to identify new and potentially real-time measures and predictors of important political, geographic, social, cultural, economic, and epidemiological phenomena, measures that might serve an important role as leading indicators in forecasts and nowcasts. However, it also presents vast new risks that scientists or the public will identify meaningless and totally spurious ‘relationships’ between variables. This study is the first to quantify that risk in the context of search data. We find that spurious correlations arise at exceptionally high frequencies among probability distributions examined for random variables based upon gamma (1, 1) and Gaussian random …


Practical Ai Value Alignment Using Stories, Md Sultan Al Nahian 2023 University of Kentucky

Theses and Dissertations--Computer Science

As more machine learning agents interact with humans, it is increasingly likely that an agent trained to perform a task optimally, using only a measure of task performance as feedback, can violate societal norms for acceptable behavior or cause harm. Consequently, it becomes necessary to prioritize task performance and ensure that AI actions do not have detrimental effects. Value alignment is a property of intelligent agents, wherein they solely pursue goals and activities that are non-harmful and beneficial to humans. Current approaches to value alignment largely depend on imitation learning or learning from demonstration methods. However, the dynamic nature …


Distributed Spatial Data Sharing: A New Era In Sharing Spatial Data, Majid Hojati 2023 Wilfrid Laurier University

Theses and Dissertations (Comprehensive)

The advancements in information and communications technology, including the widespread adoption of GPS-based sensors, improvements in computational data processing, and satellite imagery, have resulted in new data sources, stakeholders, and methods of producing, using, and sharing spatial data. Daily, vast amounts of data are produced by individuals interacting with digital content and through automated and semi-automated sensors deployed across the environment. A growing portion of this information contains geographic information directly or indirectly embedded within it. The widespread use of automated smart sensors and an increased variety of georeferenced media resulted in new individual data collectors. This raises a new …


Dynamic Predictions Of Thermal Heating And Cooling Of Silicon Wafer, Hitesh Kumar 2023 San Jose State University

Master's Projects

Neural networks are now emerging in every industry. Industries are trying to exploit the benefits of neural networks and deep learning to make predictions or simulate their ongoing processes using the data they generate. The purpose of this report is to study the heating pattern of a silicon wafer and make predictions using various machine learning techniques. The heating of the silicon wafer involves various factors, ranging from the number of lamps and the wafer properties to the points taken into consideration to capture the heating temperature. This process involves dynamic inputs which facilitate the heating of the …

