Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Data mining

Articles 1 - 30 of 395

Full-Text Articles in Physical Sciences and Mathematics

Using Data Mining To Analyze Job Reviews, Nicholas Bornkamp, Tony Breitzman Apr 2024

STEM Student Research Symposium Posters

Job review websites like Glassdoor are not always clear on how well a company operates, especially as viewed from differing levels of employment. For instance, a middle or upper manager from Amazon may have an overall positive review of the company with minor issues about it, but someone who works in the warehouse may have a mixed experience. To address this issue and determine any correlation between employee level and review, data mining techniques such as website scraping and neural network training were used to develop a model that analyzes employee reviews.


Spatio-Temporal Association Rule Mining Of Traffic Congestion In A Large-Scale Road Network Based On Trajectory Data, Qifan Zhou, Haixu Liu, Zhipeng Dong, Yin Xu Jan 2024

Journal of System Simulation

Abstract: A K neighbor-RElim (KNR) algorithm and a sequential KNbr-RElim (SKNR) algorithm are proposed to mine traffic congestion association rules and congestion propagation spatio-temporal association rules from vehicle trajectory data in a large-scale road network. The KNR algorithm extends the spatial topology constraint based on the RElim algorithm. The KNR can be used to mine the road links prone to congestion from the large-scale trajectory dataset in a large-scale road network and quantify the strength of association for congested road links. The SKNR algorithm expands the time dimension in the form of a sliding window and can be applied for mining …


ConceptThread: Visualizing Threaded Concepts In MOOC Videos, Zhiguang Zhou, Li Ye, Lihong Cai, Lei Wang, Yigang Wang, Yongheng Wang, Wei Chen, Yong Wang Jan 2024

Research Collection School Of Computing and Information Systems

Massive Open Online Course (MOOC) platforms have become increasingly popular in recent years. Online learners need to watch the whole course video on MOOC platforms to learn the underlying new knowledge, which is often tedious and time-consuming due to the lack of a quick overview of the covered knowledge and its structure. In this paper, we propose ConceptThread, a visual analytics approach to effectively show the concepts and the relations among them to facilitate effective online learning. Specifically, given that the majority of MOOC videos contain slides, we first leverage video processing and speech analysis techniques, including shot recognition, …


Impact Of Weather Factors On Airport Arrival Rates: Application Of Machine Learning In Air Transportation, Robert W. Maxson, Dothang Truong, Woojin Choi Dec 2023

Publications

Weather is responsible for approximately 70% of air transportation delays in the National Airspace System, and delays resulting from convective weather alone cost airlines and passengers millions of dollars each year due to delays that could be avoided. This research sought to establish relationships between environmental variables and airport efficiency estimates by data mining archived weather and airport performance data at ten geographically and climatologically different airports. Several meaningful relationships were discovered from six out of ten airports using various machine learning methods within an overarching data mining protocol, and the developed models were tested using historical data.


On The Effect Of Emotion Identification From Limited Translated Text Samples Using Computational Intelligence, Madiha Tahir, Zahid Halim, Muhmmad Waqas, Shanshan Tu Dec 2023

Research outputs 2022 to 2026

Emotion identification from text data has recently gained the attention of the research community. This has multiple utilities in an assortment of domains. Many times, the original text is written in a different language and the end-user translates it to her native language using online utilities. Therefore, this paper presents a framework to detect emotions on translated text data in four different languages. The source language is English, whereas the four target languages include Chinese, French, German, and Spanish. Computational intelligence (CI) techniques are applied for feature extraction, dimensionality reduction, and classification of data into five basic classes of emotions. Results …


Predictive Analysis Of Students’ Learning Performance Using Data Mining Techniques: A Comparative Study Of Feature Selection Methods, S. M. F. D. Syed Mustapha Sep 2023

All Works

The utilization of data mining techniques for the prompt prediction of academic success has gained significant importance in the current era. There is an increasing interest in utilizing these methodologies to forecast the academic performance of students, thereby facilitating educators to intervene and furnish suitable assistance when required. The purpose of this study was to determine the optimal methods for feature engineering and selection in the context of regression and classification tasks. This study compared the Boruta algorithm and Lasso regression for regression, and Recursive Feature Elimination (RFE) and Random Forest Importance (RFI) for classification. According to the findings, Gradient …


Cannabidiol Tweet Miner: A Framework For Identifying Misinformation In CBD Tweets., Jason Turner Aug 2023

Electronic Theses and Dissertations

As regulations surrounding cannabis continue to develop, the demand for cannabis-based products is on the rise. Despite not producing the psychoactive effects commonly associated with THC, products containing cannabidiol (CBD) have gained immense popularity in recent years as a potential treatment option for a range of conditions, particularly those associated with pain or sleep disorders. However, due to current federal policies, these products have yet to undergo comprehensive safety and efficacy testing. Fortunately, utilizing advanced natural language processing (NLP) techniques, data harvested from social networks have been employed to investigate various social trends within healthcare, such as disease tracking and …


Dense & Attention Convolutional Neural Networks For Toe Walking Recognition, Junde Chen, Rahul Soangra, Marybeth Grant-Beuttler, Y. A. Nanehkaran, Yuxin Wen May 2023

Physical Therapy Faculty Articles and Research

Idiopathic toe walking (ITW) is a gait disorder where children’s initial contacts show limited or no heel touch during the gait cycle. Toe walking can lead to poor balance, increased risk of falling or tripping, leg pain, and stunted growth in children. Early detection and identification can facilitate targeted interventions for children diagnosed with ITW. This study proposes a new one-dimensional (1D) Dense & Attention convolutional network architecture, termed DANet, to detect idiopathic toe walking. The dense block is integrated into the network to maximize information transfer and avoid missed features. Further, the attention modules are …


Analyzing Syntactic Constructs Of Java Programs With Machine Learning, Francisco Ortin, Guillermo Facundo, Miguel Garcia Apr 2023

Department of Computer Science Publications

The massive number of open-source projects in public repositories has notably increased in recent years. Such repositories represent valuable information to be mined for different purposes, such as documenting recurrent syntactic constructs, analyzing the particular constructs used by experts and beginners, using them to teach programming and to detect bad programming practices, and building programming tools such as decompilers, Integrated Development Environments or Intelligent Tutoring Systems. An inherent problem of source code is that its syntactic information is represented with tree structures, while traditional machine learning algorithms use n-dimensional datasets. Therefore, we present a feature engineering process to translate …


Learning Relation Prototype From Unlabeled Texts For Long-Tail Relation Extraction, Yixin Cao, Jun Kuang, Ming Gao, Aoying Zhou, Yonggang Wen, Tat-Seng Chua Feb 2023

Research Collection School Of Computing and Information Systems

Relation Extraction (RE) is a vital step to complete Knowledge Graph (KG) by extracting entity relations from texts. However, it usually suffers from the long-tail issue. The training data mainly concentrates on a few types of relations, leading to the lack of sufficient annotations for the remaining types of relations. In this paper, we propose a general approach to learn relation prototypes from unlabeled texts, to facilitate the long-tail relation extraction by transferring knowledge from the relation types with sufficient training data. We learn relation prototypes as an implicit factor between entities, which reflects the meanings of relations as well …


Dashboard Design Mining And Recommendation, Yanna Lin, Haotian Li, Aoyu Wu, Yong Wang, Huamin Qu Jan 2023

Research Collection School Of Computing and Information Systems

Dashboards, which comprise multiple views on a single display, help analyze and communicate multiple perspectives of data simultaneously. However, creating effective and elegant dashboards is challenging since it requires careful and logical arrangement and coordination of multiple visualizations. To solve the problem, we propose a data-driven approach for mining design rules from dashboards and automating dashboard organization. Specifically, we focus on two prominent aspects of the organization: arrangement, which describes the position, size, and layout of each view in the display space; and coordination, which indicates the interaction between pairwise views. We build a new dataset containing 854 dashboards crawled online, …


Campus Safety Data Gathering, Classification, And Ranking Based On Clery-Act Reports, Walaa F. Abo Elenin Jan 2023

Electronic Theses and Dissertations

Most existing campus safety rankings are based on criminal incident history with minimal or no consideration of campus security conditions and standard safety measures. Campus safety information published by universities/colleges is usually conceptual/qualitative rather than quantitative and is based on the criminal records of these campuses. Thus, no explicit and trusted ranking method for these campuses considers the level of compliance with the standard safety measures. A quantitative safety measure is important to compare different campuses easily and to learn about specific campus safety conditions.

In this thesis, we utilize Clery-Act reports of campuses to automatically analyze their safety conditions and generate …


Mitigating Popularity Bias In Recommendation With Unbalanced Interactions: A Gradient Perspective, Weijieying Ren, Lei Wang, Kunpeng Liu, Ruocheng Guo, Ee-Peng Lim, Yanjie Fu Dec 2022

Research Collection School Of Computing and Information Systems

Recommender systems learn from historical user-item interactions to identify preferred items for target users. These observed interactions are usually unbalanced, following a long-tailed distribution. Such long-tailed data lead to popularity bias, in which popular but not personalized items are recommended to users. We present a gradient perspective to understand two negative impacts of popularity bias in recommendation model optimization: (i) the gradient direction of popular item embeddings is closer to that of positive interactions, and (ii) the magnitude of the positive gradient for popular items is much greater than that of unpopular items. To address these issues, we propose a simple yet efficient …


Hybrid Feature Selection Based On Principal Component Analysis And Grey Wolf Optimizer Algorithm For Arabic News Article Classification, Osama Ahmad Alomari, Ashraf Elnagar, Imad Afyouni, Ismail Shahin, Ali Bou Nassif, Ibrahim Abaker Hashem, Mohammad Tubishat Nov 2022

All Works

The rapid growth of electronic documents has resulted from the expansion and development of internet technologies. Text-document classification is a key task in natural language processing that converts unstructured data into structured form and then extracts knowledge from it. This conversion generates high-dimensional data that needs further analysis using data mining techniques like feature extraction, feature selection, and classification to derive meaningful insights from the data. Feature selection is a technique used for reducing dimensionality in order to prune the feature space and, as a result, lower the computational cost and enhance classification accuracy. This work presents a …


An Empirical Study Of Blockchain System Vulnerabilities: Modules, Types, And Patterns, Xiao Yi, Daoyuan Wu, Lingxiao Jiang, Yuzhou Fang, Kehuan Zhang, Wei Zhang Nov 2022

Research Collection School Of Computing and Information Systems

Blockchain, as a distributed ledger technology, has become increasingly popular, especially for enabling valuable cryptocurrencies and smart contracts. However, blockchain software systems inevitably have many bugs. Although bugs in smart contracts have been extensively investigated, security bugs of the underlying blockchain systems are much less explored. In this paper, we conduct an empirical study on blockchain system vulnerabilities from four representative blockchains: Bitcoin, Ethereum, Monero, and Stellar. Specifically, we first design a systematic filtering process to effectively identify 1,037 vulnerabilities and their 2,317 patches from 34,245 issues/PRs (pull requests) and 85,164 commits on GitHub. We thus build the first blockchain …


Analyzing The Production And Use Of Fossil Fuels: A Case For Data Mining And GIS, Alejandro Conde Oct 2022

Geography and the Environment: Graduate Student Capstones

As technology progresses and data grows both larger and more complex, techniques are being developed to keep up with the exponential growth of information. The term “data mining” is a blanket term used to describe an approach to find anomalies and correlations in a large dataset. This approach involves leveraging data mining software to manipulate and prepare data, apply statistics to quantify trends and characteristics in the data from a high level, and potentially apply advanced techniques like machine learning to identify patterns that wouldn’t be apparent otherwise. In this case study, data mining aided a GIS in displaying substantial …


Exploiting Reuse For GPU Subgraph Enumeration, Wentiao Guo, Yuchen Li, Kian-Lee Tan Sep 2022

Research Collection School Of Computing and Information Systems

Subgraph enumeration is important for many applications such as network motif discovery, community detection, and frequent subgraph mining. To accelerate the execution, recent works utilize graphics processing units (GPUs) to parallelize subgraph enumeration. The performance of these parallel schemes is dominated by the set intersection operations, which account for up to 95% of the total processing time. (Un)surprisingly, a significant portion (as high as 99%) of these operations is actually redundant, i.e., the same set of vertices is repeatedly encountered and evaluated. Therefore, in this paper, we seek to salvage and recycle the results of such operations to avoid repeated …


Using Deep Learning To Detect Social Media ‘Trolls’, Áine Macdermott, Michal Motylinski, Farkhund Iqbal, Kellyann Stamp, Mohammed Hussain, Andrew Marrington Sep 2022

All Works

Detecting criminal activity online is not a new concept, but how it can occur is changing. Technology and the influx of social media applications and platforms have a vital part to play in this changing landscape. As such, we observe an increasing problem with cyber abuse and ‘trolling’/toxicity amongst social media platforms that share stories, posts, and memes. In this paper we present our work into the application of deep learning techniques for the detection of ‘trolls’ and toxic content shared on social media platforms. We propose a machine learning solution for the detection of toxic images based on embedded …


Generative Methods, Meta-Learning, And Meta-Heuristics For Robust Cyber Defense, Marc W. Chale Sep 2022

Theses and Dissertations

Cyberspace is the digital communications network that supports the internet of battlefield things (IoBT), the model by which defense-centric sensors, computers, actuators and humans are digitally connected. A secure IoBT infrastructure facilitates real time implementation of the observe, orient, decide, act (OODA) loop across distributed subsystems. Successful hacking efforts by cyber criminals and strategic adversaries suggest that cyber systems such as the IoBT are not secure. Three lines of effort demonstrate a path towards a more robust IoBT. First, a baseline data set of enterprise cyber network traffic was collected and modelled with generative methods allowing the generation of realistic, …


Design And Analysis Of Strategic Behavior In Networks, Sixie Yu Aug 2022

McKelvey School of Engineering Theses & Dissertations

Networks permeate every aspect of our social and professional life. A networked system with strategic individuals can represent a variety of real-world scenarios with socioeconomic origins. In such a system, the individuals' utilities are interdependent: one individual's decision influences the decisions of others and vice versa. In order to gain insights into the system, the highly complicated interactions necessitate some level of abstraction. To capture the otherwise complex interactions, I use a game theoretic model called Networked Public Goods (NPG) game. I develop a computational framework based on NPGs to understand strategic individuals' behavior in networked systems. The framework consists of three …


Solving The Challenges Of Concept Drift In Data Stream Classification., Hanqing Hu Aug 2022

Electronic Theses and Dissertations

The rise of network-connected devices and applications leads to a significant increase in the volume of data that are continuously generated over time, called data streams. In real-world applications, storing the entirety of a data stream for later analysis is often not practical, due to the data stream’s potentially infinite volume. Data stream mining techniques and frameworks are therefore created to analyze streaming data as they arrive. However, compared to traditional data mining techniques, challenges unique to data stream mining also emerge, due to the high arrival rate of data streams and their dynamic nature. In this dissertation, …


Innovative Heuristics To Improve The Latent Dirichlet Allocation Methodology For Textual Analysis And A New Modernized Topic Modeling Approach, Jamie T. Zimmerman Jun 2022

Theses and Dissertations

Natural Language Processing is a complex method of data mining the vast trove of documents created and made available every day. Topic modeling seeks to identify the topics within textual corpora with limited human input into the process to speed analysis. Current topic modeling techniques used in Natural Language Processing have limitations in the pre-processing steps. This dissertation studies topic modeling techniques, those limitations in the pre-processing, and introduces new algorithms to gain improvements from existing topic modeling techniques while being competitive with computational complexity. This research introduces four contributions to the field of Natural Language Processing and topic modeling. …


A Remote Sensing And Machine Learning-Based Approach To Forecast The Onset Of Harmful Algal Bloom (Red Tides), Moein Izadi Apr 2022

Dissertations

In the last few decades, harmful algal blooms (HABs, also known as “red tides”) have become one of the most detrimental natural phenomena all around the world, especially in Florida’s coastal areas, due to local environmental factors and, on a larger scale, global warming. Karenia brevis produces toxins that have harmful effects on humans, fisheries, and ecosystems. In this study, I developed and compared the efficiency of state-of-the-art machine learning models (e.g., XGBoost, Random Forest, and Support Vector Machine) in predicting the occurrence of HABs. In the proposed models, the K. brevis abundance is used as the target, and 10 …


Telemetry Data Mining For Unmanned Aircraft Systems, Li Yu Mar 2022

Theses and Dissertations

With ever more data becoming available to the US Air Force, it is vital to develop effective methods to leverage this strategic asset. Machine learning (ML) techniques present a means of meeting this challenge, as these tools have demonstrated successful use in commercial applications. For this research, three ML methods were applied to an unmanned aircraft system (UAS) telemetry dataset with the aim of extracting useful insight related to phases of flight. It was shown that ML provides an advantage in exploratory data analysis as well as classification of phases. Neural network models demonstrated the best performance with over …


Constructing Prediction Intervals With Neural Networks: An Empirical Evaluation Of Bootstrapping And Conformal Inference Methods, Alexander N. Contarino Mar 2022

Theses and Dissertations

Artificial neural networks (ANNs) are popular tools for accomplishing many machine learning tasks, including predicting continuous outcomes. However, the general lack of confidence measures provided with ANN predictions limits their applicability, especially in military settings where accuracy is paramount. Supplementing point predictions with prediction intervals (PIs) is common for other learning algorithms, but the complex structure and training of ANNs render constructing PIs difficult. This work provides the network design choices and inferential methods for creating better performing PIs with ANNs to enable their adaptation for military use. A two-step experiment is executed across 11 datasets, including an image-based dataset. …


Design Demand Trend Acquisition Method Based On Short Text Mining Of User Comments In Shopping Websites, Zhiyong Xiong, Zhaoxiong Yan, Huanan Yao, Shangsong Liang Feb 2022

Machine Learning Faculty Publications

To help designers explore the market demand trend of laptops and to establish a better “network users-market feedback mechanism”, we propose a design and research method of a short text mining tool based on the K-means clustering algorithm and the Kano model. An improved short text clustering algorithm is used to extract the design elements of laptops. Based on the traditional questionnaire, we extract the user’s attention factors, score the emotional tendency, and analyze the user’s needs based on the Kano model. Then, we select 10 laptops, process them by the improved algorithm, cluster the evaluation words and …


SubOmiEmbed: Self-Supervised Representation Learning Of Multi-Omics Data For Cancer Type Classification, Sayed Hashim, Muhammad Ali, Karthik Nandakumar, Mohammad Yaqub Feb 2022

Computer Vision Faculty Publications

For personalized medicine, crucial intrinsic information is present in high-dimensional omics data, which is difficult to capture due to the large number of molecular features and small number of available samples. Different types of omics data show various aspects of samples. Integration and analysis of multi-omics data give us a broad view of tumours, which can improve clinical decision making. Omics data, mainly DNA methylation and gene expression profiles, are usually high-dimensional data with a lot of molecular features. In recent years, variational autoencoders (VAE) [13] have been extensively used in embedding image and text data into …


Data Science Applied To Discover Ancient Minoan-Indus Valley Trade Routes Implied By Common Weight Measures, Peter Revesz Jan 2022

CSE Conference and Workshop Papers

This paper applies data mining of weight measures to discover possible long-distance trade routes among Bronze Age civilizations from the Mediterranean area to India. As a result, a new northern route via the Black Sea is discovered between the Minoan and the Indus Valley civilizations. This discovery enhances the growing set of evidence for a strong and vibrant connection among Bronze Age civilizations.


Framework For The Evaluation Of Perturbations In The Systems Biology Landscape And Inter-Sample Similarity From Transcriptomic Datasets — A Digital Twin Perspective, Mariah Marie Hoffman Jan 2022

Dissertations and Theses

One approach to interrogating the complexities of human systems in their well-regulated and dysregulated states is through the use of digital twins. Digital twins are virtual representations of physical systems that are descriptive of an individual's state of health, an object fundamentally related to precision medicine. A key element for building a functional digital twin type for a disease or predicting the therapeutic efficacy of a potential treatment is harmonized, machine-parsable domain knowledge. Hypothesis-driven investigations are the gold standard for representing subsystems, but their results encompass a limited knowledge of the full biosystem. Multi-omics data is one rich source of …


Impact Of Sleep And Training On Game Performance And Injury In Division-1 Women’s Basketball Amidst The Pandemic, Samah Senbel, S. Sharma, S. M. Raval, Christopher B. Taber, Julie K. Nolan, N. S. Artan, Diala Ezzeddine, Kaya Tolga Jan 2022

School of Computer Science & Engineering Faculty Publications

We investigated the impact of sleep and training load of Division-1 women’s basketball players on their game performance and injury prediction using machine learning algorithms. The data was collected during a pandemic-condensed season with unpredictable interruptions to the games and athletic training schedules. We collected data from sleep monitoring devices, training data from coaches, injury reports from medical staff, and weekly survey data from athletes for 22 weeks. With proper data imputation, an interpretable feature set, data balancing, and classifiers, we showed that we could predict game performance and injuries with more than 90% accuracy. More importantly, our F1 and …