Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Machine learning

Articles 1 - 30 of 1519

Full-Text Articles in Physical Sciences and Mathematics

Sustainable Energysense: A Predictive Machine Learning Framework For Optimizing Residential Electricity Consumption, Murad Al-Rajab, Samia Loucif Dec 2024

All Works

In a world where electricity is often taken for granted, the surge in consumption poses significant challenges, including elevated CO2 emissions and rising prices. These issues not only impact consumers but also have broader implications for the global environment. This paper endeavors to propose a smart application dedicated to optimizing the electricity consumption of household appliances. It employs Augmented Reality (AR) technology along with YOLO to detect electrical appliances and provide detailed electricity consumption insights, such as displaying the appliance consumption rate and computing the total electricity consumption based on the number of hours the appliance was used. The application …
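The consumption calculation mentioned in this abstract (appliance consumption rate combined with hours of use) can be illustrated with a minimal sketch. This is not the authors' application: the function name, the appliance names, and the wattage and hour figures are all hypothetical, and the sketch assumes the consumption rate is a constant power draw in watts.

```python
def total_consumption_kwh(rate_watts: float, hours_used: float) -> float:
    """Energy in kWh for an appliance drawing rate_watts for hours_used hours."""
    if rate_watts < 0 or hours_used < 0:
        raise ValueError("rate and hours must be non-negative")
    # watts * hours gives watt-hours; divide by 1000 to get kilowatt-hours
    return rate_watts * hours_used / 1000.0


# Hypothetical appliances: (consumption rate in watts, hours used)
usage = {
    "kettle": (2000.0, 0.5),
    "refrigerator": (150.0, 24.0),
}

# Per-appliance energy, then the household total
per_appliance = {name: total_consumption_kwh(w, h) for name, (w, h) in usage.items()}
household_total = sum(per_appliance.values())
```

A real appliance rarely draws constant power, so a deployed system would more plausibly integrate measured power over time.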


Comparative Analysis Of Surrogate Models For The Dissolution Of Spent Nuclear Fuel, Dayo Awe May 2024

Electronic Theses and Dissertations

This thesis presents a comparative analysis of surrogate models for the dissolution of spent nuclear fuel, with a focus on the use of deep learning techniques. The study explores the accuracy and efficiency of different machine learning methods in predicting the dissolution behavior of nuclear waste, and compares them to traditional modeling approaches. The results show that deep learning models can achieve high accuracy in predicting the dissolution rate, while also being computationally efficient. The study also discusses the potential applications of surrogate modeling in the field of nuclear waste management, including the optimization of waste disposal strategies and the …


Accurate Characterization Of Binding Kinetics And Allosteric Mechanisms For The HSP90 Chaperone Inhibitors Using AI-Augmented Integrative Biophysical Studies, Chao Xu, Xianglei Zhang, Lianghao Zhao, Gennady M. Verkhivker, Fang Bai Apr 2024

Mathematics, Physics, and Computer Science Faculty Articles and Research

The binding kinetics of drugs to their targets are gradually being recognized as a crucial indicator of the efficacy of drugs in vivo, leading to the development of various computational methods for predicting the binding kinetics in recent years. However, compared with the prediction of binding affinity, the underlying structure and dynamic determinants of binding kinetics are more complicated. Efficient and accurate methods for predicting binding kinetics are still lacking. In this study, quantitative structure–kinetics relationship (QSKR) models were developed using 132 inhibitors targeting the ATP binding domain of heat shock protein 90α (HSP90α) to predict the dissociation rate …


Multi-Aspect Rule-Based AI: Methods, Taxonomy, Challenges And Directions Towards Automation, Intelligence And Transparent Cybersecurity Modeling For Critical Infrastructures, Iqbal H. Sarker, Helge Janicke, Mohamed A. Ferrag, Alsharif Abuadbba Apr 2024

Research outputs 2022 to 2026

Critical infrastructure (CI) typically refers to the essential physical and virtual systems, assets, and services that are vital for the functioning and well-being of a society, economy, or nation. However, the rapid proliferation and dynamism of today's cyber threats in digital environments may disrupt CI functionalities, which would have a debilitating impact on public safety, economic stability, and national security. This has led to much interest in effective cybersecurity solutions regarding automation and intelligent decision-making, where AI-based modeling is potentially significant. In this paper, we take into account “Rule-based AI” rather than other black-box solutions since model transparency, i.e., human …


Enhancing Landslide Susceptibility Modelling Through A Novel Non-Landslide Sampling Method And Ensemble Learning Technique, Chao Zhou, Yue Wang, Ying Cao, Ramesh P. Singh, Bayes Ahmed, Mahdi Motagh, Yang Wang, Ling Chen, Guangchao Tan, Shanshan Li Mar 2024

Mathematics, Physics, and Computer Science Faculty Articles and Research

In recent years, several catastrophic landslide events have been observed throughout the globe, threatening lives and infrastructure. To minimize the impact of landslides, landslide susceptibility maps are needed. The study aims to extract high-quality non-landslide samples and improve the accuracy of landslide susceptibility modelling (LSM) outcomes by applying a coupled method of ensemble learning and Machine Learning (ML). The Zigui-Badong section of the Three Gorges Reservoir area (TGRA) in China was considered in the present study. Twelve influencing factors were selected as inputs for LSM, and the relationship between each causal factor and landslide spatial development …


Automated Identification And Mapping Of Interesting Mineral Spectra In CRISM Images, Arun M. Saranathan Mar 2024

Doctoral Dissertations

The Compact Reconnaissance Imaging Spectrometer for Mars (CRISM) has proven to be an invaluable tool for the mineralogical analysis of the Martian surface. It has been crucial in identifying and mapping the spatial extents of various minerals. Primarily, the identification and mapping of these mineral spectral-shapes have been performed manually. Given the size of the CRISM image dataset, manual analysis of the full dataset would be arduous or infeasible. This dissertation attempts to address this issue by describing an automated, machine-learning-based processing pipeline for CRISM data that can be used to identify and map the unique mineral signatures present in …


Data To Science With AI And Human-In-The-Loop, Gustavo Perez Sarabia Mar 2024

Doctoral Dissertations

AI has the potential to accelerate scientific discovery by enabling scientists to analyze vast datasets more efficiently than traditional methods. For example, this thesis considers the detection of star clusters in high-resolution images of galaxies taken from space telescopes, as well as studying bird migration from RADAR images. In these applications, the goal is to make measurements to answer scientific questions, such as how the star formation rate is affected by mass, or how the phenology of bird migration is influenced by climate change. However, current computer vision systems are far from perfect for conducting these measurements directly. They may …


Preserving Linguistic Diversity In The Digital Age: A Scalable Model For Cultural Heritage Continuity, James Hutson, Pace Ellsworth, Matt Ellsworth Mar 2024

Faculty Scholarship

In the face of the rapid erosion of both tangible and intangible cultural heritage globally, the urgency for effective, wide-ranging preservation methods has never been greater. Traditional approaches in cultural preservation often focus narrowly on specific niches, overlooking the broader cultural tapestry, particularly the preservation of everyday cultural elements. This article addresses this critical gap by advocating for a comprehensive, scalable model for cultural preservation that leverages machine learning and big data analytics. This model aims to document and archive a diverse range of cultural artifacts, encompassing both extraordinary and mundane aspects of heritage. A central issue highlighted in the …


Early Warning And Prediction Of Kicks And Lost Circulation Accident During Rescue Drilling Of Mine, Chen Weiming, Wang Jiawen, Fan Dong, Hao Shijun, Zhao Jiangpeng, Qiu Yu Mar 2024

Coal Geology & Exploration

To address the difficulty of early warning and prediction of kicks and lost circulation accidents during emergency rescue drilling in mines, a machine learning-based early warning and prediction model of the drilling process was established. Firstly, the accident characterization parameters of the drilling parameters in the early stage of kicks and lost circulation accidents were analyzed. Secondly, the accident characterization parameters were cleaned and processed. On this basis, an XGBoost early warning model was used to carry out the early diagnosis and identification of kicks and lost circulation accidents. Then, the PSO-LSTM accident development …


Artificial Intelligence Usage And Data Privacy Discoveries Within mHealth, Jennifer Schulte Mar 2024

Faculty Research & Publications

Advancements in artificial intelligence continue to impact nearly every aspect of human life by providing integration options that aim to supplement or improve current processes. One industry that continues to benefit from artificial intelligence integration is healthcare. For years now, elements of artificial intelligence have been used to assist in clinical decision making, helping to identify potential health risks at earlier stages, and supplementing precision medicine. An area of healthcare that specifically looks at wearable devices, sensors, phone applications, and other such devices is mobile health (mHealth). These devices are used to aid in health data collection and delivery. This …


Using ChatGPT To Generate Gendered Language, Shweta Soundararajan, Manuela Nayantara Jeyaraj, Sarah Jane Delany Mar 2024

Conference papers

Gendered language is the use of words that denote an individual's gender. This can be explicit where the gender is evident in the actual word used, e.g. mother, she, man, but it can also be implicit where social roles or behaviours can signal an individual's gender - for example, expectations that women display communal traits (e.g., affectionate, caring, gentle) and men display agentic traits (e.g., assertive, competitive, decisive). The use of gendered language in NLP systems can perpetuate gender stereotypes and bias. This paper proposes an approach to generating gendered language datasets using ChatGPT which will provide data for data-driven …


Preprocessing Of Astronomical Images From The NEOWISE Survey For Near-Earth Asteroid Detection With Machine Learning, Rachel Meyer Mar 2024

ELAIA

Asteroid detection is a common field in astronomy for planetary defense, requiring observations from survey telescopes to detect and classify different objects. The amount of data collected each night is continually increasing as new and better-designed telescopes begin collecting information each year. This amount of data is quickly becoming unmanageable, and researchers are looking for ways to better process this data. The most feasible current solution is to implement computer algorithms to automatically detect these sources and then use machine learning to create a more efficient and accurate method of classification. Implementation of such methods has previously focused on larger …


Gt-Ches And Dycon: Improved Classification For Human Evolutionary Systems, Joseph S. Johnson Mar 2024

Theses and Dissertations

The purpose of this work is to rethink the process of learning in human evolutionary systems. We take a sober look at how game theory, network theory, and chaos theory pertain specifically to the modeling, data, and training components of generalization in human systems. The value of our research is three-fold. First, our work is a direct approach to align machine learning generalization with core behavioral theories. We made our best effort to directly reconcile the axioms of these heretofore incompatible disciplines -- rather than moving from AI/ML towards the behavioral theories while building exclusively on AI/ML intuition. Second, this …


The Impact Of Artificial Intelligence And Machine Learning On Organizations Cybersecurity, Mustafa Abdulhussein Feb 2024

Doctoral Dissertations and Projects

As internet technology proliferates in volume and complexity, the ever-evolving landscape of malicious cyberattacks presents unprecedented security risks in cyberspace. Cybersecurity challenges have been further exacerbated by the continuous growth in the prevalence and sophistication of cyber-attacks. These threats have the capacity to disrupt business operations, erase critical data, and inflict reputational damage, constituting an existential threat to businesses, critical services, and infrastructure. The escalating threat is further compounded by the malicious use of artificial intelligence (AI) and machine learning (ML), which have increasingly become tools in the cybercriminal arsenal. In this dynamic landscape, the emergence of offensive AI introduces …


Predicting Forage Provision Of Grasslands Across Climate Zones By Hyperspectral Measurements, F. A. Männer, J. Muro, J. Ferner, S. Schmidtlein, A. Linstädter Feb 2024

IGC Proceedings (1997-2023)

The potential of grasslands’ fodder production is a crucial management measure, while its quantification is still laborious and costly. Remote sensing technologies, such as hyperspectral field measurements, enable fast and non-destructive estimation. However, such methods are still limited in transferability to other locations or climatic conditions. With this study, we aim to predict forage nutritive value, quantity, and energy yield from hyperspectral canopy reflections of grasslands across three climate zones. We took hyperspectral measurements with a field spectrometer from grassland canopies in temperate, tropical and semi-arid grasslands, and analyzed corresponding biomass samples for their quantity (BM), metabolizable energy content (ME) …


Predicting Open-Pit Mine Production Using Machine Learning Techniques, Faustin Nartey Kumah, Alex Kwasi Saim, Millicent Nkrumah Oppong, Clement Kweku Arthur Feb 2024

Journal of Sustainable Mining

In mining, where production is affected by several factors, including equipment availability, it is necessary to develop reliable models to accurately predict mine production to improve operational efficiency. Hence, in this study, four machine learning algorithms, namely artificial neural network (ANN), random forest (RF), gradient boosting regression (GBR) and decision tree (DT), were implemented to predict mine production. Multiple Linear Regression (MLR) analysis was used as a baseline study for comparison purposes. In that regard, 126 datasets from an open-pit gold mine were used. The developed models were evaluated and compared using the …


Self-Optimizing Feature Generation Via Categorical Hashing Representation And Hierarchical Reinforcement Crossing, Wangyang Ying, Dongjie Wang, Kunpeng Liu, Leilei Sun, Yanjie Fu Feb 2024

Computer Science Faculty Publications and Presentations

Feature generation aims to generate new and meaningful features to create a discriminative representation space. A generated feature is meaningful when the generated feature is from a feature pair with inherent feature interaction. In the real world, experienced data scientists can identify potentially useful feature-feature interactions, and generate meaningful dimensions from an exponentially large search space in an optimal crossing form over an optimal generation path. But, machines have limited human-like abilities. We generalize such learning tasks as self-optimizing feature generation. Self-optimizing feature generation imposes several under-addressed challenges on existing systems: meaningful, robust, and efficient generation. To tackle these challenges, …


Dataset Of Arabic Spam And Ham Tweets, Sanaa Kaddoura, Safaa Henno Feb 2024

All Works

This data article provides a dataset of 132421 posts and their corresponding information collected from the Twitter social media platform. The data has two classes, ham or spam, where ham indicates non-spam clean tweets. The main target of this dataset is to study a way to automatically classify whether a post is spam or not. The data is in the Arabic language only, which makes the data essential to researchers in Arabic natural language processing (NLP) due to the lack of resources in this language. The data is made publicly available to allow researchers to use it as a benchmark for …


Molecular Understanding And Design Of Deep Eutectic Solvents And Proteins Using Computer Simulations And Machine Learning, Usman Lame Abbas Jan 2024

Theses and Dissertations--Chemical and Materials Engineering

Hydrophobic deep eutectic solvents (DESs) have emerged as excellent extractants. A major challenge is the lack of an efficient tool to discover DES candidates. Currently, the search relies heavily on the researchers’ intuition or a trial-and-error process, which leads to a low success rate or bypassing of promising candidates. DES performance depends on the heterogeneous hydrogen bond environment formed by multiple hydrogen bond donors and acceptors. Understanding this heterogeneous hydrogen bond environment can help develop principles for designing high performance DESs for extraction and other separation applications. This work investigates the structure and dynamics of hydrogen bonds in hydrophobic DESs …


Machine Learning As A Tool For Early Detection: A Focus On Late-Stage Colorectal Cancer Across Socioeconomic Spectrums, Hadiza Galadima, Rexford Anson-Dwamena, Ashley Johnson, Ghalib Bello, Georges Adunlin, James Blando Jan 2024

Community & Environmental Health Faculty Publications

Purpose: To assess the efficacy of various machine learning (ML) algorithms in predicting late-stage colorectal cancer (CRC) diagnoses against the backdrop of socio-economic and regional healthcare disparities. Methods: An innovative theoretical framework was developed to integrate individual- and census tract-level social determinants of health (SDOH) with sociodemographic factors. A comparative analysis of the ML models was conducted using key performance metrics such as AUC-ROC to evaluate their predictive accuracy. Spatio-temporal analysis was used to identify disparities in late-stage CRC diagnosis probabilities. Results: Gradient boosting emerged as the superior model, with the top predictors for late-stage CRC diagnosis being anatomic site, …


Identifying Patterns For Neurological Disabilities By Integrating Discrete Wavelet Transform And Visualization, Soo Yeon Ji, Sampath Jayarathna, Anne M. Perrotti, Katrina Kardiasmenos, Dong Hyun Jeong Jan 2024

Computer Science Faculty Publications

Neurological disabilities cause diverse health and mental challenges, impacting quality of life and imposing financial burdens on both the individuals diagnosed with these conditions and their caregivers. Abnormal brain activity, stemming from malfunctions in the human nervous system, characterizes neurological disorders. Therefore, the early identification of these abnormalities is crucial for devising suitable treatments and interventions aimed at promoting and sustaining quality of life. Electroencephalogram (EEG), a non-invasive method for monitoring brain activity, is frequently employed to detect abnormal brain activity in neurological and mental disorders. This study introduces an approach that extends the understanding and identification of neurological disabilities …


Advancing The Understanding Of Clinical Sepsis Using Gene Expression–Driven Machine Learning To Improve Patient Outcomes, Asrar Rashid, Feras Al-Obeidat, Wael Hafez, Govind Benakatti, Rayaz A. Malik, Christos Koutentis, Javed Sharief, Joe Brierley, Nasir Quraishi, Zainab A. Malik, Arif Anwary, Hoda Alkhzaimi, Syed Ahmed Zaki, Praveen Khilnani, Raziya Kadwa, Rajesh Phatak, Maike Schumacher, M. Guftar Shaikh, Ahmed Al-Dubai, Amir Hussain Jan 2024

All Works

Sepsis remains a major challenge that necessitates improved approaches to enhance patient outcomes. This study explored the potential of machine learning (ML) techniques to bridge the gap between clinical data and gene expression information to better predict and understand sepsis. We discuss the application of ML algorithms, including neural networks, deep learning, and ensemble methods, to address key evidence gaps and overcome the challenges in sepsis research. The lack of a clear definition of sepsis is highlighted as a major hurdle, but ML models offer a workaround by focusing on endpoint prediction. We emphasize the significance of gene transcript information …


EnhancedBERT: A Feature-Rich Ensemble Model For Arabic Word Sense Disambiguation With Statistical Analysis And Optimized Data Collection, Sanaa Kaddoura, Reem Nassar Jan 2024

All Works

Accurate assignment of meaning to a word based on its context, known as Word Sense Disambiguation (WSD), remains challenging across languages. Extensive research aims to develop automated methods for determining word senses in different contexts. However, the literature lacks datasets generated for Arabic-language WSD. This paper presents a dataset comprising a hundred polysemous Arabic words. Each word in the dataset encompasses 3–8 distinct senses, with ten example sentences per sense. Some statistical operations are conducted to gain insights into the dataset, shedding light on its characteristics and properties. Subsequently, a novel WSD approach is proposed to utilize …


Adaptive Multi-Label Classification On Drifting Data Streams, Martha Roseberry Jan 2024

Theses and Dissertations

Drifting data streams and multi-label data are both challenging problems. When multi-label data arrives as a stream, the challenges of both problems must be addressed along with additional challenges unique to the combined problem. Algorithms must be fast and flexible, able to match both the speed and evolving nature of the stream. We propose four methods for learning from multi-label drifting data streams. First, a multi-label k Nearest Neighbors with Self Adjusting Memory (ML-SAM-kNN) exploits short- and long-term memories to predict the current and evolving states of the data stream. Second, a punitive k nearest neighbors algorithm with a self-adjusting …


Infusing Machine Learning And Computational Linguistics Into Clinical Notes, Funke V. Alabi, Onyeka Omose, Omotomilola Jegede Jan 2024

Mathematics & Statistics Faculty Publications

Entering free-form text notes into Electronic Health Records (EHR) systems takes a lot of time from clinicians. A large portion of this paperwork is viewed as a burden, which cuts into the amount of time doctors spend with patients and increases the risk of burnout. We examine how machine learning and computational linguistics can be infused into the process of taking clinical notes. We present a new language modeling task that predicts the content of notes conditioned on historical data from a patient's medical record, such as patient demographics, lab results, medications, and previous notes, with the …


Accelerating Markov Chain Monte Carlo Sampling With Diffusion Models, N. T. Hunt-Smith, W. Melnitchouk, F. Ringer, N. Sato, A. W. Thomas, M. J. White Jan 2024

Physics Faculty Publications

Global fits of physics models require efficient methods for exploring high-dimensional and/or multimodal posterior functions. We introduce a novel method for accelerating Markov Chain Monte Carlo (MCMC) sampling by pairing a Metropolis-Hastings algorithm with a diffusion model that can draw global samples with the aim of approximating the posterior. We briefly review diffusion models in the context of image synthesis before providing a streamlined diffusion model tailored towards low-dimensional data arrays. We then present our adapted Metropolis-Hastings algorithm which combines local proposals with global proposals taken from a diffusion model that is regularly trained on the samples produced during the …
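The idea of mixing local and global proposals inside a Metropolis-Hastings sampler, as described in this abstract, can be sketched in a few lines. This is not the authors' algorithm: a fixed wide Gaussian stands in for the trained diffusion model as the global proposal, and the bimodal target, the function names, and all parameter values are assumptions chosen only for illustration.

```python
import math
import random

GLOBAL_SIGMA = 4.0  # width of the stand-in global proposal (assumption)


def log_target(x: float) -> float:
    """Unnormalized log density: equal mixture of N(-3, 1) and N(3, 1)."""
    a = -0.5 * (x + 3.0) ** 2
    b = -0.5 * (x - 3.0) ** 2
    m = max(a, b)  # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def log_global(x: float) -> float:
    """Unnormalized log density of the global proposal N(0, GLOBAL_SIGMA^2)."""
    return -0.5 * (x / GLOBAL_SIGMA) ** 2


def sample(n_steps: int, p_global: float = 0.3, local_step: float = 0.5, seed: int = 0):
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_steps):
        if rng.random() < p_global:
            # Global move: independent draw, so the proposal densities
            # do not cancel and must appear in the acceptance ratio.
            y = rng.gauss(0.0, GLOBAL_SIGMA)
            log_alpha = (log_target(y) - log_target(x)
                         + log_global(x) - log_global(y))
        else:
            # Local move: symmetric random walk, so the ratio reduces
            # to the ratio of target densities.
            y = x + rng.gauss(0.0, local_step)
            log_alpha = log_target(y) - log_target(x)
        if log_alpha >= 0.0 or rng.random() < math.exp(log_alpha):
            x = y
        samples.append(x)
    return samples


samples = sample(20000)
frac_right_mode = sum(s > 0 for s in samples) / len(samples)
```

With local moves alone and this step size, the chain would rarely cross between the two modes; the occasional global draws are what let it visit both. Replacing the fixed Gaussian with a learned sampler would also require evaluating or approximating that sampler's proposal density for the acceptance ratio.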


A Survey On Few-Shot Class-Incremental Learning, Songsong Tian, Lusi Li, Weijun Li, Hang Ran, Xin Ning, Prayag Tiwari Jan 2024

Computer Science Faculty Publications

Large deep learning models are impressive, but they struggle when real-time data is not available. Few-shot class-incremental learning (FSCIL) poses a significant challenge for deep neural networks to learn new tasks from just a few labeled samples without forgetting the previously learned ones. This setup can easily lead to catastrophic forgetting and overfitting problems, severely affecting model performance. Studying FSCIL helps overcome deep learning model limitations on data volume and acquisition time, while improving practicality and adaptability of machine learning models. This paper provides a comprehensive survey on FSCIL. Unlike previous surveys, we aim to synthesize few-shot learning and incremental …


A Data-Driven Machine Learning Approach For Electron-Molecule Ionization Cross Sections, Allison Harris, Josh Nepomuceno Jan 2024

Faculty publications – Physics

Despite their importance in a wide variety of applications, the estimation of ionization cross sections for large molecules continues to present challenges for both experiment and theory. Machine learning (ML) algorithms have been shown to be an effective mechanism for estimating cross section data for atomic targets and a select number of molecular targets. We present an efficient ML model for predicting ionization cross sections for a broad array of molecular targets. Our model is a 3-layer neural network that is trained using published experimental datasets. There is minimal input to the network, making it widely applicable. We show that …


Malware Detection With Artificial Intelligence: A Systematic Literature Review, Matthew G. Gaber, Mohiuddin Ahmed, Helge Janicke Jan 2024

Research outputs 2022 to 2026

In this survey, we review the key developments in the field of malware detection using AI and analyze core challenges. We systematically survey state-of-the-art methods across five critical aspects of building an accurate and robust AI-powered malware-detection model: malware sophistication, analysis techniques, malware repositories, feature selection, and machine learning vs. deep learning. The effectiveness of an AI model is dependent on the quality of the features it is trained with. In turn, the quality and authenticity of these features is dependent on the quality of the dataset and the suitability of the analysis tool. Static analysis is fast but is …


PDF Malware Detection: Toward Machine Learning Modeling With Explainability Analysis, G. M. Sakhawat Hossain, Kaushik Deb, Helge Janicke, Iqbal H. Sarker Jan 2024

Research outputs 2022 to 2026

The Portable Document Format (PDF) is one of the most widely used file types, so fraudsters insert harmful code into victims' PDF documents to compromise their equipment. Conventional solutions and identification techniques are often insufficient and may only partially prevent PDF malware because of their versatile character and excessive dependence on a certain typical feature set. The primary goal of this work is to detect PDF malware efficiently in order to alleviate the current difficulties. To accomplish the goal, we first develop a comprehensive dataset of 15958 PDF samples taking into account the non-malevolent, malicious, and evasive behaviors of the …