Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Machine learning

Articles 1 - 30 of 566

Full-Text Articles in Physical Sciences and Mathematics

Sustainable Energysense: A Predictive Machine Learning Framework For Optimizing Residential Electricity Consumption, Murad Al-Rajab, Samia Loucif Dec 2024

All Works

In a world where electricity is often taken for granted, the surge in consumption poses significant challenges, including elevated CO2 emissions and rising prices. These issues not only impact consumers but also have broader implications for the global environment. This paper endeavors to propose a smart application dedicated to optimizing the electricity consumption of household appliances. It employs Augmented Reality (AR) technology along with YOLO to detect electrical appliances and provide detailed electricity consumption insights, such as displaying the appliance consumption rate and computing the total electricity consumption based on the number of hours the appliance was used. The application …
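The consumption calculation mentioned in the abstract above (usage hours multiplied by an appliance's consumption rate) can be sketched as follows. This is a minimal illustration, not the authors' application: the function names, the appliance names, and the wattage and hour figures are all hypothetical, and the only assumption is the standard relation energy (kWh) = power (W) × hours / 1000.

```python
def energy_kwh(power_watts: float, hours_used: float) -> float:
    """Energy in kilowatt-hours for a device drawing power_watts for hours_used."""
    return power_watts * hours_used / 1000.0


def total_energy_kwh(usage: dict[str, tuple[float, float]]) -> float:
    """Sum energy over appliances; usage maps name -> (power in W, hours used)."""
    return sum(energy_kwh(watts, hours) for watts, hours in usage.values())


# Hypothetical readings, for illustration only.
usage = {"heater": (1500.0, 2.0), "lamp": (100.0, 5.0)}
print(total_energy_kwh(usage))
```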


CardioGPT: An ECG Interpretation Generation Model, Guohua Fu, Jianwei Zheng, Islam Abudayyeh, Chizobam Ani, Cyril Rakovski, Louis Ehwerhemuepha, Hongxia Lu, Yongjuan Guo, Shenglin Liu, Huimin Chu, Bing Yang Apr 2024

Mathematics, Physics, and Computer Science Faculty Articles and Research

Numerous supervised learning models aimed at classifying 12-lead electrocardiograms into different groups have shown impressive performance by utilizing deep learning algorithms. However, few studies are dedicated to applying the Generative Pre-trained Transformer (GPT) model in interpreting electrocardiogram (ECG) using natural language. Thus, we are pioneering the exploration of this uncharted territory by employing the CardioGPT model to tackle this challenge. We used a dataset of ECGs (standard 10s, 12-channel format) from adult patients, with 60 distinct rhythms or conduction abnormalities annotated by board-certified, actively practicing cardiologists. The ECGs were collected from The First Affiliated Hospital of Ningbo University and Shanghai …


Accurate Characterization Of Binding Kinetics And Allosteric Mechanisms For The HSP90 Chaperone Inhibitors Using AI-Augmented Integrative Biophysical Studies, Chao Xu, Xianglei Zhang, Lianghao Zhao, Gennady M. Verkhivker, Fang Bai Apr 2024

Mathematics, Physics, and Computer Science Faculty Articles and Research

The binding kinetics of drugs to their targets are gradually being recognized as a crucial indicator of the efficacy of drugs in vivo, leading to the development of various computational methods for predicting the binding kinetics in recent years. However, compared with the prediction of binding affinity, the underlying structure and dynamic determinants of binding kinetics are more complicated. Efficient and accurate methods for predicting binding kinetics are still lacking. In this study, quantitative structure–kinetics relationship (QSKR) models were developed using 132 inhibitors targeting the ATP binding domain of heat shock protein 90α (HSP90α) to predict the dissociation rate …


Multi-Aspect Rule-Based AI: Methods, Taxonomy, Challenges And Directions Towards Automation, Intelligence And Transparent Cybersecurity Modeling For Critical Infrastructures, Iqbal H. Sarker, Helge Janicke, Mohamed A. Ferrag, Alsharif Abuadbba Apr 2024

Research outputs 2022 to 2026

Critical infrastructure (CI) typically refers to the essential physical and virtual systems, assets, and services that are vital for the functioning and well-being of a society, economy, or nation. However, the rapid proliferation and dynamism of today's cyber threats in digital environments may disrupt CI functionalities, which would have a debilitating impact on public safety, economic stability, and national security. This has led to much interest in effective cybersecurity solutions regarding automation and intelligent decision-making, where AI-based modeling is potentially significant. In this paper, we take into account “Rule-based AI” rather than other black-box solutions since model transparency, i.e., human …


Enhancing Landslide Susceptibility Modelling Through A Novel Non-Landslide Sampling Method And Ensemble Learning Technique, Chao Zhou, Yue Wang, Ying Cao, Ramesh P. Singh, Bayes Ahmed, Mahdi Motagh, Yang Wang, Ling Chen, Guangchao Tan, Shanshan Li Mar 2024

Mathematics, Physics, and Computer Science Faculty Articles and Research

In recent years, several catastrophic landslide events have been observed throughout the globe, threatening lives and infrastructure. To minimize the impact of landslides, landslide susceptibility maps are needed. The study aims to extract high-quality non-landslide samples and improve the accuracy of landslide susceptibility modelling (LSM) outcomes by applying a coupled method of ensemble learning and Machine Learning (ML). The Zigui-Badong section of the Three Gorges Reservoir area (TGRA) in China was considered in the present study. Twelve influencing factors were selected as inputs for LSM, and the relationship between each causal factor and landslide spatial development …


Preserving Linguistic Diversity In The Digital Age: A Scalable Model For Cultural Heritage Continuity, James Hutson, Pace Ellsworth, Matt Ellsworth Mar 2024

Faculty Scholarship

In the face of the rapid erosion of both tangible and intangible cultural heritage globally, the urgency for effective, wide-ranging preservation methods has never been greater. Traditional approaches in cultural preservation often focus narrowly on specific niches, overlooking the broader cultural tapestry, particularly the preservation of everyday cultural elements. This article addresses this critical gap by advocating for a comprehensive, scalable model for cultural preservation that leverages machine learning and big data analytics. This model aims to document and archive a diverse range of cultural artifacts, encompassing both extraordinary and mundane aspects of heritage. A central issue highlighted in the …


Artificial Intelligence Usage And Data Privacy Discoveries Within mHealth, Jennifer Schulte Mar 2024

Faculty Research & Publications

Advancements in artificial intelligence continue to impact nearly every aspect of human life by providing integration options that aim to supplement or improve current processes. One industry that continues to benefit from artificial intelligence integration is healthcare. For years now, elements of artificial intelligence have been used to assist in clinical decision making, helping to identify potential health risks at earlier stages, and supplementing precision medicine. An area of healthcare that specifically looks at wearable devices, sensors, phone applications, and other such devices is mobile health (mHealth). These devices are used to aid in health data collection and delivery. This …


Using ChatGPT To Generate Gendered Language, Shweta Soundararajan, Manuela Nayantara Jeyaraj, Sarah Jane Delany Mar 2024

Conference papers

Gendered language is the use of words that denote an individual's gender. This can be explicit where the gender is evident in the actual word used, e.g. mother, she, man, but it can also be implicit where social roles or behaviours can signal an individual's gender - for example, expectations that women display communal traits (e.g., affectionate, caring, gentle) and men display agentic traits (e.g., assertive, competitive, decisive). The use of gendered language in NLP systems can perpetuate gender stereotypes and bias. This paper proposes an approach to generating gendered language datasets using ChatGPT which will provide data for data-driven …


The Impact Of Artificial Intelligence And Machine Learning On Organizations Cybersecurity, Mustafa Abdulhussein Feb 2024

Doctoral Dissertations and Projects

As internet technologies proliferate in volume and complexity, the ever-evolving landscape of malicious cyberattacks presents unprecedented security risks in cyberspace. Cybersecurity challenges have been further exacerbated by the continuous growth in the prevalence and sophistication of cyber-attacks. These threats have the capacity to disrupt business operations, erase critical data, and inflict reputational damage, constituting an existential threat to businesses, critical services, and infrastructure. The escalating threat is further compounded by the malicious use of artificial intelligence (AI) and machine learning (ML), which have increasingly become tools in the cybercriminal arsenal. In this dynamic landscape, the emergence of offensive AI introduces …


Dataset Of Arabic Spam And Ham Tweets, Sanaa Kaddoura, Safaa Henno Feb 2024

All Works

This data article provides a dataset of 132,421 posts and their corresponding information collected from Twitter. The data has two classes, ham or spam, where ham indicates clean, non-spam tweets. The main purpose of this dataset is to support the study of automatic classification of posts as spam or not. The data is in Arabic only, which makes it essential to researchers in Arabic natural language processing (NLP) due to the lack of resources in this language. The data is made publicly available to allow researchers to use it as a benchmark for …


Self-Optimizing Feature Generation Via Categorical Hashing Representation And Hierarchical Reinforcement Crossing, Wangyang Ying, Dongjie Wang, Kunpeng Liu, Leilei Sun, Yanjie Fu Feb 2024

Computer Science Faculty Publications and Presentations

Feature generation aims to generate new and meaningful features to create a discriminative representation space. A generated feature is meaningful when the generated feature is from a feature pair with inherent feature interaction. In the real world, experienced data scientists can identify potentially useful feature-feature interactions, and generate meaningful dimensions from an exponentially large search space in an optimal crossing form over an optimal generation path. But, machines have limited human-like abilities. We generalize such learning tasks as self-optimizing feature generation. Self-optimizing feature generation imposes several under-addressed challenges on existing systems: meaningful, robust, and efficient generation. To tackle these challenges, …


Machine Learning As A Tool For Early Detection: A Focus On Late-Stage Colorectal Cancer Across Socioeconomic Spectrums, Hadiza Galadima, Rexford Anson-Dwamena, Ashley Johnson, Ghalib Bello, Georges Adunlin, James Blando Jan 2024

Community & Environmental Health Faculty Publications

Purpose: To assess the efficacy of various machine learning (ML) algorithms in predicting late-stage colorectal cancer (CRC) diagnoses against the backdrop of socio-economic and regional healthcare disparities. Methods: An innovative theoretical framework was developed to integrate individual- and census tract-level social determinants of health (SDOH) with sociodemographic factors. A comparative analysis of the ML models was conducted using key performance metrics such as AUC-ROC to evaluate their predictive accuracy. Spatio-temporal analysis was used to identify disparities in late-stage CRC diagnosis probabilities. Results: Gradient boosting emerged as the superior model, with the top predictors for late-stage CRC diagnosis being anatomic site, …
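As an illustration of comparing models by AUC-ROC, the metric named in the abstract above, the sketch below computes AUC as the probability that a randomly chosen positive case is scored above a randomly chosen negative one (ties count one half). It is not the authors' pipeline: the labels, the two sets of model scores, and the function name `auc_roc` are made up for the example.

```python
def auc_roc(labels, scores):
    """AUC-ROC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities from two candidate models.
labels = [0, 0, 1, 1]
model_a = [0.1, 0.4, 0.35, 0.8]
model_b = [0.2, 0.3, 0.6, 0.9]
print(auc_roc(labels, model_a), auc_roc(labels, model_b))
```

The pairwise form is quadratic in the number of cases; for large datasets a rank-based computation gives the same value more cheaply.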


Identifying Patterns For Neurological Disabilities By Integrating Discrete Wavelet Transform And Visualization, Soo Yeon Ji, Sampath Jayarathna, Anne M. Perrotti, Katrina Kardiasmenos, Dong Hyun Jeong Jan 2024

Computer Science Faculty Publications

Neurological disabilities cause diverse health and mental challenges, impacting quality of life and imposing financial burdens on both the individuals diagnosed with these conditions and their caregivers. Abnormal brain activity, stemming from malfunctions in the human nervous system, characterizes neurological disorders. Therefore, the early identification of these abnormalities is crucial for devising suitable treatments and interventions aimed at promoting and sustaining quality of life. Electroencephalogram (EEG), a non-invasive method for monitoring brain activity, is frequently employed to detect abnormal brain activity in neurological and mental disorders. This study introduces an approach that extends the understanding and identification of neurological disabilities …


Advancing The Understanding Of Clinical Sepsis Using Gene Expression–Driven Machine Learning To Improve Patient Outcomes, Asrar Rashid, Feras Al-Obeidat, Wael Hafez, Govind Benakatti, Rayaz A. Malik, Christos Koutentis, Javed Sharief, Joe Brierley, Nasir Quraishi, Zainab A. Malik, Arif Anwary, Hoda Alkhzaimi, Syed Ahmed Zaki, Praveen Khilnani, Raziya Kadwa, Rajesh Phatak, Maike Schumacher, M. Guftar Shaikh, Ahmed Al-Dubai, Amir Hussain Jan 2024

All Works

Sepsis remains a major challenge that necessitates improved approaches to enhance patient outcomes. This study explored the potential of machine learning (ML) techniques to bridge the gap between clinical data and gene expression information to better predict and understand sepsis. We discuss the application of ML algorithms, including neural networks, deep learning, and ensemble methods, to address key evidence gaps and overcome the challenges in sepsis research. The lack of a clear definition of sepsis is highlighted as a major hurdle, but ML models offer a workaround by focusing on endpoint prediction. We emphasize the significance of gene transcript information …


EnhancedBERT: A Feature-Rich Ensemble Model For Arabic Word Sense Disambiguation With Statistical Analysis And Optimized Data Collection, Sanaa Kaddoura, Reem Nassar Jan 2024

All Works

Accurate assignment of meaning to a word based on its context, known as Word Sense Disambiguation (WSD), remains challenging across languages. Extensive research aims to develop automated methods for determining word senses in different contexts. However, the literature lacks datasets built for Arabic WSD. This paper presents a dataset comprising a hundred polysemous Arabic words. Each word in the dataset encompasses 3–8 distinct senses, with ten example sentences per sense. Some statistical operations are conducted to gain insights into the dataset, shedding light on its characteristics and properties. Subsequently, a novel WSD approach is proposed to utilize …


Infusing Machine Learning And Computational Linguistics Into Clinical Notes, Funke V. Alabi, Onyeka Omose, Omotomilola Jegede Jan 2024

Mathematics & Statistics Faculty Publications

Entering free-form text notes into Electronic Health Record (EHR) systems takes a lot of clinicians' time. A large portion of this paperwork is viewed as a burden, which cuts into the amount of time doctors spend with patients and increases the risk of burnout. We examine how machine learning and computational linguistics can be infused into the process of taking clinical notes. We present a new language modeling task that predicts the content of notes conditioned on historical data from a patient's medical record, such as patient demographics, lab results, medications, and previous notes, with the …


Accelerating Markov Chain Monte Carlo Sampling With Diffusion Models, N. T. Hunt-Smith, W. Melnitchouk, F. Ringer, N. Sato, A. W. Thomas, M. J. White Jan 2024

Physics Faculty Publications

Global fits of physics models require efficient methods for exploring high-dimensional and/or multimodal posterior functions. We introduce a novel method for accelerating Markov Chain Monte Carlo (MCMC) sampling by pairing a Metropolis-Hastings algorithm with a diffusion model that can draw global samples with the aim of approximating the posterior. We briefly review diffusion models in the context of image synthesis before providing a streamlined diffusion model tailored towards low-dimensional data arrays. We then present our adapted Metropolis-Hastings algorithm which combines local proposals with global proposals taken from a diffusion model that is regularly trained on the samples produced during the …
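The idea of mixing local and global proposals inside a Metropolis-Hastings sampler, as described in the abstract above, can be sketched in a few lines. This is a minimal illustration under loud assumptions, not the authors' algorithm: a fixed wide Gaussian stands in for the trained diffusion model as the global proposal, it is never retrained on the chain's samples, and the 1-D bimodal target, the parameter values, and all function names are hypothetical.

```python
import math
import random

GLOBAL_SIGMA = 6.0  # width of the stand-in global proposal (assumption)


def log_target(x):
    """Unnormalized log density of an equal mixture of N(-4, 1) and N(4, 1)."""
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def log_global_density(x):
    """Log density of N(0, GLOBAL_SIGMA^2) up to a constant that cancels."""
    return -0.5 * (x / GLOBAL_SIGMA) ** 2


def sample_chain(n_steps, p_global=0.2, local_sigma=0.5, seed=0):
    rng = random.Random(seed)
    x = -4.0
    samples = []
    for _ in range(n_steps):
        if rng.random() < p_global:
            # Global independence proposal: the ratio includes proposal densities.
            prop = rng.gauss(0.0, GLOBAL_SIGMA)
            log_alpha = (log_target(prop) - log_target(x)
                         + log_global_density(x) - log_global_density(prop))
        else:
            # Local symmetric random-walk proposal: proposal densities cancel.
            prop = x + rng.gauss(0.0, local_sigma)
            log_alpha = log_target(prop) - log_target(x)
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x = prop
        samples.append(x)
    return samples


samples = sample_chain(20000)
print(sum(1 for s in samples if s > 0) / len(samples))
```

Each proposal type uses its own acceptance ratio, and the choice between them does not depend on the current state, so the mixture keeps the target distribution invariant. The global moves are what let the chain jump between the two separated modes.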


A Survey On Few-Shot Class-Incremental Learning, Songsong Tian, Lusi Li, Weijun Li, Hang Ran, Xin Ning, Prayag Tiwari Jan 2024

Computer Science Faculty Publications

Large deep learning models are impressive, but they struggle when real-time data is not available. Few-shot class-incremental learning (FSCIL) poses a significant challenge for deep neural networks to learn new tasks from just a few labeled samples without forgetting the previously learned ones. This setup can easily lead to catastrophic forgetting and overfitting problems, severely affecting model performance. Studying FSCIL helps overcome deep learning model limitations on data volume and acquisition time, while improving practicality and adaptability of machine learning models. This paper provides a comprehensive survey on FSCIL. Unlike previous surveys, we aim to synthesize few-shot learning and incremental …


A Data-Driven Machine Learning Approach For Electron-Molecule Ionization Cross Sections, Allison Harris, Josh Nepomuceno Jan 2024

Faculty publications – Physics

Despite their importance in a wide variety of applications, the estimation of ionization cross sections for large molecules continues to present challenges for both experiment and theory. Machine learning (ML) algorithms have been shown to be an effective mechanism for estimating cross section data for atomic targets and a select number of molecular targets. We present an efficient ML model for predicting ionization cross sections for a broad array of molecular targets. Our model is a 3-layer neural network that is trained using published experimental datasets. There is minimal input to the network, making it widely applicable. We show that …


Malware Detection With Artificial Intelligence: A Systematic Literature Review, Matthew G. Gaber, Mohiuddin Ahmed, Helge Janicke Jan 2024

Research outputs 2022 to 2026

In this survey, we review the key developments in the field of malware detection using AI and analyze core challenges. We systematically survey state-of-the-art methods across five critical aspects of building an accurate and robust AI-powered malware-detection model: malware sophistication, analysis techniques, malware repositories, feature selection, and machine learning vs. deep learning. The effectiveness of an AI model is dependent on the quality of the features it is trained with. In turn, the quality and authenticity of these features is dependent on the quality of the dataset and the suitability of the analysis tool. Static analysis is fast but is …


PDF Malware Detection: Toward Machine Learning Modeling With Explainability Analysis, G. M. Sakhawat Hossain, Kaushik Deb, Helge Janicke, Iqbal H. Sarker Jan 2024

Research outputs 2022 to 2026

The Portable Document Format (PDF) is one of the most widely used file types; fraudsters therefore insert harmful code into victims' PDF documents to compromise their equipment. Conventional solutions and identification techniques are often insufficient and may only partially prevent PDF malware because of its versatile character and their excessive dependence on a certain typical feature set. The primary goal of this work is to detect PDF malware efficiently in order to alleviate the current difficulties. To accomplish the goal, we first develop a comprehensive dataset of 15958 PDF samples taking into account the non-malevolent, malicious, and evasive behaviors of the …


Impact Of Weather Factors On Airport Arrival Rates: Application Of Machine Learning In Air Transportation, Robert W. Maxson, Dothang Truong, Woojin Choi Dec 2023

Publications

Weather is responsible for approximately 70% of air transportation delays in the National Airspace System, and delays resulting from convective weather alone cost airlines and passengers millions of dollars each year due to delays that could be avoided. This research sought to establish relationships between environmental variables and airport efficiency estimates by data mining archived weather and airport performance data at ten geographically and climatologically different airports. Several meaningful relationships were discovered from six out of ten airports using various machine learning methods within an overarching data mining protocol, and the developed models were tested using historical data.


Algorithm Selection Using Edge ML And Case-Based Reasoning, Rahman Ali, Muhammad Sadiq Hassan Zada, Asad Masood Khatak, Jamil Hussain Dec 2023

All Works

In practical data mining, a wide range of classification algorithms is employed for prediction tasks. However, selecting the best algorithm poses a challenging task for machine learning practitioners and experts, primarily due to the inherent variability in the characteristics of classification problems, referred to as datasets, and the unpredictable performance of these algorithms. Dataset characteristics are quantified in terms of meta-features, while classifier performance is evaluated using various performance metrics. The assessment of classifiers through empirical methods across multiple classification datasets, while considering multiple performance metrics, presents a computationally expensive and time-consuming obstacle in the pursuit of selecting the optimal …


Predicting New Crescent Moon Visibility Applying Machine Learning Algorithms, Murad Al-Rajab, Samia Loucif, Yazan Al Risheh Dec 2023

All Works

The world's population is projected to grow 32% in the coming years, and the number of Muslims is expected to grow by 70%, from 1.8 billion in 2015 to about 3 billion in 2060. Hijri is the Islamic calendar, also known as the lunar Hijri calendar, which consists of 12 lunar months, and it is tied to the Moon phases where a new crescent Moon marks the beginning of each month. Muslims use the Hijri calendar to determine important dates and religious events such as Ramadan, Haj, Muharram, etc. To date, there is no consensus on deciding on the beginning of …


Designing An Overseas Experiential Course In Data Science, Hua Leong Fwa, Graham Ng Dec 2023

Research Collection School Of Computing and Information Systems

Unprecedented demand for data science professionals in the industry has led to many educational institutions launching new data science courses. It is however imperative that students of data science programmes learn through execution of real-world, authentic projects on top of acquiring foundational knowledge on the basics of data science. In the process of working on authentic, real-world projects, students not only create new knowledge but also learn to solve open, sophisticated, and ill-structured problems in an inter-disciplinary fashion. In this paper, we detail our approach to designing a data science curriculum premised on learners solving authentic data science problems sourced …


Development Of An Explainable Artificial Intelligence Model For Asian Vascular Wound Images, Zhiwen Joseph Lo, Malcolm Han Wen Mak, Shanying Liang, Yam Meng Chan, Cheng Cheng Goh, Tina Peiting Lai, Audrey Hui Min Tan, Patrick Thng, Patrick Thng, Tillman Weyde, Sylvia Smit Dec 2023

Research Collection School Of Computing and Information Systems

Chronic wounds contribute to significant healthcare and economic burden worldwide. Wound assessment remains challenging given its complex and dynamic nature. The use of artificial intelligence (AI) and machine learning methods in wound analysis is promising. Explainable modelling can help its integration and acceptance in healthcare systems. We aim to develop an explainable AI model for analysing vascular wound images among an Asian population. Two thousand nine hundred and fifty-seven wound images from a vascular wound image registry from a tertiary institution in Singapore were utilized. The dataset was split into training, validation and test sets. Wound images were classified into …


Enhanced Privacy-Enabled Face Recognition Using Κ-Identity Optimization, Ryan Karl Dec 2023

Department of Electrical and Computer Engineering: Dissertations, Theses, and Student Research

Facial recognition is becoming more and more prevalent in the daily lives of the common person. Law enforcement utilizes facial recognition to find and track suspects. The newest smartphones have the ability to unlock using the user's face. Some door locks utilize facial recognition to allow correct users to enter restricted spaces. The list of applications that use facial recognition will only increase as hardware becomes more cost-effective and more computationally powerful. As this technology becomes more prevalent in our lives, it is important to understand and protect the data provided to these companies. Any data transmitted should be encrypted …


OffensEval 2023: Offensive Language Identification In The Age Of Large Language Models, Marcos Zampieri, Sara Rosenthal, Preslav Nakov, Alphaeus Dmonte, Tharindu Ranasinghe Nov 2023

Natural Language Processing Faculty Publications

The OffensEval shared tasks organized as part of SemEval-2019-2020 were very popular, attracting over 1300 participating teams. The two editions of the shared task helped advance the state of the art in offensive language identification by providing the community with benchmark datasets in Arabic, Danish, English, Greek, and Turkish. The datasets were annotated using the OLID hierarchical taxonomy, which since then has become the de facto standard in general offensive language identification research and was widely used beyond OffensEval. We present a survey of OffensEval and related competitions, and we discuss the main lessons learned. We further evaluate the performance …


Evaluating The Efficacy Of ChatGPT In Navigating The Spanish Medical Residency Entrance Examination (MIR): Promising Horizons For AI In Clinical Medicine, Francisco Guillen-Grima, Sara Guillen-Aguinaga, Laura Guillen-Aguinaga, Rosa Alas-Brun, Luc Onambele, Wilfrido Ortega, Rocio Montejo, Enrique Aguinaga-Ontoso, Paul Barach, Ines Aguinaga-Ontoso Nov 2023

Department of Medicine Faculty Papers

The rapid progress in artificial intelligence, machine learning, and natural language processing has led to increasingly sophisticated large language models (LLMs) for use in healthcare. This study assesses the performance of two LLMs, the GPT-3.5 and GPT-4 models, in passing the MIR medical examination for access to medical specialist training in Spain. Our objectives included gauging the models' overall performance, analyzing discrepancies across different medical specialties, discerning between theoretical and practical questions, estimating error proportions, and assessing the hypothetical severity of errors committed by a physician.

MATERIAL AND METHODS: We studied the 2022 Spanish MIR examination results after excluding …


Migrating 120,000 Legacy Publications From Several Systems Into A Current Research Information System Using Advanced Data Wrangling Techniques, Yrjö Lappalainen, Matti Lassila, Tanja Heikkilä, Jani Nieminen, Tapani Lehtilä Nov 2023

All Works

This article describes a complex CRIS (current research information system) implementation project involving the migration of around 120,000 legacy publication records from three different systems. The project, undertaken by Tampere University, encountered several challenges in data diversity, data quality, and resource allocation. To handle the extensive and heterogeneous dataset, innovative approaches such as machine learning techniques and various data wrangling tools were used to process data, correct errors, and merge information from different sources. Despite significant delays and unforeseen obstacles, the project was ultimately successful in achieving its goals. The project served as a valuable learning experience, highlighting the importance …