Open Access. Powered by Scholars. Published by Universities.®

Data Science Commons

Purdue University

Articles 1 - 30 of 31

Full-Text Articles in Data Science

Online Class-Incremental Learning For Real-World Food Image Classification, Siddeshwar Raghavan, Jiangpeng He, Fengqing Zhu Mar 2024

Graduate Industrial Research Symposium

Food image classification is essential for monitoring health and tracking dietary intake in image-based dietary assessment methods. However, conventional systems often rely on static datasets with fixed classes and uniform distribution. In contrast, real-world food consumption patterns, shaped by cultural, economic, and personal influences, involve dynamic and evolving data. Thus, the classification system must cope with continuously evolving data. Online Class Incremental Learning (OCIL) addresses the challenge of learning continuously from a single-pass data stream while adapting to new knowledge and reducing catastrophic forgetting. Experience Replay (ER) based OCIL methods store a small portion of previous data and …
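The abstract is truncated, so the authors' storage and sampling strategy is not shown here. As a generic illustration of what "storing a small portion of previous data" from a single-pass stream can look like, the following minimal sketch keeps a fixed-size memory using reservoir sampling. The class name `ReplayBuffer`, its methods, and the choice of reservoir sampling are assumptions made for illustration, not details taken from the paper.

```python
import random


class ReplayBuffer:
    """Fixed-size memory for a single-pass stream.

    Hypothetical helper: reservoir sampling is assumed here for illustration;
    the abstract does not say how the authors select stored samples.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.samples = []              # stored (input, label) pairs
        self.seen = 0                  # number of stream items observed so far
        self.rng = random.Random(seed)

    def add(self, sample):
        """Offer one stream item; each item seen so far is kept with equal probability."""
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(sample)
        else:
            slot = self.rng.randrange(self.seen)
            if slot < self.capacity:
                self.samples[slot] = sample

    def replay(self, k):
        """Draw up to k stored samples to mix into the current training batch."""
        k = min(k, len(self.samples))
        return self.rng.sample(self.samples, k)


buffer = ReplayBuffer(capacity=5)
for i in range(100):
    buffer.add((f"image_{i}", i % 10))   # placeholder stream items
old_batch = buffer.replay(3)
```

In a training loop, `old_batch` would be concatenated with the incoming batch so that earlier classes keep contributing to the loss.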


Characterization Of Biological Particles Using An Integrated Hyperspectral Imaging And Machine Learning, Kaeul Lim, Arezoo Ardekani Mar 2024

Graduate Industrial Research Symposium

Hyperspectral imaging (HSI) is a promising modality in medicine with many potential applications. This study focuses on developing a label-free lipid nanoparticle characterization method using convolutional neural network (CNN) analysis of HSI images. The HSI data, known as a hypercube, consist of a series of images acquired at different wavelengths for the same field of view, providing continuous spectral information for each pixel. Three distinct liposome samples were collected for analysis. Advanced image preprocessing and classification methods for HSI data were developed to differentiate liposomes based on their material compositions. Our machine learning-based classification method was able to distinguish different liposome types …


Geospatial Analysis Of Agricultural Potential In The United States, Diana Febrita Mar 2024

Graduate Industrial Research Symposium

Traditionally, the agriculture sector is responsible for providing food and crop products. However, the role of agriculture has expanded beyond its traditional function. It is the main sector that contributes to the provision of food, income, employment, environmental protection, and local economic development. Given these roles, understanding the potential of agriculture in the United States is crucial to discovering its prospects and challenges. This study will briefly discuss the agricultural potential in the United States based on five assets: natural capital, financial capital, human capital, physical capital, and social capital. To identify the states with …


A Machine Learning Model Of Perturb-Seq Data For Use In Space Flight Gene Expression Profile Analysis, Liam F. Johnson, James Casaletto, Lauren Sanders, Sylvain Costes Mar 2024

Graduate Industrial Research Symposium

The genetic perturbations caused by spaceflight on biological systems tend to have a system-wide effect that is often difficult to deconvolute into individual signals with specific points of origin. Single-cell multi-omic data can provide a profile of the perturbational effects, but do not necessarily indicate the initial point of interference within the network. The objective of this project is to take advantage of large-scale, genome-wide perturbational datasets by using them to train a tuned machine learning model that is capable of predicting the effects of unseen perturbations in new data. Perturb-Seq datasets are large libraries of …


Modelling The "Bottom-Up" Development Pattern Of Tar Spot Disease In Corn, Brenden Lane, Joaquín Guillermo Ramírez-Gil, Carlos Góngora-Canul, Mariela Sofia Fernandez Campos, Andres Cruz-Sancan, Fidel E. Jiménez-Beitia, Alex G. Acosta-Guatemal, Wily Sic, C. D. Cruz Mar 2024

Graduate Industrial Research Symposium

In 2015, the corn-infecting pathogen Phyllachora maydis (causal agent of tar spot disease) was reported for the first time in the United States. The disease has since spread across the US, causing major yield losses. In 2021 alone, 5.88 million metric tons (231.3 million bushels) of US corn yield were lost to this disease, costing an estimated US$1.25 billion. Though fungicides can protect against these agroeconomic losses, application timing can be difficult to optimize because our understanding of tar spot dynamics is still evolving. The current view is that tar spot typically develops bottom-up through a repeating infection cycle. Because …


Sepsis Treatment: Reinforced Sequential Decision-Making For Saving Lives, Dipesh Tamboli, Jiayu Chen, Kiran Pranesh Jotheeswaran, Denny Yu, Vaneet Aggarwal Mar 2024

Graduate Industrial Research Symposium

Sepsis, a life-threatening condition triggered by the body's exaggerated response to infection, demands urgent intervention to prevent severe complications. Existing machine learning methods for managing sepsis struggle in offline scenarios, exhibiting suboptimal performance with survival rates below 50%. Our project introduces the "PosNegDM: Reinforcement Learning with Positive and Negative Demonstrations for Sequential Decision-Making" framework utilizing an innovative transformer-based model and a feedback reinforcer to replicate expert actions while considering individual patient characteristics. A mortality classifier with 96.7% accuracy guides treatment decisions towards positive outcomes. The PosNegDM framework significantly improves patient survival, saving 97.39% of patients and outperforming established machine learning …


Accuracy Of Nitrate Hysteresis And Flushing For Agricultural Watersheds In The Midwest, Noah Rudko, Sara K. W. Mcmillian, Jane Frankenberger, François Birgand Mar 2024

Graduate Industrial Research Symposium

Storm event-based metrics, such as hysteresis (HI) and flushing (FI), are used to differentiate nitrate pathways and sources, which is essential for watershed management. Estimations of these event-based metrics typically use high-frequency (15-minute to hourly) measurements, but daily data are also used due to their greater availability. To date, there has been no study assessing how using lower-frequency samples affects the accuracy of HI and FI, which could skew interpretation of potential nutrient pathways and sources. We used continuous measurements of nitrate collected at 9 watersheds throughout the Midwest spanning 448 storms. HI and FI were estimated from …


The Impact Of Accessible Data On Cyberstalking, Elise Kwan Jan 2024

The Journal of Purdue Undergraduate Research

No abstract provided.


Model Selection Through Cross-Validation For Supervised Learning Tasks With Manifold Data, Derek Brown Jan 2024

The Journal of Purdue Undergraduate Research

No abstract provided.


Machine Learning Of Big Data: A Gaussian Regression Model To Predict The Spatiotemporal Distribution Of Ground Ozone, Jerry Gu Jan 2024

The Journal of Purdue Undergraduate Research

Tracking pollution levels on the ground is important to the environment and public health. One of the pollutants of concern is ozone, which, at high concentrations, can cause respiratory and cardiovascular problems. The National Center for Atmospheric Research (NCAR) has published valuable ozone data obtained from ground-based sensors installed at selected locations. Because it is unfeasible to measure the exact ozone levels everywhere at any time, it would be valuable to predict the temporal-spatial distributions of ozone concentration based on existing data. This would help us better understand the patterns and trends in the data and make better decisions to …


A Computational Profile Of Invasive Lionfish In Belize: A New Insight On A Destructive Species, Joshua E. Balan Jan 2024

The Journal of Purdue Undergraduate Research

Since their discovery in the region in 2009, invasive Indonesian-native lionfish have been taking over the Belize Barrier Reef. As a result, populations of local species have dwindled as they are either eaten or outcompeted by the invaders. This has led to devastating ecological and economic losses; massive industries in the local nations, such as fisheries and tourism, have suffered greatly. To combat this, local organizations, from nonprofits to ecotourism companies, have been manually spear-hunting lionfish on scuba dives to cull the population. One such company, Reef Conservation Institute (ReefCI), operating out of Tom Owens Caye outside of Placencia, …


Les Expositions Turnus, Une Page D’Histoire Transnationale Des Beaux-Arts En Suisse À La Fin Du XIXe Siècle. Et Comment Découvrir Les Humanités Numériques, Béatrice Joyeux-Prunel Dec 2023

Artl@s Bulletin

This article presents the work of the University of Geneva's introductory digital humanities class on the Turnus exhibitions in Switzerland from the 1840s onward. Nearly 50 catalogues were transcribed, described, and structured using Python scripts, then geolocated. The data were added to BasArt, the worldwide repository of exhibition catalogues maintained by Artl@s (https://artlas.huma-num.fr/map). They shed light on the early years of these exhibitions and on their local, federal, and international dynamics. The Turnus was a hub for Swiss artists, and even a springboard to the European art market.


Deep Q-Learning Framework For Quantitative Climate Change Adaptation Policy For Florida Road Network Due To Extreme Precipitation, Orhun Aydin Oct 2023

I-GUIDE Forum

Climate change-induced extreme weather and a growing population are increasing the pressure on the world's aging road networks. Adaptation requires designing interventions and alterations to the road networks that consider future dynamics of flooding and increased traffic due to the growing population. This paper introduces a reinforcement learning approach to designing interventions for Florida's road network under future traffic and climate projections. Three climate models and a tide and surge model are used to create flooding and coastal inundation projections, respectively. The optimal sequence of decisions for adapting Florida's road network to minimize flooding-related disruptions is solved by using a graph-based …


Large-Scale Google Street View Images For Urban Change Detection, Fangzheng Lyu, Xinlin Ma, Yan Song, Eric Zhu, Shaowen Wang Oct 2023

I-GUIDE Forum

Urbanization has entered a new phase characterized by urban changes occurring at a micro-scale and “under the roof”, as opposed to external modifications. These changes, known as urban retrofitting, involve the incorporation of novel technologies or features into pre-existing systems to promote sustainability. Given the limitations of remote sensing images in identifying such urban changes, novel tools need to be developed for detecting urban retrofitting. In this study, we first build a pipeline to collect large-scale time-series urban street view images from Google Street View in Mecklenburg County, North Carolina. We then examine the feasibility of utilizing the acquired dataset …


Graph Transformer Network For Flood Forecasting With Heterogeneous Covariates, Jimeng Shi, Vitalii Stebliankin, Zhaonan Wang, Shaowen Wang, Giri Narasimhan Oct 2023

I-GUIDE Forum

Floods can be very destructive, causing heavy damage to life, property, and livelihoods. Global climate change and the consequent sea-level rise have increased the occurrence of extreme weather events, resulting in elevated and more frequent flood risk. Therefore, accurate and timely flood forecasting in coastal river systems is critical to facilitate good flood management. However, the computational tools currently used are either slow or inaccurate. In this paper, we propose a Flood prediction tool using Graph Transformer Network (FloodGTN) for river systems. More specifically, FloodGTN learns the spatio-temporal dependencies of water levels at different monitoring stations using Graph Neural Networks (GNNs) …


Solving Geospatial Problems Under Extreme Time Constraints: A Call For Inclusive Geocomputational Education, Coline C. Dony Oct 2023

I-GUIDE Forum

To prepare our next generation to face geospatial problems that have extreme time constraints (e.g., disasters, climate change), we need to create educational pathways that help students develop their geocomputational thinking skills. First, educators are central in helping us create those pathways; therefore, we need to clearly convey to them why and in which contexts this thinking is necessary. For that purpose, a new definition for geocomputational thinking is suggested that makes it clear that this thinking is needed for geospatial problems that have extreme time constraints. Second, we cannot further burden educators with more demands; rather, we should …


Reducing Uncertainty In Sea-Level Rise Prediction: A Spatial-Variability-Aware Approach, Subhankar Ghosh, Shuai An, Arun Sharma, Jayant Gupta, Shashi Shekhar, Aneesh Subramanian Oct 2023

I-GUIDE Forum

Given multi-model ensemble climate projections, the goal is to accurately and reliably predict future sea-level rise while lowering the uncertainty. This problem is important because sea-level rise affects millions of people in coastal communities and beyond due to climate change's impacts on polar ice sheets and the ocean. This problem is challenging due to spatial variability and unknowns such as possible tipping points (e.g., collapse of Greenland or West Antarctic ice-shelf), climate feedback loops (e.g., clouds, permafrost thawing), future policy decisions, and human actions. Most existing climate modeling approaches use the same set of weights globally, during either regression or …


Cross-Scale Urban Land Cover Mapping: Empowering Classification Through Transfer Learning And Deep Learning Integration, Zhe Wang, Chao Fan, Xian Min, Shoukun Sun, Xiaogang Ma, Xiang Que Oct 2023

I-GUIDE Forum

Urban land cover mapping is essential for effective urban planning and resource management. Thanks to its ability to extract intricate features from urban datasets, deep learning has emerged as a powerful technique for urban classification. The U-net architecture has achieved state-of-the-art land cover classification performance, highlighting its potential for mapping urban trees at different spatial scales. However, deep learning approaches often require large, labeled datasets, which are challenging to acquire for specific urban contexts. Transfer learning addresses this limitation by leveraging pre-trained deep learning models on extensive datasets and adapting them to smaller urban datasets with limited labeled samples. Transfer …


Instagram Travel Influencers Coping With Covid-19 Travel Disruption, Andrei Kirilenko, Katarzyna Emin, Karen Tavares Jun 2023

ITSA 2022 Gran Canaria - 9th Biennial Conference: Corporate Entrepreneurship and Global Tourism Strategies After Covid 19

A significant portion of today’s marketing is done through social media influencers, that is, through bloggers with established online credibility in a certain area who are recognized and followed by a sizable online audience. In the travel and hospitality industry, influencer marketing is primarily done through Instagram due to its emphasis on visual images rather than text. Covid-19-related travel restrictions and shrinking social media advertising in the travel industry have heavily impacted travel influencers, reducing their income and forcing many out of business. We present the outcomes of a study of the top 150 online travel influencers. The analysis …


Automated Delineation Of Visual Area Boundaries And Eccentricities By A CNN Using Functional, Anatomical, And Diffusion-Weighted MRI Data, Noah C. Benson, Bogeng Song, Toshikazu Miyata, Hiromasa Takemura, Jonathan Winawer May 2023

MODVIS Workshop

Delineating visual field maps and iso-eccentricities from fMRI data is an important but time-consuming task for many neuroimaging studies on the human visual cortex because the traditional methods of doing so using retinotopic mapping experiments require substantial expertise as well as scanner, computer, and human time. Automated methods based on gray-matter anatomy or a combination of anatomy and functional mapping can reduce these requirements but are less accurate than experts. Convolutional Neural Networks (CNNs) are powerful tools for automated medical image segmentation. We hypothesize that CNNs can define visual area boundaries with high accuracy. We trained U-Net CNNs with ResNet18 …


Toward A Manifold Encoding Neural Responses, Luciano Dyballa, Andra M. Rudzite, Mahmood S. Hoseini, Mishek Thapa, Michael P. Stryker, Greg D. Field, Steven W. Zucker May 2023

MODVIS Workshop

Understanding circuit properties from physiological data presents two challenges: (i) recordings do not reveal connectivity, and (ii) stimuli only exercise circuits to a limited extent. We address these challenges for the mouse visual system with a novel neural manifold obtained using unsupervised algorithms. Each point in our manifold is a neuron; nearby neurons respond similarly in time to similar parts of a stimulus ensemble. This ensemble includes drifting gratings and flows, i.e., patterns resembling what a mouse would “see” running through fields.

Regarding (i), our manifold differs from the standard practice in computational neuroscience: embedding trials in neural coordinates. Topology …


Polarimetric Radar And VHF Lightning Observations In A Significantly Tornadic Supercell, Jacob Bruss Nov 2022

The Journal of Purdue Undergraduate Research

No abstract provided.


Supporting The Protect Initiative, Josh Lefton, Jackson Murray, Ahmed Thabet, Sriram Baireddy, Prakash Shukla, Mridul Gupta, Reagan Becker, Julie Ertle, Tony Doan, Aerin Yang Nov 2022

Purdue Journal of Service-Learning and International Engagement

Recently, medication dosage errors have received more political and media attention. Dosage errors are the most common medical errors, affecting about 1.5 million people annually.

Furthermore, U.S. poison-control centers reported more than 200,000 cases per year of medication errors. These cases result in medical costs of around $3.5 billion, and children under 6 years old constitute approximately 30% of these cases.

The PROTECT Initiative (Preventing Overdoses and Treatment Errors in Children Taskforce) was launched in 2008 as a collaborative effort between public health agencies and patient advocates to minimize dosage errors.

In alignment with the PROTECT Initiative effort, this project …


Towards A Burden-Free Implicit Authentication For Wearable Device Users, Bryan Lee, Sudip Vhaduri Jan 2022

Discovery Undergraduate Interdisciplinary Research Internship

The state of current knowledge-based wearable authentication systems requires users to physically interact with a device to initiate and validate their presence, thereby imposing a burden on the user. However, with the recent advancements of sensor technologies in consumer smart wearables (e.g., Fitbit and Apple watches), we were able to utilize vectors of statistical features extracted from the continuous stream of data from these IoT devices to implicitly validate a user's activities and its spatiotemporal context via the use of machine learning techniques. To improve the performance of our models, additional soft biometric data (i.e., respiratory sounds) was collected, and …


Physics-Informed Machine Learning To Predict Extreme Weather Events, Rthvik Raviprakash, Jonathan Buchanan, Mahdi Bu Ali Dec 2021

Discovery Undergraduate Interdisciplinary Research Internship

Extreme weather events are unexpected, severe, or unseasonal weather events that are dynamically related to specific large-scale atmospheric patterns. They have a significant impact on human society and natural ecosystems. For example, natural disasters due to extreme weather events caused more than $90 billion in global direct losses in 2015. These events are challenging to predict due to the chaotic nature of the atmosphere and are highly correlated with the occurrence of atmospheric blocking. A key aspect of preparedness and response to extreme climate events is accurate medium-range forecasting of atmospheric blocking events.

Unlike …


Automated Data Processing: Making Community Indicators Possible For Lafayette, Indiana, Jace T. Newell, Eli W. Coltin, Eric D. Flaningam Oct 2021

The Journal of Purdue Undergraduate Research

No abstract provided.


Data Is Personal: We Should Treat It As Such, Kaleb Dunn Sep 2020

Student Papers in Public Policy

The rise of the internet as a fact of daily life is the defining element of the modern age. Widespread use of the internet has fundamentally altered entire industries, and much of American life has migrated online. Dating is augmented by online dating; shopping by online shopping; television by internet streaming.

The digitization of American life has brought with it considerable benefits, including great convenience and innumerable efficiencies, but it has not come without a cost. Although there are many business models used by internet companies, many of the now-largest companies in the world have converged on one entity upon …


Implement Multi-Factor Authentication On All Federal Systems Now, Megan Walsh Sep 2020

Student Papers in Public Policy

The White House Office of Management and Budget recorded 31,107 information security incidents in fiscal year 2018. The most common attacks used to gain access to a user’s login credentials were e-mail/phishing, web-based attacks, and brute-force entry of username/password combinations. Given this high number of incidents, strong reliance on computers for everyday business, and common attacks that target passwords, information security should be a priority for information technology administrators working in federal agencies.


Removing Racially Biased Algorithms In Policing, Andie Lee Sep 2020

Student Papers in Public Policy

Local police departments use algorithm-based programs to do police work and predict crime. Technology has created the police tactic of predictive crime prevention. Police work, however, requires social skills, assessment of the environment, and most importantly human interaction. Automated policing lacks these characteristics. Moreover, the algorithms used to make crime predictions and risk assessments have disproportionately affected minorities.


The Case For Online Ranked-Choice Voting, Rayyan Khan Sep 2020

Student Papers in Public Policy

Maine was the first to embrace ranked-choice voting on a statewide level in 2018, using it for all state and general elections. Maine voters will be the first to use ranked-choice voting in a presidential election in 2020. This system differs from traditional voting in that voters rank candidates rather than choose just one. Supporters of ranked-choice voting tout it as a better model for accurately representing the values of the voting population; however, a study conducted in San Francisco details a potential shortfall referred to as “ballot fatigue” that the theoretically-ideal system may face as it struggles to deal …
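To make "voters rank candidates rather than choose just one" concrete, here is a minimal sketch of one common way ranked ballots are counted, the instant-runoff method. It is offered as a generic illustration only: the function name `instant_runoff`, the alphabetical tie-break, and the sample ballots are assumptions, and the sketch does not model any specific jurisdiction's rules or the online system the paper argues for.

```python
from collections import Counter


def instant_runoff(ballots):
    """Return the winner of a simple instant-runoff count, or None if no votes remain.

    Each ballot is a list of candidate names in order of preference.
    Assumption: ties for last place are broken alphabetically.
    """
    candidates = {name for ballot in ballots for name in ballot}
    while candidates:
        tally = Counter()
        for ballot in ballots:
            # Count each ballot for its highest-ranked candidate still in the race.
            for choice in ballot:
                if choice in candidates:
                    tally[choice] += 1
                    break
        total = sum(tally.values())
        if total == 0:
            return None
        leader, votes = tally.most_common(1)[0]
        if votes * 2 > total:          # strict majority of the ballots still counting
            return leader
        # No majority yet: eliminate the candidate with the fewest votes.
        lowest = min(candidates, key=lambda name: (tally[name], name))
        candidates.remove(lowest)
    return None


sample_ballots = [
    ["A", "B"],
    ["A", "C"],
    ["B", "C"],
    ["B", "C"],
    ["C", "B"],
]
winner = instant_runoff(sample_ballots)
```

With these sample ballots no candidate has a first-round majority, so the last-place candidate is eliminated and that ballot transfers to its next ranked choice.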