Articles 1 - 28 of 28
Full-Text Articles in Data Science
Random Variable Spaces: Mathematical Properties And An Extension To Programming Computable Functions, Mohammed Kurd-Misto
Computational and Data Sciences (PhD) Dissertations
This dissertation aims to extend the boundaries of Programming Computable Functions (PCF) by introducing a novel collection of categories referred to as Random Variable Spaces. Originating as a generalization of Quasi-Borel Spaces, Random Variable Spaces are rigorously defined as categories where objects are sets paired with a collection of random variables from an underlying measurable space. These spaces offer a theoretical foundation for extending PCF to natively handle stochastic elements.
The dissertation is structured into seven chapters that provide a multi-disciplinary background, from PCF and Measure Theory to Category Theory with special attention to Monads and the Giry Monad. The …
Verifying Empirical Predictive Modeling Of Societal Vulnerability To Hazardous Events: A Monte Carlo Experimental Approach, Yi Victor Wang, Seung Hee Kim, Menas C. Kafatos
Institute for ECHO Articles and Research
With the emergence of large amounts of historical records on adverse impacts of hazardous events, empirical predictive modeling has been revived as a foundational paradigm for quantifying disaster vulnerability of societal systems. This paradigm models societal vulnerability to hazardous events as a vulnerability curve indicating an expected loss rate of a societal system with respect to a possible spectrum of intensity measure (IM) of an event. Although the empirical predictive models (EPMs) of societal vulnerability are calibrated on historical data, they should not be experimentally tested with data derived from field experiments on any societal system. Alternatively, in this paper, …
Computational Analysis Of Antibody Binding Mechanisms To The Omicron RBD Of SARS-CoV-2 Spike Protein: Identification Of Epitopes And Hotspots For Developing Effective Therapeutic Strategies, Mohammed Alshahrani
Computational and Data Sciences (PhD) Dissertations
The advent of the Omicron strain of SARS-CoV-2 has elicited apprehension regarding its potential influence on the effectiveness of current vaccines and antibody treatments. The present investigation involved the implementation of mutational scanning analyses to examine the impact of Omicron mutations on the binding affinity of four categories of antibodies that target the Omicron receptor binding domain (RBD) of the Spike protein. The study demonstrates that the Omicron variant harbors 23 unique mutations across the RBD regions I, II, III, and IV. Of these mutations, seven are shared between RBD regions I and II, while three are shared among RBD …
Ideology Prediction From Scarce And Biased Supervision: Learn To Disregard The “What” And Focus On The “How”!, Chen Chen, Dylan Walker, Venkatesh Saligrama
Business Faculty Articles and Research
We propose a novel supervised learning approach for political ideology prediction (PIP) that is capable of predicting out-of-distribution inputs. This problem is motivated by the fact that manual data-labeling is expensive, while self-reported labels are often scarce and exhibit significant selection bias. We propose a novel statistical model that decomposes the document embeddings into a linear superposition of two vectors; a latent neutral context vector independent of ideology, and a latent position vector aligned with ideology. We train an end-to-end model that has intermediate contextual and positional vectors as outputs. At deployment time, our model predicts labels for input documents …
Dense & Attention Convolutional Neural Networks For Toe Walking Recognition, Junde Chen, Rahul Soangra, Marybeth Grant-Beuttler, Y. A. Nanehkaran, Yuxin Wen
Physical Therapy Faculty Articles and Research
Idiopathic toe walking (ITW) is a gait disorder where children’s initial contacts show limited or no heel touch during the gait cycle. Toe walking can lead to poor balance, increased risk of falling or tripping, leg pain, and stunted growth in children. Early detection and identification can facilitate targeted interventions for children diagnosed with ITW. This study proposes a new one-dimensional (1D) Dense & Attention convolutional network architecture, termed DANet, to detect idiopathic toe walking. The dense block is integrated into the network to maximize information transfer and avoid missed features. Further, the attention modules are …
Text And Data Mining Applications For Teaching Music Bibliography, Taylor Greene, Laurie Sampsel
Library Presentations, Posters, and Audiovisual Materials
Text and data mining (TDM) is a process of increasing interdisciplinary potential and one with many practical applications for music graduate students. TDM, however, remains a topic rarely introduced in the music bibliography course. Understandably, talk of artificial intelligence, algorithms, and programming languages is intimidating to music students, but thanks to software applications, knowledge of these computer science topics is not required to participate in research using TDM. This presentation explores ways to introduce digital humanities to music students through TDM.
In our presentation, we will discuss two approaches to incorporating TDM into the music bibliography course, focusing on two …
Automated Identification Of Astronauts On Board The International Space Station: A Case Study In Space Archaeology, Rao Hamza Ali, Amir Kanan Kashefi, Alice C. Gorman, Justin St. P. Walsh, Erik J. Linstead
Art Faculty Articles and Research
We develop and apply a deep learning-based computer vision pipeline to automatically identify crew members in archival photographic imagery taken on-board the International Space Station. Our approach is able to quickly tag thousands of images from public and private photo repositories without human supervision with high degrees of accuracy, including photographs where crew faces are partially obscured. Using the results of our pipeline, we carry out a large-scale network analysis of the crew, using the imagery data to provide novel insights into the social interactions among crew during their missions.
A Comparative Study On Deep Learning Models For Text Classification Of Unstructured Medical Notes With Various Levels Of Class Imbalance, Hongxia Lu, Louis Ehwerhemuepha, Cyril Rakovski
Mathematics, Physics, and Computer Science Faculty Articles and Research
Background
Discharge medical notes written by physicians contain important information about the health condition of patients. Many deep learning algorithms have been successfully applied to extract important information from unstructured medical notes data that can entail subsequent actionable results in the medical domain. This study aims to explore the model performance of various deep learning algorithms in text classification tasks on medical notes with respect to different disease class imbalance scenarios.
Methods
In this study, we employed seven artificial intelligence models, a CNN (Convolutional Neural Network), a Transformer encoder, a pretrained BERT (Bidirectional Encoder Representations from Transformers), and four typical …
Assessing The Reidentification Risks Posed By Deep Learning Algorithms Applied To ECG Data, Arin Ghazarian, Jianwei Zheng, Daniele Struppa, Cyril Rakovski
Mathematics, Physics, and Computer Science Faculty Articles and Research
ECG (Electrocardiogram) data analysis is one of the most widely used and important tools in cardiology diagnostics. In recent years the development of advanced deep learning techniques and GPU hardware has made it possible to train neural network models that attain exceptionally high levels of accuracy in complex tasks such as heart disease diagnoses and treatments. We investigate the use of ECGs as biometrics in human identification systems by implementing state-of-the-art deep learning models. We train convolutional neural network models on approximately 81k patients from the US, Germany and China. Currently, this is the largest research project on ECG identification. …
A Large-Scale Sentiment Analysis Of Tweets Pertaining To The 2020 US Presidential Election, Rao Hamza Ali, Gabriela Pinto, Evelyn Lawrie, Erik J. Linstead
Engineering Faculty Articles and Research
We capture the public sentiment towards candidates in the 2020 US Presidential Elections, by analyzing 7.6 million tweets sent out between October 31st and November 9th, 2020. We apply a novel approach to first identify tweets and user accounts in our database that were later deleted or suspended from Twitter. This approach allows us to observe the sentiment held for each presidential candidate across various groups of users and tweets: accessible tweets and accounts, deleted tweets and accounts, and suspended or inaccessible tweets and accounts. We compare the sentiment scores calculated for these groups and provide key insights into the …
A Novel Correction For The Adjusted Box-Pierce Test, Sidy Danioko, Jianwei Zheng, Kyle Anderson, Alexander Barrett, Cyril S. Rakovski
Mathematics, Physics, and Computer Science Faculty Articles and Research
The classical Box-Pierce and Ljung-Box tests for auto-correlation of residuals possess severe deviations from nominal type I error rates. Previous studies have attempted to address this issue by either revising existing tests or designing new techniques. The Adjusted Box-Pierce achieves the best results with respect to attaining type I error rates closer to nominal values. This research paper proposes a further correction to the adjusted Box-Pierce test that possesses near perfect type I error rates. The approach is based on an inflation of the rejection region for all sample sizes and lags calculated via a linear model applied to simulated …
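For readers unfamiliar with the two classical statistics named in this abstract, the following is a minimal Python sketch of the textbook Box-Pierce statistic, Q_BP = n Σ r_k², and Ljung-Box statistic, Q_LB = n(n+2) Σ r_k²/(n−k), summed over lags k = 1..m. It does not implement the Adjusted Box-Pierce test or the correction proposed in the paper; the function names are illustrative and not taken from the authors' code.

```python
def autocorrelations(residuals, max_lag):
    """Sample autocorrelations r_1..r_max_lag of a residual series."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [x - mean for x in residuals]
    denom = sum(c * c for c in centered)  # lag-0 sum of squares
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def box_pierce(residuals, max_lag):
    """Classical Box-Pierce statistic: n * sum of squared autocorrelations."""
    n = len(residuals)
    r = autocorrelations(residuals, max_lag)
    return n * sum(rk * rk for rk in r)


def ljung_box(residuals, max_lag):
    """Classical Ljung-Box statistic: each squared r_k is divided by (n - k)."""
    n = len(residuals)
    r = autocorrelations(residuals, max_lag)
    return n * (n + 2) * sum(rk * rk / (n - k) for k, rk in enumerate(r, start=1))


if __name__ == "__main__":
    # Hypothetical, strongly alternating residuals (not data from the paper).
    example = [1, -1, 1, -1, 1, -1, 1, -1]
    print(box_pierce(example, 1), ljung_box(example, 1))
```

Under the null hypothesis of no residual autocorrelation, both statistics are conventionally compared against a chi-square distribution whose degrees of freedom equal the number of lags minus the number of fitted ARMA parameters.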
Computational Approaches To Facilitate Automated Interchange Between Music And Art, Rao Hamza Ali
Computational and Data Sciences (PhD) Dissertations
Recently, there has been a tremendous increase in generating and synthesizing music and art using various computational techniques. An area that is still under-researched, however, is how one medium can be converted into the other while maintaining the overall aesthetics. Over the last few centuries, artists, composers, and scholars have attempted to substitute one form of art for the other: by proposing techniques where music notes are synonymous with colors, by inventing instruments that combine the aesthetics of music and visual art, and by incorporating the two media in live performances. A widely accepted computational approach, for the conversion, …
Causalmodels: An R Library For Estimating Causal Effects, Joshua Wolff Anderson
Computational and Data Sciences (MS) Theses
Free and open source software for statistical modeling and machine learning has advanced productivity in data science significantly. Packages such as SciPy in Python and caret in R provide fundamental tools for statistical modeling and machine learning in the two most popular programming languages used by data scientists. Unfortunately, similarly robust tools for causal inference are limited. The tools that exist in R lack consistent and standardized methodologies and inputs. R lacks a comprehensive package that offers traditional causal inference methods such as standardization, IP weighting, G-estimation, outcome regression, and propensity matching in one common package. …
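As an illustration of one method listed in this abstract, the sketch below estimates an average causal effect by inverse probability (IP) weighting with normalized weights. It is written in Python rather than R and is not the interface of the causalmodels package; the function name, the single discrete confounder, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def ip_weighted_effect(rows):
    """Estimate E[Y | treated] - E[Y | untreated] after IP weighting.

    rows: iterable of (confounder_level, treatment, outcome) with treatment
    in {0, 1}. The propensity is estimated nonparametrically as the share of
    each treatment within a confounder level, which assumes one discrete
    confounder and that both arms appear somewhere in the data.
    """
    rows = list(rows)
    counts = defaultdict(lambda: [0, 0])  # level -> [n untreated, n treated]
    for level, treatment, _ in rows:
        counts[level][treatment] += 1

    weighted_sum = [0.0, 0.0]
    weight_total = [0.0, 0.0]
    for level, treatment, outcome in rows:
        n_level = sum(counts[level])
        # Probability of receiving the treatment actually received, given level.
        prob_received = counts[level][treatment] / n_level
        weight = 1.0 / prob_received
        weighted_sum[treatment] += weight * outcome
        weight_total[treatment] += weight

    # Normalized weighted means for each treatment arm.
    return weighted_sum[1] / weight_total[1] - weighted_sum[0] / weight_total[0]


if __name__ == "__main__":
    # Hypothetical data: the effect is 2 in both strata, but treatment is
    # far more common when the confounder equals 1.
    data = [(0, 0, 1), (0, 0, 1), (0, 0, 1), (0, 1, 3),
            (1, 0, 5), (1, 1, 7), (1, 1, 7), (1, 1, 7)]
    print(ip_weighted_effect(data))
```

In this toy data the unweighted difference in group means is 4, because treated units are concentrated in the stratum with higher outcomes. Weighting each unit by the inverse of its estimated probability of receiving its own treatment recovers the within-stratum effect of 2.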
An Information-Theoretic Analysis Of Adherence To Physical Exercise Routines, Lily Foster
Computational and Data Sciences (MS) Theses
One of the most common recommendations in healthcare is to simply form healthy habits, but little research has been done to understand the formation and continuation of a healthy habit that isn’t heavily influenced by an individual’s interpretation. Arizona State University’s WalkIT study aimed to analyze how goal setting and financial reinforcement can influence moderate-to-vigorous physical activity (MVPA) in adults, while using data from accelerometers to alleviate individual bias. In this trial, 512 insufficiently active adults were recruited to wear an accelerometer for 1 year and were then randomly assigned to one of the four study groups. Each group had …
Pre-Earthquake Ionospheric Perturbation Identification Using CSES Data Via Transfer Learning, Pan Xiong, Cheng Long, Huiyu Zhou, Roberto Battiston, Angelo De Santis, Dimitar Ouzounov, Xuemin Zhang, Xuhui Shen
Mathematics, Physics, and Computer Science Faculty Articles and Research
During the lithospheric buildup to an earthquake, complex physical changes occur within the earthquake hypocenter. Data pertaining to the changes in the ionosphere may be obtained by satellites, and the analysis of data anomalies can help identify earthquake precursors. In this paper, we present a deep-learning model, SeqNetQuake, that uses data from the first China Seismo-Electromagnetic Satellite (CSES) to identify ionospheric perturbations prior to earthquakes. SeqNetQuake achieves the best performance [F-measure (F1) = 0.6792 and Matthews correlation coefficient (MCC) = 0.427] when directly trained on the CSES dataset with a spatial window centered on the earthquake epicenter with the Dobrovolsky …
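The two metrics quoted in this abstract, the F-measure (F1) and the Matthews correlation coefficient (MCC), can both be computed from a binary confusion matrix. The Python sketch below uses the standard definitions; the counts in the usage example are made up for illustration and are not the SeqNetQuake results.

```python
import math


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical confusion-matrix counts.
    print(f1_score(tp=6, fp=1, fn=2), mcc(tp=6, tn=3, fp=1, fn=2))
```

Unlike F1, MCC also uses the true-negative count, which is why the two metrics are often reported together when classes are imbalanced.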
Exploring Behaviors Of Software Developers And Their Code Through Computational And Statistical Methods, Elia Eiroa Lledo
Computational and Data Sciences (PhD) Dissertations
As Artificial Intelligence (AI) increasingly penetrates all aspects of society, many obstacles emerge. This thesis identifies and discusses the issues facing Computer Vision and significant deficiencies in the Software Development Life-cycle that need to be resolved to facilitate the evolution toward true artificial intelligence. We explicitly review the concepts behind Convolutional Neural Network (CNN) models, the benchmark for computer vision. Chapter 2 highlights the mechanisms that have popularized CNNs while also specifying significant gaps that could render the model inadequate for future use in safety-critical systems. We put forward two main limitations. Namely, CNNs do not use lack of information …
Multi-Modal Data Fusion, Image Segmentation, And Object Identification Using Unsupervised Machine Learning: Conception, Validation, Applications, And A Basis For Multi-Modal Object Detection And Tracking, Nicholas Lahaye
Computational and Data Sciences (PhD) Dissertations
Remote sensing and instrumentation are constantly improving and increasing in capability. This includes a growing number of instrument types, with various combinations of spatial and spectral resolutions, pointing angles, and other instrument-specific qualities. While the increase in instruments, and therefore datasets, is a boon for those aiming to study the complexities of the various Earth systems, it can also present a large number of new challenges. With this in mind, our group has set out to combine datasets with different spatial and spectral resolutions in an effective and as-general-as-possible way, with as little …
Enhancing Microbiome Host Disease Prediction With Variational Autoencoders, Celeste Manughian-Peter
Computational and Data Sciences (MS) Theses
Advancements in genetic sequencing methods for microbiomes in recent decades have permitted the collection of taxonomic and functional profiles of microbial communities, accelerating the discovery of the functional aspects of the microbiome and generating an increased interest among clinicians in applying these techniques with patients. This advancement has coincided with software and hardware improvements in the field of machine learning and deep learning. Combined, these advancements imply further potential for progress in disease diagnosis and treatment in humans. The ability to classify a human microbiome profile into a disease category, and additionally identify the differentiating factors within the profile between …
Assessing The Re-Identification Risk In ECG Datasets And An Application Of Privacy Preserving Techniques In ECG Analysis, Arin Ghazarian
Computational and Data Sciences (PhD) Dissertations
In this work, we first investigate the use of the ECG signal as a biometric in human identification systems using deep learning models. We train convolutional neural network models on ECG samples from approximately 81k patients. Our models achieved an overall accuracy of 95.69%. Further, we assess the accuracy of our ECG identification model for distinct groups of patients with particular heart conditions and combinations of such conditions. For example, we observed that the identification accuracy was the highest (99.7%) for patients with both ST changes and supraventricular tachycardia. On the other hand, we also found that the identification rate was …
Novel Applications Of Statistical And Machine Learning Methods To Analyze Trial-Level Data From Cognitive Measures, Chelsea Parlett
Computational and Data Sciences (PhD) Dissertations
Many cognitive tasks and measures can benefit from trial-level analyses including Item Response Theory models as well as other Bayesian and Machine Learning models. Specifically, this dissertation focuses mainly on task-based measures of metamemory and how within-set variability as well as item-level characteristics can improve the inferences researchers make about these measures. First, a clustering analysis of judgements of learning across a task is examined in order to detect different participant strategies on a metamemory task and whether strategy use differs by age. Second, the benefits of using item response theory models to analyze both individual and item-level differences in metamemory …
Optimal Analytical Methods For High Accuracy Cardiac Disease Classification And Treatment Based On ECG Data, Jianwei Zheng
Computational and Data Sciences (PhD) Dissertations
This work comprises six projects. In the first project, a new research database for 12-lead electrocardiogram signals was created under the auspices of Chapman University and Shaoxing People's Hospital (Shaoxing Hospital Zhejiang University School of Medicine). This database aims to enable the scientific community to conduct new studies on arrhythmia and other cardiovascular conditions. In the second project, we created a new 12-lead ECG database under the auspices of Chapman University and Ningbo First Hospital of Zhejiang University that aims to provide high quality data enabling detection of the distinctions between idiopathic ventricular arrhythmia from right ventricular outflow tract …
The Agnostic Structure Of Data Science Methods, Domenico Napoletani, Marco Panza, Daniele Struppa
MPP Published Research
In this paper we argue that data science is a coherent and novel approach to empirical problems that, in its most general form, does not build understanding about phenomena. Within the new type of mathematization at work in data science, mathematical methods are not selected because of any relevance for a problem at hand; mathematical methods are applied to a specific problem only by ‘forcing’, i.e. on the basis of their ability to reorganize the data for further analysis and the intrinsic richness of their mathematical structure. In particular, we argue that deep learning neural networks are best understood within …
Applications Of Machine Learning To Facilitate Software Engineering And Scientific Computing, Natalie Best
Computational and Data Sciences (PhD) Dissertations
The use of machine learning has risen in recent years, though many areas remain unexplored due to lack of data or lack of computational tools. This dissertation explores machine learning approaches in case studies involving image classification and natural language processing. In addition, a software library in the form of a two-way bridge connecting deep learning models in Keras with ones available in the Fortran programming language is also presented.
In Chapter 2, we explore the applicability of transfer learning utilizing models pre-trained on non-software engineering data applied to the problem of classifying software unified modeling language diagrams where data is …
Forecasting The Prices Of Cryptocurrencies Using A Novel Parameter Optimization Of VARIMA Models, Alexander Barrett
Computational and Data Sciences (PhD) Dissertations
This work is a comparative study of different univariate and multivariate time series predictive models as applied to Bitcoin, other cryptocurrencies, and other related financial time series data. ARIMA models, long regarded as the gold standard of univariate financial time series prediction due to both their flexibility and simplicity, are used as a baseline for prediction. Given the highly correlated nature of different cryptocurrencies, this work aims to show the benefit of forecasting with multivariate time series models, primarily focusing on a novel parameter optimization of VARIMA models outlined in this paper.
These models are trained on 3 years of historical data, …
Spatial Frequency Implications For Global And Local Processing In Autistic Children, Riya Mody, Ayra Tusneem, Louanne Boyd, Vincent Berardi
Student Scholar Symposium Abstracts and Posters
Visual processing in humans is done by integrating and updating multiple streams of global and local sensory input. Interaction between these two systems can be disrupted in individuals with ASD and other learning disabilities. When this integration is not done smoothly, it becomes difficult to see the “big picture”, which has been found to have implications on emotion recognition, social skills, and conversation skills. An example of this phenomenon is local interference, which is when local details are prioritized over the global features. Previous research in this field has aimed to decrease local interference by developing and evaluating a filter …
A Novel Correction For The Adjusted Box-Pierce Test — New Risk Factors For Emergency Department Return Visits Within 72 Hours For Children With Respiratory Conditions — General Pediatric Model For Understanding And Predicting Prolonged Length Of Stay, Sidy Danioko
Computational and Data Sciences (PhD) Dissertations
This thesis represents the results of three research projects that underline the breadth and depth of my interests.
Firstly, I devoted some effort to the well-known Box-Pierce goodness-of-fit tests for time series models, which have been an important research topic over the last few decades. All previously proposed tests focus on changes to the test statistics. Instead, I adopted a different approach that takes the best performing test and modifies its rejection region. Thus, I developed a semiparametric correction of the Adjusted Box-Pierce test that attains the best type I error rates for all sample sizes and lags and outperforms …
Gaining Computational Insight Into Psychological Data: Applications Of Machine Learning With Eating Disorders And Autism Spectrum Disorder, Natalia Rosenfield
Computational and Data Sciences (PhD) Dissertations
Over the past 100 years, assessment tools have been developed that allow us to explore mental and behavioral processes that could not be measured before. However, conventional statistical models used for psychological data are lacking in thoroughness and predictability. This provides a perfect opportunity to use machine learning to study the data in a novel way. In this paper, we present examples of using machine learning techniques with data in three areas: eating disorders, body satisfaction, and Autism Spectrum Disorder (ASD). We explore clustering algorithms as well as virtual reality (VR).
Our first study employs the k-means clustering algorithm to …
Agnostic Science. Towards A Philosophy Of Data Analysis, Domenico Napoletani, Marco Panza, Daniele C. Struppa
MPP Published Research
In this paper we will offer a few examples to illustrate the orientation of contemporary research in data analysis and we will investigate the corresponding role of mathematics. We argue that the modus operandi of data analysis is implicitly based on the belief that if we have collected enough and sufficiently diverse data, we will be able to answer most relevant questions concerning the phenomenon itself. This is a methodological paradigm strongly related to, but not limited to, biology, and we label it the microarray paradigm. In this new framework, mathematics provides powerful techniques and general ideas which generate new …