Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
- Institution
- Southern Methodist University (4)
- West Virginia University (3)
- City University of New York (CUNY) (2)
- Louisiana State University (2)
- Technological University Dublin (2)
- Western University (2)
- California Polytechnic State University, San Luis Obispo (1)
- California State University, San Bernardino (1)
- Central Washington University (1)
- Claremont Colleges (1)
- Harrisburg University of Science and Technology (1)
- Kennesaw State University (1)
- The University of Southern Mississippi (1)
- University of Nebraska - Lincoln (1)
- Publication
- SMU Data Science Review (4)
- Electronic Thesis and Dissertation Repository (2)
- Graduate Theses, Dissertations, and Problem Reports (2)
- LSU Doctoral Dissertations (2)
- All Master's Theses (1)
- Articles (1)
- CMC Senior Theses (1)
- Conference papers (1)
- Dissertations (1)
- Dissertations, Theses, and Capstone Projects (1)
- Electronic Theses, Projects, and Dissertations (1)
- Faculty & Staff Scholarship (1)
- Master of Science in Computer Science Theses (1)
- Master's Theses (1)
- Other Student Works (1)
- Publications and Research (1)
- School of Computing: Faculty Publications (1)
Articles 1 - 23 of 23
Full-Text Articles in Physical Sciences and Mathematics
Distributed Load Testing By Modeling And Simulating User Behavior, Chester Ira Parrott
LSU Doctoral Dissertations
Modern human-machine systems such as microservices rely upon agile engineering practices which require changes to be tested and released more frequently than classically engineered systems. A critical step in the testing of such systems is the generation of realistic workloads or load testing. Generated workload emulates the expected behaviors of users and machines within a system under test in order to find potentially unknown failure states. Typical testing tools rely on static testing artifacts to generate realistic workload conditions. Such artifacts can be cumbersome and costly to maintain; however, even model-based alternatives can prevent adaptation to changes in a system …
Data: The Good, The Bad And The Ethical, John D. Kelleher, Filipe Cabral Pinto, Luis M. Cortesao
Articles
It is often the case with new technologies that it is very hard to predict their long-term impacts and as a result, although new technology may be beneficial in the short term, it can still cause problems in the longer term. This is what happened with oil by-products in different areas: the use of plastic as a disposable material did not take into account the hundreds of years necessary for its decomposition and its related long-term environmental damage. Data is said to be the new oil. The message to be conveyed is associated with its intrinsic value. But as in …
Towards High Performance Stock Market Prediction Methods, Warren M. Landis, Sangwhan Cha
Other Student Works
Stock markets rely today, and will continue to rely in the future, on the metrics of timeliness and efficiency to reach optimal profits. One way stock investors have continued to strive for the best of these two factors is through the use of predictive machine learning systems to aid in their decision making. However, among the many systems currently in use, it could be said that the myriad of data that they are based on may not be sufficient. In an effort to devise an ensemble learning predictive system that will utilize an array of big …
Using Spatial Analysis And Machine Learning Techniques To Develop A Comprehensive Highway-Rail Grade Crossing Consolidation Model, Samira Soleimani
LSU Doctoral Dissertations
The safety of highway-railroad grade crossings (HRGC) is still an issue in the United States of America (USA). The grade crossing is where a railroad crosses a road at the same level without any over or underpass. To improve the safety of crossings, the crossings’ condition should be explored from several aspects such as engineering design (speed limit, warning signs, etc.), road condition (number of lanes, surface markings, etc.), rail design (the type of track, ballast, etc.), temporal variables (weather, visibility, time of day, lightning, etc.), social variables (population, race, etc.), and last but not least, spatial variables (the type …
Cover Song Identification - A Novel Stem-Based Approach To Improve Song-To-Song Similarity Measurements, Lavonnia Newman, Dhyan Shah, Chandler Vaughn, Faizan Javed
SMU Data Science Review
Music is incorporated into our daily lives, whether intentionally or unintentionally. It evokes responses and behavior so much so that there is an entire study dedicated to the psychology of music. Music creates the mood for dancing, exercising, creative thought or even relaxation. It is a powerful tool that can be used in various venues and through advertisements to influence and guide human reactions. Music is also often "borrowed" in the industry today. The practices of sampling and remixing music in the digital age have made cover song identification an active area of research. While most of this research is focused …
Reducing Age Bias In Machine Learning: An Algorithmic Approach, Adriana Solange Garcia De Alford, Steven K. Hayden, Nicole Wittlin, Amy Atwood
SMU Data Science Review
In this paper, we study the prevalence of bias in machine learning; we explore the life cycle phases where bias is potentially introduced into a machine learning model; and lastly, we present how adversarial learning can be leveraged to measure unwanted bias and unfair behavior from a machine learning algorithm. This study focuses particularly on the topics of age bias in predicting employee attrition and presents a practical approach for how adversarial learning can be successful in mitigating age bias. To measure bias, we calculate group fairness metrics across five-year age groups and evaluate fairness between a baseline predictive model …
Forecasting Spare Parts Sporadic Demand Using Traditional Methods And Machine Learning - A Comparative Study, Bhuvana Adur Kannan, Ganesh Kodi, Oscar Padilla, Dough Gray, Barry C. Smith
SMU Data Science Review
Sporadic demand presents a particular challenge to traditional time forecasting methods. In the past 50 years, there have been developments, such as the Croston Model [3], which have improved forecast performance. With the rise of Machine Learning (ML), there is abundant research in the field of applying ML algorithms to predict sporadic demand [8][12][9]. However, most existing research has analyzed this problem from the demand side [17]. In this paper, we tackle this predictive analytics challenge from the supply side. We perform a comparative analysis utilizing a spare parts demand dataset from an Original Equipment Manufacturer (OEM). Since traditional measurements …
Machine Learning Applications For Drug Repurposing, Hansaim Lim
Dissertations, Theses, and Capstone Projects
The cost of bringing a drug to market is astounding and the failure rate is intimidating. Drug discovery has been of limited success under the conventional reductionist model of one-drug-one-gene-one-disease paradigm, where a single disease-associated gene is identified and a molecular binder to the specific target is subsequently designed. Under the simplistic paradigm of drug discovery, a drug molecule is assumed to interact only with the intended on-target. However, small molecular drugs often interact with multiple targets, and those off-target interactions are not considered under the conventional paradigm. As a result, drug-induced side effects and adverse reactions are often neglected …
Compressed DNA Representation For Efficient AMR Classification, John Partee, Robert Hazell, Anjli Solsi, John Santerre
SMU Data Science Review
In this paper, we explore a representation methodology for the compression of DNA isolates. Using lossless string compression via tokenization of frequently repeated segments of DNA, we reduce the length of the isolates to be counted as k-mers for classification. With this new representation, we apply a previously established feature sampling method to dramatically reduce the feature space. In understanding the genetic diversity, we also look at conserving biological function across these spaces. Using a random forest model we were able to predict the resistance or susceptibility of bacteria with 85-90% accuracy, with a 30-50% reduction in overall isolate length, …
A Study Of Information Bots And Knowledge Bots, Amartya Hatua
Dissertations
This dissertation studies different aspects of information bots and knowledge bots. The research contributes to a better understanding of the various characteristics of information bots as well as the different patterns and factors responsible for the information diffusion in a social network. This research also shows how these factors can be used to predict information diffusion for a particular topic in a social network. The second part of the research is focused on strategies for improving the knowledge base of knowledge bots, where two different approaches are studied. In the first approach, knowledge is transferred …
Optimized Machine Learning Models Towards Intelligent Systems, Mohammadnoor Ahmad Mohammad Injadat
Electronic Thesis and Dissertation Repository
The rapid growth of the Internet and related technologies has led to the collection of large amounts of data by individuals, organizations, and society in general [1]. However, this often leads to information overload which occurs when the amount of input (e.g. data) a human is trying to process exceeds their cognitive capacities [2]. Machine learning (ML) has been proposed as one potential methodology capable of extracting useful information from large sets of data [1]. This thesis focuses on two applications. The first is education, namely e-Learning environments. Within this field, this thesis proposes different optimized ML ensemble models to …
Data Mining And Image Classification Using Genetic Programming, Mahsa Shokri Varniab
Master of Science in Computer Science Theses
Genetic programming (GP), a capable machine learning and search method motivated by Darwinian evolution, is an evolutionary learning algorithm which automatically evolves computer programs in the form of trees to solve problems. This thesis studies the application of GP for data mining and image processing. Knowledge discovery and data mining have been widely used in business, healthcare, and scientific fields. In data mining, classification is supervised learning that identifies new patterns and maps the data to predefined targets. A GP-based classifier is developed in order to perform these mappings. GP has been investigated in a series of studies to classify …
Variability In The Effectiveness Of Psychological Interventions Based On Machine Learning In STEM Education, Mohammad Hasan, Bilal Khan
School of Computing: Faculty Publications
This manuscript presents a framework to investigate the variability in the effectiveness of psychological interventions supported by Machine Learning (ML) based early-warning systems (EWS) in science, technology, engineering, and mathematics education. It emphasizes the importance of investigating the resulting variability and suggests that effective EWS cannot be designed without a deeper understanding of the variability. The framework uses an ML-based model to predict students’ academic performance early in the semester for a Sophomore-level Computer Science course at a public university in the United States. The students were given psychological interventions by sending their end-of-term performance forecast thrice during the semester. …
Visual Analytics Of Electronic Health Records With A Focus On Acute Kidney Injury, Sheikh S. Abdullah
Electronic Thesis and Dissertation Repository
The increasing use of electronic platforms in healthcare has resulted in the generation of unprecedented amounts of data in recent years. The amount of data available to clinical researchers, physicians, and healthcare administrators continues to grow, which creates an untapped resource with the ability to improve the healthcare system drastically. Despite the enthusiasm for adopting electronic health records (EHRs), some recent studies have shown that EHR-based systems hardly improve the ability of healthcare providers to make better decisions. One reason for this inefficacy is that these systems do not allow for human-data interaction in a manner that fits and supports …
Analysis On Suicidal Ideation Among Adolescents (12-17 Years) In The USA, Himani Raturi
Electronic Theses, Projects, and Dissertations
Suicide is one of the leading health concerns among adolescents in the United States, and the presence of suicidal ideation (SI) is quite high, with ~20-30% of adolescents reporting it at some point. Though we have seen growth and development in the prevention of suicide, there is limited research on the ability to identify the adolescents who might be at risk for SI. The objective behind the project is to identify adolescents with SI using machine learning.
The project shows statistics from different articles on adolescents in the U.S. For this study, adolescent data was taken from NSDUH 2018. Moreover, detailed …
Combining Machine Learning And Empirical Engineering Methods Towards Improving Oil Production Forecasting, Andrew J. Allen
Master's Theses
Current methods of production forecasting such as decline curve analysis (DCA) or numerical simulation require years of historical production data, and their accuracy is limited by the choice of model parameters. Traditional methods of production forecasting have proven challenging to apply to unconventional resources because these resources lack long production histories and have extremely variable model parameters. This research proposes a data-driven alternative to reservoir simulation and production forecasting techniques. We create a proxy-well model for predicting cumulative oil production by selecting statistically significant well completion parameters and reservoir information as independent predictor variables in regression-based models. Then, principal component analysis (PCA) …
Sensor Data Analysis In Smart Buildings, Manuel A. Mane Penton
Publications and Research
Data analysis and Machine Learning are destined to evolve the current technology infrastructure by solving technology and economy demands present mainly in developed cities like New York. This research proposes a machine learning (ML) based solution to alleviate one of the main issues that big buildings such as CUNY campuses have, that is the waste of energy resources. The analysis of data coming from the readings of different deployed sensors such as CO2, humidity and temperature can be used to estimate occupancy in a specific room and building in general. The outcome of this research established a relationship between the …
Subsurface Analytics: Contribution Of Artificial Intelligence And Machine Learning To Reservoir Engineering, Reservoir Modeling, And Reservoir Management, Shahab D. Mohaghegh
Faculty & Staff Scholarship
Subsurface Analytics is a new technology that changes the way reservoir simulation and modeling is performed. Instead of starting with the construction of mathematical equations to model the physics of the fluid flow through porous media and then modifying the geological models in order to achieve history match, Subsurface Analytics, a completely AI-based reservoir simulation and modeling technology, takes a completely different approach. In AI-based reservoir modeling, field measurements form the foundation of the reservoir model. Using data-driven pattern recognition technologies, the physics of the fluid flow through porous media is modeled through discovering the best, most …
Three Essays On Health Economics And Policy Evaluation, Shishir Shakya
Graduate Theses, Dissertations, and Problem Reports
This dissertation consists of three essays on U.S. health care policy. Each paragraph below refers to the three abstracts for the three chapters in this dissertation, respectively. I provide quantitative evidence on how much Prescription Drug Monitoring Programs (PDMPs) affect retail opioid prescribing behaviors. Using the American Community Survey (ACS), I retrieve a county-level high-dimensional panel data set from 2010 to 2017. I employ three separate identification strategies: difference-in-difference, double selection post-LASSO, and spatial difference-in-difference. I compare how the retail opioid prescribing behaviors of counties, that are mandatory for prescribers to check the PDMP before prescribing controlled substances …
Image Features For Tuberculosis Classification In Digital Chest Radiographs, Brian Hooper
All Master's Theses
Tuberculosis (TB) is a respiratory disease which affects millions of people each year, accounting for the tenth leading cause of death worldwide, and is especially prevalent in underdeveloped regions where access to adequate medical care may be limited. Analysis of digital chest radiographs (CXRs) is a common and inexpensive method for the diagnosis of TB; however, a trained radiologist is required to interpret the results, and is subject to human error. Computer-Aided Detection (CAD) systems are a promising machine-learning based solution to automate the diagnosis of TB from CXR images. As the dimensionality of a high-resolution CXR image is very …
How Machine Learning And Probability Concepts Can Improve NBA Player Evaluation, Harrison Miller
CMC Senior Theses
In this paper I will be breaking down a scholarly article, written by Sameer K. Deshpande and Shane T. Jensen, that proposed a new method to evaluate NBA players. The NBA, which stands for the National Basketball Association, is the highest-level professional basketball league in America. The authors proposed to build a model that would show how NBA players impact their teams' chances of winning a game, using machine learning and probability concepts. I preface that by diving into these concepts and their mathematical backgrounds. These concepts include building a linear model using the ordinary least squares method, the bias …
Multimodal Fusion Strategies For Outcome Prediction In Stroke, Esra Zihni, John D. Kelleher, Vince I. Madai, Ahmed Khalil, Ivana Galinovic, Jochen Fiebach, Michelle Livne, Dietmar Frey
Conference papers
Data-driven methods are increasingly being adopted in the medical domain for clinical predictive modeling. Prediction of stroke outcome using machine learning could provide a decision support system for physicians to assist them in patient-oriented diagnosis and treatment. While patient-specific clinical parameters play an important role in outcome prediction, a multimodal fusion approach that integrates neuroimaging with clinical data has the potential to improve accuracy. This paper addresses two research questions: (a) does multimodal fusion aid in the prediction of stroke outcome, and (b) what fusion strategy is more suitable for the task at hand. The baselines for our experimental …
A Machine Learning Approach To Estimate The Annihilation Photon Interactions Inside The Scintillator Of A PET Scanner, Sai Akhil Bharthavarapu
Graduate Theses, Dissertations, and Problem Reports
Biochemical processes are chemical processes that occur in living organisms. They can be studied with nuclear medicine through the help of radioactive tracers. Based on the radioisotope used, the photons that are emitted from the body tissue are either detected by single-photon emission computed tomography (SPECT) or by positron emission tomography (PET) scanners. SPECT uses gamma rays as tracer but gives a weaker contrast and spatial resolution compared to a PET scanner which uses positrons as tracer. PET scans show the metabolic changes occurring at the cellular level in an organ or a tissue. This detection is important because diseases …