Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 4 of 4

Full-Text Articles in Statistical Models

Deriving The Distributions And Developing Methods Of Inference For R2-Type Measures, With Applications To Big Data Analysis, Gregory S. Hawk Jan 2022

Deriving The Distributions And Developing Methods Of Inference For R2-Type Measures, With Applications To Big Data Analysis, Gregory S. Hawk

Theses and Dissertations--Statistics

As computing capabilities and cloud-enhanced data sharing has accelerated exponentially in the 21st century, our access to Big Data has revolutionized the way we see data around the world, from healthcare to investments to manufacturing to retail and supply-chain. In many areas of research, however, the cost of obtaining each data point makes more than just a few observations impossible. While machine learning and artificial intelligence (AI) are improving our ability to make predictions from datasets, we need better statistical methods to improve our ability to understand and translate models into meaningful and actionable insights.

A central goal in the …


Visualization And Machine Learning Techniques For Nasa’S Em-1 Big Data Problem, Antonio P. Garza Iii, Jose Quinonez, Misael Santana, Nibhrat Lohia May 2019

Visualization And Machine Learning Techniques For Nasa’S Em-1 Big Data Problem, Antonio P. Garza Iii, Jose Quinonez, Misael Santana, Nibhrat Lohia

SMU Data Science Review

In this paper, we help NASA solve three Exploration Mission-1 (EM-1) challenges: data storage, computation time, and visualization of complex data. NASA is studying one year of trajectory data to determine available launch opportunities (about 90TBs of data). We improve data storage by introducing a cloud-based solution that provides elasticity and server upgrades. This migration will save $120k in infrastructure costs every four years, and potentially avoid schedule slips. Additionally, it increases computational efficiency by 125%. We further enhance computation via machine learning techniques that use the classic orbital elements to predict valid trajectories. Our machine learning model decreases trajectory …


An Exploratory Statistical Method For Finding Interactions In A Large Dataset With An Application Toward Periodontal Diseases, Joshua Lambert Jan 2017

An Exploratory Statistical Method For Finding Interactions In A Large Dataset With An Application Toward Periodontal Diseases, Joshua Lambert

Theses and Dissertations--Epidemiology and Biostatistics

It is estimated that Periodontal Diseases effects up to 90% of the adult population. Given the complexity of the host environment, many factors contribute to expression of the disease. Age, Gender, Socioeconomic Status, Smoking Status, and Race/Ethnicity are all known risk factors, as well as a handful of known comorbidities. Certain vitamins and minerals have been shown to be protective for the disease, while some toxins and chemicals have been associated with an increased prevalence. The role of toxins, chemicals, vitamins, and minerals in relation to disease is believed to be complex and potentially modified by known risk factors. A …


Automating Large-Scale Simulation Calibration To Real-World Sensor Data, Richard Everett Edwards May 2013

Automating Large-Scale Simulation Calibration To Real-World Sensor Data, Richard Everett Edwards

Doctoral Dissertations

Many key decisions and design policies are made using sophisticated computer simulations. However, these sophisticated computer simulations have several major problems. The two main issues are 1) gaps between the simulation model and the actual structure, and 2) limitations of the modeling engine's capabilities. This dissertation's goal is to address these simulation deficiencies by presenting a general automated process for tuning simulation inputs such that simulation output matches real world measured data. The automated process involves the following key components -- 1) Identify a model that accurately estimates the real world simulation calibration target from measured sensor data; 2) Identify …