Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
- Keyword
- Data Mining (2)
- Statistics (2)
- ARIMA (1)
- Age dependency ratio; Demographic characteristics; Economic development; Female labor force participation; Lagged fertility; Newey west standard errors; Unemployment (1)
- Anomaly Detection (1)
- Applied cognitive psychology (1)
- Capture recapture (1)
- Chlorine (1)
- Classification (1)
- Clothing Retail Sales (1)
- Collinearity (1)
- Complex span tasks (1)
- Computational (1)
- Control chart (1)
- Coordinate descent (1)
- Ecology (1)
- Economy (1)
- Elastic net (1)
- Feature Reduction (1)
- Feature Selection (1)
- Forecasting (1)
- GDP (1)
- Healthcare (1)
- High dimensional data (1)
- Instruction Sequences (1)
- Intrusion Detection Systems (1)
- K chart (1)
- Lasso (1)
- Life testing (1)
- Machine Learning (1)
Articles 1 - 18 of 18
Full-Text Articles in Physical Sciences and Mathematics
Automated Machine Learning: Intelligent Binning Data Preparation And Regularized Regression Classifier, Jianbin Zhu
Electronic Theses and Dissertations, 2020-
Automated machine learning (AutoML) has become a new trend: it is the process of automating the complete pipeline from the raw dataset to the development of a machine learning model. It not only relieves data scientists' workload but also allows non-experts to finish the job without solid knowledge and understanding of statistical inference and machine learning. One limitation of the AutoML framework is that data quality differs significantly from batch to batch. Consequently, fitted model quality for some batches of data can be very poor due to distribution shift in some numerical predictors. In this dissertation, we develop an intelligent binning to …
Graph Neural Networks For Improved Interpretability And Efficiency, Patrick Pho
Electronic Theses and Dissertations, 2020-
An attributed graph is a powerful tool for modeling real-life systems that exist in many domains, such as social science, biology, and e-commerce. The behaviors of those systems are mostly defined by or dependent on their corresponding network structures. Graph analysis has become an important line of research due to the rapid integration of such systems into every aspect of human life and the profound impact they have on human behaviors. Graph-structured data contains rich information from the network connectivity and the supplementary input features of nodes. Machine learning algorithms or traditional network science tools have limitation …
Change Point Detection For Streaming Data Using Support Vector Methods, Charles Harrison
Electronic Theses and Dissertations, 2020-
Sequential multiple change point detection concerns the identification of multiple points in time where the systematic behavior of a statistical process changes. A special case of this problem, called online anomaly detection, occurs when the goal is to detect the first change and then signal an alert to an analyst for further investigation. This dissertation concerns the use of methods based on kernel functions and support vectors to detect changes. A variety of support vector-based methods are considered, but the primary focus concerns Least Squares Support Vector Data Description (LS-SVDD). LS-SVDD constructs a hypersphere in a kernel space to bound …
An Evaluation Of The Performance Of PROC ARIMA's IDENTIFY Statement: A Data-Driven Approach Using COVID-19 Cases And Deaths In Florida, Fahmida Akter Shahela
Electronic Theses and Dissertations, 2020-
Understanding data on the novel coronavirus (COVID-19) pandemic and modeling such data over time are crucial for decision making in managing, fighting, and controlling the spread of this emerging disease. This thesis looks at some aspects of exploratory analysis and modeling of COVID-19 data obtained from the Florida Department of Health (FDOH). In particular, the present work is devoted to data collection, preparation, description, and modeling of COVID-19 cases and deaths reported by FDOH between March 12, 2020, and April 30, 2021. For modeling data on both cases and deaths, this thesis utilized an autoregressive integrated moving average (ARIMA) times …
Time Series Forecasting And Analysis: A Study Of American Clothing Retail Sales Data, Weijun Huang
Honors Undergraduate Theses
This paper addresses the effect of time on clothing retail sales from 2010 to May 2019. The data were retrieved from the US Census; N=113 observations were used and plotted to observe their trends. Once outliers were handled and transformations were performed, the best model was fit and a diagnostic review was carried out. Inspections for seasonality and forecasting were also conducted. The final model was an ARIMA(2,0,1). Slight seasonality was present, but not enough to drastically influence the trends. Our results serve to highlight the economic growth of clothing retail sales for the past …
Systematic Review And Meta-Analysis: Tuberculosis, TNFα Inhibitors, And Crohn's Disease, Brent L. Cao
Honors Undergraduate Theses
Inflammation is often a protective reaction against harmful foreign agents. However, in many disease conditions, the mechanisms behind the inflammatory response are poorly understood. Often times, the inflammation causes adverse effects, such as joint pain, abdominal pain, fever, fatigue, and loss of appetite. Thus, many treatments aim to inhibit the inflammatory response in order to control adverse symptoms. Such treatments include TNFα inhibitors. However, a major risk associated with drugs inhibiting tumor necrosis factor alpha (TNFα) is serious infection, including tuberculosis (TB).
Anti-TNFα therapy is used to treat patients with Crohn’s disease, for which the risk of tuberculosis may be …
Psychometric Properties Of A Working Memory Span Task, Juan M. Alzate Vanegas
Honors Undergraduate Theses
The intent of this thesis is to examine the psychometric properties of a complex span task (CST) developed to measure working memory capacity (WMC) using measurements obtained from a sample of 68 undergraduate students at the University of Central Florida. The Grocery List Task (GLT) promises several design improvements over traditional CSTs in a prior study about individual differences in WMC and distraction effects on driving performance, and it offers potential benefits for studying WMC as well as the serial-position effect. Currently, the working memory system is composed of domain-general memorial storage processes and information-processing, which involves the use of …
To Hydrate Or Chlorinate: A Regression Analysis Of The Levels Of Chlorine In The Public Water Supply, Drew A. Doyle
HIM 1990-2015
Public water supplies contain disease-causing microorganisms in the water or distribution ducts. In order to kill off these pathogens, a disinfectant, such as chlorine, is added to the water. Chlorine is the most widely used disinfectant in all U.S. water treatment facilities. Chlorine is known to be one of the most powerful disinfectants to restrict harmful pathogens from reaching the consumer. In the interest of obtaining a better understanding of what variables affect the levels of chlorine in the water, this thesis will analyze a particular set of water samples randomly collected from locations in Orange County, Florida. Thirty water …
A Simulation-Based Task Analysis Using Agent-Based, Discrete Event And System Dynamics Simulation, Anastasia Angelopoulou
Electronic Theses and Dissertations
Recent advances in technology have increased the need for using simulation models to analyze tasks and obtain human performance data. A variety of task analysis approaches and tools have been proposed and developed over the years. Over 100 task analysis methods have been reported in the literature. However, most of the developed methods and tools allow for representation of the static aspects of the tasks performed by expert system-driven human operators, neglecting aspects of the work environment, i.e. physical layout, and dynamic aspects of the task. The use of simulation can help face the new challenges in the field of …
Mahalanobis Kernel-Based Support Vector Data Description For Detection Of Large Shifts In Mean Vector, Vu Nguyen
Electronic Theses and Dissertations
Statistical process control (SPC) applies the science of statistics to process control in order to provide higher-quality products and better services. The K chart is one of the many important tools that SPC offers. Creation of the K chart is based on Support Vector Data Description (SVDD), a popular data classification method inspired by the Support Vector Machine (SVM). Like any method associated with SVM, SVDD benefits from a wide variety of kernel choices, which determine the effectiveness of the whole model. Among the most popular choices is the Euclidean distance-based Gaussian kernel, which enables SVDD to obtain a …
Statistical Analysis Of Depression And Social Support Change In Arab Immigrant Women In USA, Hazhar Blbas
Electronic Theses and Dissertations
Arab Muslim immigrant women encounter many stressors and are at risk for depression. Social supports from husbands, family and friends are generally considered mitigating resources for depression. However, changes in social support over time and the effects of such supports on depression at a future time period have not been fully addressed in the literature. This thesis investigated the relationship between demographic characteristics, changes in social support, and depression in Arab Muslim immigrant women to the USA. A sample of 454 married Arab Muslim immigrant women provided demographic data, scores on social support variables and depression at three time periods …
How Many Are Out There? A Novel Approach For Open And Closed Systems, Zia Rehman
Electronic Theses and Dissertations
We propose a ratio estimator to determine population estimates using capture-recapture sampling. It differs from traditional approaches in the following ways: (1) Ordering of recaptures: Currently data sets do not take into account the "ordering" of the recaptures, although this crucial information is available to them at no cost. (2) Dependence of trials and cluster sampling: Our model explicitly considers trials to be dependent and improves on existing literature, which assumes independence. (3) Rate of convergence: The percentage sampled has an inverse relationship with population size, for a chosen degree of accuracy. (4) Asymptotic Attainment of Minimum Variance (Open Systems: (=population …
Sparse Ridge Fusion For Linear Regression, Nozad Mahmood
Electronic Theses and Dissertations
For linear regression, the traditional technique deals with the case where the number of observations n is greater than the number of predictor variables p (n > p). In the case n < p, the classical method fails to estimate the coefficients. This thesis provides a solution to this problem for the case of correlated predictors. A new regularization and variable selection method is proposed under the name Sparse Ridge Fusion (SRF). In the case of highly correlated predictors, the simulated examples and a real data set show that the SRF always outperforms the lasso, the elastic net, and the S-Lasso. The results also show that the SRF can select more predictor variables than the sample size n, whereas the lasso selects at most n variables.
An Analysis Of The Relationship Between Economic Development And Demographic Characteristics In The United States, Chad M. Heyne
HIM 1990-2015
Over the past several decades there has been extensive research done in an attempt to determine what demographic characteristics affect economic growth, measured in GDP per capita. Understanding what influences the growth of a country will vastly help policy makers enact policies to lead the country in a positive direction. This research focuses on isolating a new variable, women in the work force. As well as isolating a new variable, this research will modify a preexisting variable that was shown to be significant in order to make the variable more robust and sensitive to recessions. The intent of this thesis …
Data Mining Methods For Malware Detection, Muazzam Siddiqui
Electronic Theses and Dissertations
This research investigates the use of data mining methods for malware (malicious program) detection and proposes a framework as an alternative to the traditional signature detection methods. The traditional approaches, which use signatures to detect malicious programs, fail for new and unknown malware, where signatures are not available. We present a data mining framework to detect malicious programs. We collected, analyzed and processed several thousand malicious and clean programs to find the best features and build models that can classify a given program into a malware or a clean class. Our research is closely related to information retrieval …
Modeling And Characterizations Of New Notions In Life Testing With Statistical Applications, Mohammad Sepehrifar
Electronic Theses and Dissertations
Knowing the class to which a life distribution belongs gives us an idea about the aging of the device or system the life distribution represents, and enables us to compare the aging properties of different systems. This research intends to establish several new nonparametric classes of life distributions defined by the concept of inactivity time of a unit with a guaranteed minimum life length. These classes play an important role in the study of reliability theory, survival analysis, maintenance policies, economics, actuarial sciences and many other applied areas.
Session-Based Intrusion Detection System To Map Anomalous Network Traffic, Bruce Caulkins
Electronic Theses and Dissertations
Computer crime is a large problem (CSI, 2004; Kabay, 2001a; Kabay, 2001b). Security managers have a variety of tools at their disposal to combat computer crime: firewalls, Intrusion Detection Systems (IDSs), encryption, authentication, and other hardware and software solutions. Many IDS variants exist which allow security managers and engineers to identify attack network packets primarily through the use of signature detection; i.e., the IDS recognizes attack packets due to their well-known "fingerprints" or signatures as those packets cross the network's gateway threshold. On the other hand, anomaly-based ID systems determine what is normal traffic within a network and report …
A Subset Selection Rule For Three Normal Populations, Bert Culpepper
Retrospective Theses and Dissertations
No abstract provided.