Open Access. Powered by Scholars. Published by Universities.®

Data Science Commons

East Tennessee State University

Articles 1 - 12 of 12

Full-Text Articles in Data Science

Interpreting Shift Encoders As State Space Models For Stationary Time Series, Patrick Donkoh May 2024

Electronic Theses and Dissertations

Time series analysis is a statistical technique used to analyze sequential data points collected or recorded over time. While traditional models such as autoregressive and moving average models have performed adequately for time series analysis, the advent of artificial neural networks has provided models that suggest improved performance. In this research, we present a custom neural network, a shift encoder, that can capture the intricate temporal patterns of time series data. We then compare the sparse matrix of the shift encoder to the parameters of the autoregressive model and observe the similarities. We further explore how we can …


Convolution And Autoencoders Applied To Nonlinear Differential Equations, Noah Borquaye Dec 2023

Electronic Theses and Dissertations

Autoencoders, a type of artificial neural network, have gained recognition among researchers in various fields, especially machine learning, for their many applications in learning data representations from inputs. Recently, researchers have explored extending the application of autoencoders to solving nonlinear differential equations. Algorithms and methods employed in an autoencoder framework include sparse identification of nonlinear dynamics (SINDy), dynamic mode decomposition (DMD), Koopman operator theory and singular value decomposition (SVD). These approaches use matrix multiplication to represent linear transformations. However, machine learning algorithms often use convolution to represent linear transformations. In our work, we modify these approaches to …


Exploration And Statistical Modeling Of Profit, Caleb Gibson Dec 2023

Undergraduate Honors Theses

For any company involved in sales, maximization of profit is the driving force that guides all decision-making. Many factors can influence how profitable a company can be, including external factors like changes in inflation or consumer demand and internal factors like pricing and product cost. By understanding specific trends in its own internal data, a company can readily identify problem areas or potential growth opportunities to help increase profitability.

In this discussion, we use an extensive data set to examine how a company might analyze its own data to identify potential changes it might investigate to drive better performance. Based …


A Bridge Between Graph Neural Networks And Transformers: Positional Encodings As Node Embeddings, Bright Kwaku Manu Dec 2023

Electronic Theses and Dissertations

Graph Neural Networks (GNNs) and Transformers are powerful frameworks for machine learning tasks. While they evolved separately in different fields, current research has revealed some similarities and links between them. This work focuses on bridging the gap between GNNs and Transformers by offering a uniform framework that highlights their similarities and distinctions. We perform positional encodings and identify key properties that allow the positional encodings to serve as node embeddings. We found that the properties of expressiveness, efficiency and interpretability were achieved in the process. We saw that it is possible to use positional encodings as node embeddings, which can be …


A Programmatic Geographic Information Systems Analysis Of Plant Hardiness Zones, Andrew Bowen May 2023

Electronic Theses and Dissertations

The Plant Hardiness Zone Map consists of thirteen geographical zones that describe whether a plant can survive based on average annual minimum temperatures. As climate change progresses, minimum temperatures in all regions are expected to change. This work programmatically evaluates predicted future climate projection data and converts it to United States Department of Agriculture-defined hardiness zones. Over the next 80 years, hardiness zones are projected to move poleward; in effect, colder zones will lose area and warmer zones will gain area globally. Some implications include changes in crop growing degree days, which could alter crop productivity, migration and settlement of …
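
As a rough illustration of the conversion step this abstract describes, the sketch below maps an average annual minimum temperature to a zone number. It assumes the standard USDA banding of 10 °F per zone, with zone 1 starting at -60 °F and zone 13 ending at 70 °F. The function name and the clamping of out-of-range temperatures are illustrative choices, not details taken from the thesis.

```python
import math


def hardiness_zone(avg_annual_min_temp_f: float) -> int:
    """Return a hardiness zone number (1-13) for a temperature in degrees F.

    Assumes 10-degree bands: zone 1 covers [-60, -50), zone 2 covers
    [-50, -40), and so on up to zone 13, which covers [60, 70).
    Temperatures outside that range are clamped to the nearest zone
    (an illustrative choice).
    """
    zone = math.floor((avg_annual_min_temp_f + 60.0) / 10.0) + 1
    return max(1, min(13, zone))
```

Applied cell by cell to a grid of projected minimum temperatures, a function like this would yield a zone map that can be compared across time periods.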


Predicting High-Cap Tech Stock Polarity: A Combined Approach Using Support Vector Machines And Bidirectional Encoders From Transformers, Ian L. Grisham May 2023

Electronic Theses and Dissertations

The abundance, accessibility, and scale of data have engendered an era where machine learning can quickly and accurately solve complex problems, identify complicated patterns, and uncover intricate trends. One research area where many have applied these techniques is the stock market. Yet, financial domains are influenced by many factors and are notoriously difficult to predict due to their volatile and multivariate behavior. However, the literature indicates that public sentiment data may exhibit significant predictive qualities and improve a model’s ability to predict intricate trends. In this study, momentum SVM classification accuracy was compared between datasets that did and did not …


Finding A Representative Distribution For The Tail Index Alpha, α, For Stock Return Data From The New York Stock Exchange, Jett Burns May 2022

Electronic Theses and Dissertations

Statistical inference is a tool for creating models that can accurately represent real-world events. Special importance is given to the financial methods that model risk and large price movements. A parameter that describes tail heaviness, and risk overall, is α. This research finds a representative distribution that models α. The absolute values of standardized stock returns from the Center for Research in Security Prices (CRSP) are used in this research. The inference is performed using R. Approximations for α are found using the ptsuite package. The GAMLSS package employs maximum likelihood estimation to estimate distribution parameters using the CRSP data. The …
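
To make the idea of a tail index concrete, here is a minimal sketch of one common estimator of α, the Hill estimator, written in Python rather than the R used in the thesis. It is an assumption that this estimator is relevant here: the abstract does not say which estimation methods were used. The function name and the choice of `k` (the number of largest observations treated as the tail) are hypothetical.

```python
import math


def hill_tail_index(values, k):
    """Estimate the tail index alpha with the Hill estimator.

    Uses the k largest absolute values, with the (k+1)-th largest as the
    tail threshold. Smaller alpha means a heavier tail.
    """
    xs = sorted((abs(v) for v in values), reverse=True)
    if not 0 < k < len(xs):
        raise ValueError("k must satisfy 0 < k < number of observations")
    threshold = xs[k]
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    # Mean log-excess of the top k observations over the threshold.
    mean_log_excess = sum(math.log(x / threshold) for x in xs[:k]) / k
    if mean_log_excess == 0:
        raise ValueError("top k observations are all equal to the threshold")
    return 1.0 / mean_log_excess
```

The estimate is sensitive to `k`, so in practice it is usually examined over a range of values rather than at a single cutoff.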


Intraday Algorithmic Trading Using Momentum And Long Short-Term Memory Network Strategies, Andrew R. Whitinger II May 2022

Undergraduate Honors Theses

Intraday stock trading is an infamously difficult and risky strategy. Momentum and reversal strategies and long short-term memory (LSTM) neural networks have been shown to be effective for selecting stocks to buy and sell over time periods of multiple days. To explore whether these strategies can be effective for intraday trading, their implementations were simulated using intraday price data for stocks in the S&P 500 index, collected at 1-second intervals between February 11, 2021 and March 9, 2021 inclusive. The study tested 160 variations of momentum and reversal strategies for profitability in long, short, and market-neutral portfolios, totaling 480 portfolios. …


Applying Deep Learning To The Ice Cream Vendor Problem: An Extension Of The Newsvendor Problem, Gaffar Solihu Aug 2021

Electronic Theses and Dissertations

The Newsvendor problem is a classical supply chain problem used to develop strategies for inventory optimization. The goal of the newsvendor problem is to predict the optimal order quantity of a product to meet an uncertain demand in the future, given that the demand distribution itself is known. The Ice Cream Vendor Problem extends the classical newsvendor problem to an uncertain demand with unknown distribution, albeit a distribution that is known to depend on exogenous features. The goal is thus to estimate the order quantity that minimizes the total cost when demand does not follow any known statistical distribution. The …
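
For the classical case with a known demand distribution, the optimal order quantity has a well-known closed form: order up to the demand quantile given by the critical ratio, underage cost divided by the sum of underage and overage costs. The sketch below illustrates this under the assumption of normally distributed demand. The function name, parameter names, and the normal-demand assumption are illustrative and are not taken from the thesis, which addresses the harder case where the distribution is unknown.

```python
from statistics import NormalDist


def newsvendor_order_quantity(mean_demand, sd_demand, unit_price,
                              unit_cost, salvage_value=0.0):
    """Return the cost-minimizing order quantity for normal demand.

    underage cost = profit lost per unit of unmet demand
    overage cost  = loss per unsold unit
    """
    underage = unit_price - unit_cost
    overage = unit_cost - salvage_value
    if underage <= 0 or overage <= 0:
        raise ValueError("need salvage_value < unit_cost < unit_price")
    critical_ratio = underage / (underage + overage)
    # Order up to the demand quantile at the critical ratio.
    return NormalDist(mean_demand, sd_demand).inv_cdf(critical_ratio)
```

When underage and overage costs are equal, the critical ratio is 0.5 and the order quantity equals mean demand; a higher margin pushes the order above the mean.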


Data Science And The Ice-Cream Vendor Problem, Makafui Azasoo Aug 2021

Electronic Theses and Dissertations

Newsvendor problems in Operations Research predict the optimal inventory levels necessary to meet uncertain demands. This thesis examines an extended version of a single-period, multi-product newsvendor problem known as the ice cream vendor problem. In the ice cream vendor problem, there are two products – ice cream and hot chocolate – which may be substituted for one another if the outside temperature is not too hot or not too cold. In particular, the ice cream vendor problem is a data-driven extension of the conventional newsvendor problem which does not require the assumption of a specific demand distribution, thus allowing …


Manifold Learning With Tensorial Network Laplacians, Scott Sanders Aug 2021

Electronic Theses and Dissertations

The interdisciplinary field of machine learning studies algorithms whose functionality depends on data sets. This data is often treated as a matrix, and a variety of mathematical methods, such as matrix decomposition, have been developed to glean information from this data structure. The Laplacian matrix, for example, is commonly used to reconstruct networks, and the eigenpairs of this matrix are used in matrix decomposition. Moreover, concepts such as SVD matrix factorization are closely connected to manifold learning, a subfield of machine learning that assumes the observed data lie on a low-dimensional manifold embedded in a higher-dimensional space. Since …


Machine Learning Approaches To Dribble Hand-Off Action Classification With SportVU NBA Player Coordinate Data, Dembe Stephanos May 2021

Electronic Theses and Dissertations

Recently, strategies of National Basketball Association teams have evolved with the skillsets of players and the emergence of advanced analytics. One of the most effective actions in dynamic offensive strategies in basketball is the dribble hand-off (DHO). This thesis proposes an architecture for a classification pipeline for detecting DHOs in an accurate and automated manner. This pipeline consists of a combination of player tracking data and event labels, a rule set to identify candidate actions, a manual review of game recordings to label the candidates, and an embedding of player trajectories into hexbin cell paths before passing the completed training set to the classification …