Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 17 of 17

Full-Text Articles in Physical Sciences and Mathematics

Application Of Big Data Technology, Text Classification, And Azure Machine Learning For Financial Risk Management Using Data Science Methodology, Oluwaseyi A. Ijogun Jan 2023

Electronic Theses and Dissertations

Data science plays a crucial role in enabling organizations to optimize data-driven opportunities within financial risk management. It involves identifying, assessing, and mitigating risks, ultimately safeguarding investments, reducing uncertainty, ensuring regulatory compliance, enhancing decision-making, and fostering long-term sustainability. This thesis explores three facets of Data Science projects: enhancing customer understanding, fraud prevention, and predictive analysis, with the goal of improving existing tools and enabling more informed decision-making. The first project leveraged big data technologies, such as Hadoop and Spark, to enhance financial risk management by accurately predicting loan defaulters and their repayment likelihood. In the second project, we investigated …


Intraday Algorithmic Trading Using Momentum And Long Short-Term Memory Network Strategies, Andrew R. Whitinger II May 2022

Undergraduate Honors Theses

Intraday stock trading is an infamously difficult and risky strategy. Momentum and reversal strategies and long short-term memory (LSTM) neural networks have been shown to be effective for selecting stocks to buy and sell over time periods of multiple days. To explore whether these strategies can be effective for intraday trading, their implementations were simulated using intraday price data for stocks in the S&P 500 index, collected at 1-second intervals between February 11, 2021 and March 9, 2021 inclusive. The study tested 160 variations of momentum and reversal strategies for profitability in long, short, and market-neutral portfolios, totaling 480 portfolios. …


Exploring Cyberterrorism, Topic Models And Social Networks Of Jihadists Dark Web Forums: A Computational Social Science Approach, Vivian Fiona Guetler Jan 2022

Graduate Theses, Dissertations, and Problem Reports

This three-article dissertation focuses on cyber-related topics on terrorist groups, specifically Jihadists’ use of technology, the application of natural language processing, and social networks in analyzing text data derived from terrorists' Dark Web forums. The first article explores cybercrime and cyberterrorism. As technology progresses, it facilitates new forms of behavior, including tech-related crimes known as cybercrime and cyberterrorism. In this article, I provide an analysis of the problems of cybercrime and cyberterrorism within the field of criminology by reviewing existing literature focusing on (a) the issues in defining terrorism, cybercrime, and cyberterrorism, (b) ways that cybercriminals commit a crime in …


Finding The Best Predictors For Foot Traffic In US Seafood Restaurants, Isabel Paige Beaulieu Jan 2022

Honors Theses and Capstones

COVID-19 caused state and nationwide lockdowns, which altered human foot traffic, especially in restaurants. The seafood sector in particular suffered greatly: illegal fishing increased, its goods are perishable, it is seasonal in some places, and imports and exports slowed. Foot traffic data helps business owners decide how much to order, how many employees to schedule, etc. One issue is that the data is very expensive, hard to get, and not available until months after it is recorded. Our goal is to not only find covariates that …


Essays On Fake Review Detection, Managerial Response, And Consumer Perceptions, Long Chen Aug 2021

Theses and Dissertations

This dissertation investigates how online reviews and managerial responses jointly affect consumer perceptions. I first examine and compare the outcomes of multiple fake review classifiers using various algorithms, including traditional machine learning methods and recently developed deep learning methods (essay I). Then, based on the findings of the first essay, I examine the interrelationship between fake review detection, managerial response, and hotel ratings and ratings’ growths (essay II). The first essay is a comparative study on the methodology of identifying fake reviews. Although online reviews have attracted much attention from academia and industry for over fifteen years, how to identify fake …


Designing Targeted Mobile Advertising Campaigns, Kimia Keshanian Jun 2021

USF Tampa Graduate Theses and Dissertations

With the proliferation of smart, handheld devices, there has been a multifold increase in the ability of firms to target and engage with customers through mobile advertising. Therefore, not surprisingly, mobile advertising campaigns have become an integral aspect of firms’ brand building activities, such as improving the awareness and overall visibility of firms' brands. In addition, retailers are increasingly using mobile advertising for targeted promotional activities that increase in-store visits and eventual sales conversions. However, in recent years, mobile or in general online advertising campaigns have been facing one major challenge and one major threat that can negatively impact the …


Changing The Focus: Worker-Centric Optimization In Human-In-The-Loop Computations, Mohammadreza Esfandiari Aug 2020

Dissertations

A myriad of emerging applications, from simple to complex ones, involve human cognizance in the computation loop. Using the wisdom of human workers, researchers have solved a variety of problems, termed “micro-tasks,” such as captcha recognition, sentiment analysis, image categorization, and query processing, as well as “complex tasks” that are often collaborative, such as classifying craters on planetary surfaces, discovering new galaxies (Galaxyzoo), and performing text translation. The current view of “humans-in-the-loop” tends to see humans as machines, robots, or low-level agents used or exploited in the service of broader computation goals. This dissertation is developed to shift the focus back …


Mind Maps And Machine Learning: An Automation Framework For Qualitative Research In Entrepreneurship Education, Yasser Farha Aug 2020

Dissertations

Entrepreneurship Education researchers often measure the entrepreneurial motivation of college students. It is important for stakeholders, such as policymakers and educators, to ascertain whether entrepreneurship education can encourage students to become entrepreneurs, as well as to understand factors that influence entrepreneurial motivation. For that purpose, researchers have used different methods and instruments to measure students' entrepreneurial motivation. Most of these methods are quantitative, e.g., closed-ended surveys, whereas qualitative methods, e.g., open-ended surveys, are rarely used.

Mind maps are an attractive qualitative survey tool because they capture the individual's reflections, thoughts, and experiences. For Entrepreneurship Education, mind maps can be utilized to …


Data-Driven Investment Decisions In P2P Lending: Strategies Of Integrating Credit Scoring And Profit Scoring, Yan Wang Apr 2020

Doctor of Data Science and Analytics Dissertations

In this dissertation, we develop and discuss several loan evaluation methods to guide the investment decisions for peer-to-peer (P2P) lending. In evaluating loans, credit scoring and profit scoring are the two widely utilized approaches. Credit scoring aims at minimizing the risk while profit scoring aims at maximizing the profit. This dissertation addresses the strengths and weaknesses of each scoring method by integrating them in various ways in order to provide the optimal investment suggestions for different investors. Before developing the methods for loan evaluation at the individual level, we applied the state-of-the-art method called the Long Short Term Memory (LSTM) …


Early Detection Of Fake News On Social Media, Yang Liu Dec 2019

Dissertations

The ever-increasing popularity and convenience of social media enable the rapid spread of fake news, which can cause a series of negative impacts on both individuals and society. Early detection of fake news is essential to minimize its social harm. Existing machine learning approaches are incapable of detecting a fake news story soon after it starts to spread, because they require a certain amount of data to reach decent effectiveness, which takes time to accumulate. To solve this problem, this research first analyzes and finds that, on social media, the user characteristics of fake news spreaders distribute significantly differently from those …


Detecting Digitally Forged Faces In Online Videos, Neilesh Sambhu Oct 2019

USF Tampa Graduate Theses and Dissertations

We use Rossler’s FaceForensics dataset of 1004 online videos and their corresponding forged counterparts [1] to investigate the ability to distinguish digitally forged facial images from original images automatically with deep learning. The proposed convolutional neural network is much smaller than the current state-of-the-art solutions. Nevertheless, the network maintains a high level of accuracy (99.6%), all while using the entire FaceForensics dataset and not including any temporal information. We implement majority voting and show the impact on accuracy (99.67%), where only 1 video of 300 is misclassified. We examine why the model misclassified this one video. In terms of tuning …
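The majority-voting step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the vote is taken over per-frame predictions within one video, which the truncated abstract does not state explicitly; the function name `video_label` and the label strings are hypothetical and not taken from the thesis.

```python
from collections import Counter


def video_label(frame_predictions):
    """Return the most common per-frame label for one video.

    Assumption: each frame was classified independently (no temporal
    information), and the video takes whichever label wins the vote.
    Ties go to the label that appears first in the input.
    """
    if not frame_predictions:
        raise ValueError("need at least one frame prediction")
    counts = Counter(frame_predictions)
    return counts.most_common(1)[0][0]


# Two of three frames say "forged", so the video is labeled "forged".
label = video_label(["forged", "original", "forged"])
```

A vote of this kind can correct isolated frame-level mistakes, which is one plausible reason the reported video-level accuracy is slightly higher than the frame-level figure.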


Regression Tree Construction For Reinforcement Learning Problems With A General Action Space, Anthony S. Bush Jr Jan 2019

Electronic Theses and Dissertations

Part of the implementation of Reinforcement Learning is constructing a regression of values against states and actions and using that regression model to optimize over actions for a given state. One such common regression technique is that of a decision tree; or in the case of continuous input, a regression tree. In such a case, we fix the states and optimize over actions; however, standard regression trees do not easily optimize over a subset of the input variables\cite{Card1993}. The technique we propose in this thesis is a hybrid of regression trees and kernel regression. First, a regression tree splits over …


Opportunity Identification For New Product Planning: Ontological Semantic Patent Classification, Farshad Madani Feb 2018

Dissertations and Theses

Intelligence tools have been developed and applied widely in many different areas in engineering, business and management. Many commercialized tools for business intelligence are available in the market. However, no practically useful tools for technology intelligence are available at this time, and very little academic research in technology intelligence methods has been conducted to date.

Patent databases are the most important data source for technology intelligence tools, but patents inherently contain unstructured data. Consequently, extracting text data from patent databases, converting that data to meaningful information and generating useful knowledge from this information become complex tasks. These tasks are currently …


Estimating The Optimal Cutoff Point For Logistic Regression, Zheng Zhang Jan 2018

Open Access Theses & Dissertations

Binary classification is one of the main themes of supervised learning. This research is concerned with determining the optimal cutoff point for the continuous-scaled outcomes (e.g., predicted probabilities) resulting from a classifier such as logistic regression. We note that the cutoff point obtained from various methods is a statistic, which can be unstable with substantial variation. Nevertheless, due partly to the complexity involved in estimating the cutpoint, there has been no formal study on the variance or standard error of the estimated cutoff point.

In this thesis, a bootstrap aggregation method is put forward to estimate the …
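To make the idea of a cutoff point concrete, here is a minimal sketch that picks the threshold maximizing Youden's J (sensitivity + specificity - 1). The abstract does not say which criterion the thesis uses, so Youden's J is an assumption chosen for illustration, and the function name `youden_cutoff` and the sample data are hypothetical. The bootstrap aggregation step is not reproduced because the abstract is truncated before describing it.

```python
def youden_cutoff(probs, labels):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    probs  : predicted probabilities from any binary classifier
    labels : true classes, 1 for positive and 0 for negative
    A case is predicted positive when its probability is >= cutoff.
    Assumption: Youden's J is used only as an example criterion.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")

    best_cut, best_j = None, -1.0
    for cut in sorted(set(probs)):  # every observed score is a candidate
        tp = sum(1 for p, y in zip(probs, labels) if p >= cut and y == 1)
        tn = sum(1 for p, y in zip(probs, labels) if p < cut and y == 0)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # strict comparison keeps the lowest tied cutoff
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical scores and labels, for illustration only.
cutoff, j_stat = youden_cutoff([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

Because the chosen cutoff depends on the sample, rerunning this on resampled data would generally give different values, which is the instability the abstract refers to.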


Viewability Prediction For Display Advertising, Chong Wang Apr 2017

Dissertations

As a massive industry, display advertising delivers advertisers’ marketing messages to attract customers through graphic banners on webpages. Display advertising is also the most essential revenue source of online publishers. Currently, advertisers are charged by user response or ad serving. However, recent studies show that users barely click or convert display ads. Moreover, about half of the ads are actually never seen by users. In this case, advertisers cannot enhance their brand awareness and increase return on investment. Publishers also lose much revenue. Therefore, the ad pricing standards are shifting to a new model: ad impressions are paid if they …


Information Filtering By Multiple Examples, Mingzhu Zhu May 2015

Dissertations

A key to successfully satisfying an information need lies in how users express it using keywords as queries. However, for many users, expressing their information needs using keywords is difficult, especially when the information need is complex. Search By Multiple Examples (SBME), a promising method for overcoming this problem, allows users to specify their information needs as a set of relevant documents rather than as a set of keywords.

Most of the studies on SBME adopt Positive Unlabeled learning (PU learning) techniques by treating the user-provided examples (denoted as query examples) as the positive set and the entire data …


SVMAUD: Using Textual Information To Predict The Audience Level Of Written Works Using Support Vector Machines, Todd Will Jan 2014

Dissertations

Information retrieval systems should seek to match resources with the reading ability of the individual user; similarly, an author must choose vocabulary and sentence structures appropriate for his or her audience. Traditional readability formulas, including the popular Flesch-Kincaid Reading Age and the Dale-Chall Reading Ease Score, rely on numerical representations of text characteristics, including syllable counts and sentence lengths, to suggest audience level of resources. However, the author’s chosen vocabulary, sentence structure, and even the page formatting can alter the predicted audience level by several levels, especially in the case of digital library resources. For these reasons, the performance of …