Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 20 of 20

Full-Text Articles in Physical Sciences and Mathematics

Campus Safety Data Gathering, Classification, And Ranking Based On Clery-Act Reports, Walaa F. Abo Elenin Jan 2023

Electronic Theses and Dissertations

Most existing campus safety rankings are based on criminal incident history, with minimal or no consideration of campus security conditions and standard safety measures. Campus safety information published by universities and colleges is usually conceptual or qualitative rather than quantitative, and is based on the criminal records of these campuses. Thus, no explicit and trusted ranking method for these campuses considers the level of compliance with standard safety measures. A quantitative safety measure is important for comparing different campuses easily and for learning about specific campus safety conditions.

In this thesis, we utilize Clery-Act reports of campuses to automatically analyze their safety conditions and generate …


Applied Deep Learning In Intelligent Transportation Systems And Embedding Exploration, Xiaoyuan Liang Aug 2019

Dissertations

Deep learning techniques have achieved tremendous success in many real applications in recent years and show great potential in many areas, including transportation. Even though transportation has become increasingly indispensable in people’s daily lives, its related problems, such as traffic congestion and energy waste, have not been completely solved, and some have become even more critical. This dissertation focuses on solving the following fundamental problems in the transportation field using deep learning: (1) passenger demand prediction, (2) transportation mode detection, and (3) traffic light control. The dissertation also extends the application of deep learning to an embedding system for visualization …


Efficient Reduced Bias Genetic Algorithm For Generic Community Detection Objectives, Aditya Karnam Gururaj Rao Apr 2018

Theses

The problem of community structure identification has been an extensively investigated area for biology, physics, social sciences, and computer science in recent years for studying the properties of networks representing complex relationships. Most traditional methods, such as K-means and hierarchical clustering, are based on the assumption that communities have spherical configurations. Lately, Genetic Algorithms (GA) are being utilized for efficient community detection without imposing sphericity. GAs are machine learning methods which mimic natural selection and scale with the complexity of the network. However, traditional GA approaches employ a representation method that dramatically increases the solution space to be searched by …


Mining Diverse Consumer Preferences For Bundling And Recommendation, Ha Loc Do Jul 2017

Dissertations and Theses Collection

That consumers share similar tastes on some products does not guarantee their agreement on other products. Therefore, both similarity and difference should be taken into account for a more rounded view of consumer preferences. This manuscript focuses on mining this diversity of consumer preferences from two perspectives, namely 1) between consumers and 2) between products. Diversity of preferences between consumers is studied in the context of recommendation systems. In some preference models, measuring similarities in preferences between two consumers plays the key role. These approaches assume two consumers would share a certain degree of similarity on any product, ignoring the fact …


Mining Of Primary Healthcare Patient Data With Selective Multimorbid Diseases, Annette Megerdichian Azad May 2017

Electronic Thesis and Dissertation Repository

Despite a large volume of research on the prognosis, diagnosis and overall burden of multimorbidity, very little is known about socio-demographic characteristics of multimorbid patients. This thesis aims to analyze the socio-demographic characteristics of patients with multiple chronic conditions (multimorbidity), focusing on patient groups sharing the same combination of diseases. Several methods were explored to analyze the co-occurrence of multiple chronic diseases as well as the associations between socio-demographics and chronic conditions. These methods include disease pair distributions over gender, age groups and income level quintiles, Multimorbidity Coefficients for measuring the concurrence of disease pairs and triples, and k-modes clustering …


Aspect Discovery From Product Reviews, Ying Ding May 2017

Dissertations and Theses Collection

With the rapid development of online shopping sites and social media, product reviews are accumulating. These reviews contain information that is valuable to both businesses and customers. Businesses can easily obtain a large amount of feedback on their products, which is difficult to achieve through traditional customer surveys. Customers can better understand the products they are interested in by reading reviews, which may be difficult without online reviews. However, this accumulation has made it impossible to read all reviews. It is necessary to develop automated techniques to process them efficiently. One of the most …


Dpweka: Achieving Differential Privacy In Weka, Srinidhi Katla May 2017

Graduate Theses and Dissertations

Organizations belonging to the government, commercial, and non-profit industries collect and store large amounts of sensitive data, which include medical, financial, and personal information. They use data mining methods to formulate business strategies that yield high long-term and short-term financial benefits. While analyzing such data, the private information of the individuals present in the data must be protected for moral and legal reasons. Current practices such as redacting sensitive attributes, releasing only the aggregate values, and query auditing do not provide sufficient protection against an adversary armed with auxiliary information. In the presence of additional background information, the privacy protection …


Exploring Data Mining Techniques For Tree Species Classification Using Co-Registered Lidar And Hyperspectral Data, Julia K. Marrs May 2016

Theses and Dissertations

NASA Goddard’s LiDAR, Hyperspectral, and Thermal imager provides co-registered remote sensing data on experimental forests. Data mining methods were used to achieve a final tree species classification accuracy of 68% using a combined LiDAR and hyperspectral dataset, and show promise for addressing deforestation and carbon sequestration on a species-specific level.


Analyzing Proactive Fraud Detection Software Tools And The Push For Quicker Solutions, Kerri Aiken May 2016

Economic Crime Forensics Capstones

This paper focuses on proactive fraud detection software tools and how these tools can help detect and prevent possible fraudulent schemes. In addition to relying on routine audits, companies are designing proactive methods that involve the inclusion of software tools to detect and deter instances of fraud and abuse. This paper discusses examples of companies using ACL and SAS software programs and how the software tools have positively changed their auditing systems.

Novelis Inc., an aluminum and recycling company, implemented ACL into their internal audit software system. Competitive Health Analytics (Division of Humana) implemented SAS in order to improve their …


Unsupervised Learning Framework For Large-Scale Flight Data Analysis Of Cockpit Human Machine Interaction Issues, Abhishek B. Vaidya Apr 2016

Open Access Theses

As the level of automation within an aircraft increases, the interactions between the pilot and autopilot play a crucial role in its proper operation. Issues with human machine interactions (HMI) have been cited as one of the main causes behind many aviation accidents. Due to the complexity of such interactions, it is challenging to identify all possible situations and develop the necessary contingencies. In this thesis, we propose a data-driven analysis tool to identify potential HMI issues in large-scale Flight Operational Quality Assurance (FOQA) datasets. The proposed tool is developed using a multi-level clustering framework, where a set of basic …


Clustering-Based Personalization, Seyed Nima Mirbakhsh Sep 2015

Electronic Thesis and Dissertation Repository

Recommendation systems have emerged over the last decade as one of the key parts of the e-commerce ecosystem. Businesses offer a wide variety of items and content through different channels such as the Internet, smart TVs, and digital screens. For some businesses, the number of these items exceeds millions. Therefore, users can have trouble finding the products that they are looking for. Recommendation systems address this problem by providing powerful methods which enable users to filter through large information and product spaces based on their preferences. Moreover, users have different preferences. Thus, businesses can employ recommendation …


On Predicting User Affiliations Using Social Features In Online Social Networks, Minh Thap Nguyen Mar 2014

Dissertations and Theses Collection (Open Access)

User profiling, such as user affiliation prediction in online social networks, is a challenging task with many important applications in targeted marketing and personalized recommendation. The research task here is to predict user affiliation attributes that suggest user participation in different social groups.


Adaptive Grid Based Localized Learning For Multidimensional Data, Sheetal Saini Oct 2012

Doctoral Dissertations

Rapid advances in data-rich domains of science, technology, and business have amplified the computational challenges of "Big Data" synthesis necessary to slow the widening gap between the rate at which data is collected and the rate at which it is analyzed for knowledge. This has led to a renewed need for efficient and accurate algorithms, frameworks, and algorithmic mechanisms essential for knowledge discovery, especially in the domains of clustering, classification, dimensionality reduction, feature ranking, and feature selection. However, data mining algorithms are frequently challenged by the sparseness due to the high dimensionality of the datasets in such domains, which is particularly detrimental to the …


Combining Natural Language Processing And Statistical Text Mining: A Study Of Specialized Versus Common Languages, Jay Jarman Jan 2011

USF Tampa Graduate Theses and Dissertations

This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms, such as association rule mining and decision tree induction, are used to discover classification rules for specific targets. This multi-stage pipeline approach is contrasted with traditional statistical text mining (STM) methods based on term counts and term-by-document frequencies. The aim is to create effective text analytic processes by adapting and combining individual …


Enterprise Users And Web Search Behavior, April Ann Lewis May 2010

Masters Theses

This thesis describes an analysis of user web query behavior associated with Oak Ridge National Laboratory’s (ORNL) Enterprise Search System (hereafter, ORNL Intranet). The ORNL Intranet provides users a means to search all kinds of data stores for relevant business and research information using a single query. The Global Intranet Trends for 2010 Report suggests the biggest current obstacle for corporate intranets is “findability and Siloed content”. Intranets differ from internets in the way they create, control, and share content, which can often make it difficult, and sometimes impossible, for users to find information. Stenmark (2006) first noted studies of corporate …


Effects Of Similarity Metrics On Document Clustering, Rushikesh Veni Jan 2009

UNLV Theses, Dissertations, Professional Papers, and Capstones

Document clustering or unsupervised document classification is an automated process of grouping documents with similar content. A typical technique uses a similarity function to compare documents. In the literature, many similarity functions such as dot product or cosine measures are proposed for the comparison operator.
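
As a rough illustration of why the choice of similarity function matters, the sketch below compares the dot product and the cosine measure on toy term-count vectors. The vectors, the three-word vocabulary they imply, and the function names `dot` and `cosine` are assumptions made for this example and are not taken from the thesis.

```python
import math


def dot(u, v):
    """Dot product of two equal-length term-count vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity: the dot product normalized by both vector lengths."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    # A zero vector has no direction, so treat its similarity as 0.0.
    return dot(u, v) / norm if norm else 0.0


# Hypothetical term counts over a three-word vocabulary.
short_doc = [1, 1, 0]
long_doc = [10, 10, 0]   # same word proportions as short_doc, ten times longer
other_doc = [0, 1, 1]    # shares only one word with short_doc

# The dot product grows with document length.
print(dot(short_doc, long_doc), dot(short_doc, other_doc))

# The cosine measure depends only on the direction of the vectors.
print(cosine(short_doc, long_doc), cosine(short_doc, other_doc))
```

Under the dot product, the long document scores far higher simply because it is longer, while the cosine measure scores it as pointing in essentially the same direction as the short one. A clustering method built on one measure can therefore group documents differently than the same method built on the other.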

For the thesis, we evaluate the effects a similarity function may have on clustering. We start by representing a document and a query both as vectors in a high-dimensional space corresponding to the keywords, and then use an appropriate distance measure in k-means to compute the similarity between the document vector and the query vector to …


Text Mining With Exploitation Of User's Background Knowledge : Discovering Novel Association Rules From Text, Xin Chen Jan 2006

Dissertations

The goal of text mining is to find interesting and non-trivial patterns or knowledge from unstructured documents. Both objective and subjective measures have been proposed in the literature to evaluate the interestingness of discovered patterns. However, objective measures alone are insufficient because such measures do not consider the knowledge and interests of the users. Subjective measures require explicit input of user expectations, which is difficult or even impossible to obtain in text mining environments.

This study proposes a user-oriented text-mining framework and applies it to the problem of discovering novel association rules from documents. The developed system, uMining, consists of two …


Customer Relationship Management For Banking System, Pingyu Hou Jan 2004

Theses Digitization Project

The purpose of this project is to design, build, and implement a Customer Relationship Management (CRM) system for a bank. CRM BANKING is an online application that caters to strengthening and stabilizing customer relationships in a bank.


Data Warehouse Applications In Modern Day Business, Carla Mounir Issa Jan 2002

Theses Digitization Project

Data warehousing provides organizations with strategic tools to achieve the competitive advantage that organizations are constantly seeking. The use of tools such as data mining, indexing, and summaries enables management to retrieve information and perform thorough analysis, planning, and forecasting to meet the changes in the market environment. In addition, the data warehouse provides security measures that, if properly planned and implemented, help organizations ensure that their data quality and validity remain intact.


Knowledge Discovery In Biological Databases : A Neural Network Approach, Qicheng Ma Aug 2000

Dissertations

Knowledge discovery in databases, also known as data mining, aims to find significant information in a set of data. The knowledge to be mined from the dataset may refer to patterns, association rules, classification and clustering rules, and so forth. In this dissertation, we present a neural network approach to finding knowledge in biological databases. Specifically, we propose new methods to process biological sequences in two case studies: the classification of protein sequences and the prediction of E. coli promoters in DNA sequences. Our proposed methods, based on neural network architectures, combine techniques ranging from Bayesian inference, coding theory, …