Articles 1 - 17 of 17
Full-Text Articles in Computer Sciences
Enhancing The Draft Assembly With Minhash, Saju Varghese
UNLV Theses, Dissertations, Professional Papers, and Capstones
In this thesis, we report on the use of minhash techniques to improve the draft assembly of a genome mapping. More specifically, we use minhash to compare the scaffolds of sea urchin and sea cucumber genomes.
One of the main contributions of this thesis is the implementation of minhash with the Message Passing Interface (MPI) utilizing Intel Phi co-processors. It is shown that our implementation significantly reduces the processing time for identification of k-mer similarities.
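The MPI and Intel Phi implementation described in the thesis is not reproduced here. As a minimal single-process sketch of the underlying idea, MinHash can estimate the Jaccard similarity between the k-mer sets of two sequences. The function names, the default values of `k` and `num_hashes`, and the use of seeded BLAKE2b as the hash family are all assumptions made for illustration, not details taken from the thesis.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash(kmer, seed):
    """Deterministic 64-bit hash of a k-mer under a given seed (assumed hash family)."""
    data = f"{seed}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(seq, k=4, num_hashes=128):
    """For each seeded hash function, keep the minimum hash over all k-mers.

    Assumes len(seq) >= k so that the k-mer set is not empty.
    """
    items = kmers(seq, k)
    return [min(_hash(x, seed) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots estimates the Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(seq_a, seq_b, k=4):
    """Exact Jaccard similarity of the two k-mer sets, for comparison."""
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    return len(a & b) / len(a | b)
```

Comparing signatures costs time proportional to `num_hashes` rather than to the number of k-mers, which is what makes the technique attractive for large scaffold collections.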
Detecting Anomalously Similar Entities In Unlabeled Data, Lisa D. Friedland
Doctoral Dissertations
In this work, the goal is to detect closely-linked entities within a data set. The entities of interest have a tie causing them to be similar, such as a shared origin or a channel of influence. Given a collection of people or other entities with their attributes or behavior, we identify unusually similar pairs, and we pose the question: Are these two people linked, or can their similarity be explained by chance? Computing similarities is a core operation in many domains, but two constraints differentiate our version of the problem. First, the score assigned to a pair should account for …
Arise-Pie: A People Information Integration Engine Over The Web, Vincent W. Zheng, Tao Hoang, Penghe Chen, Yuan Fang, Xiaoyan Yang
Research Collection School Of Computing and Information Systems
Searching the Web for information about people is common practice. However, searching for such information manually is time-consuming. In this paper, we aim to develop an automatic people information search system, named ARISE-PIE. To build such a system, we tackle two major technical challenges: data harvesting and data integration. For data harvesting, we study how to leverage a search engine to help crawl the relevant Web pages for a target entity; then we propose a novel learning-to-query model that can automatically select a set of "best" queries to maximize collective utility (e.g., precision …
Controlling For Confounding Network Properties In Hypothesis Testing And Anomaly Detection, Timothy La Fond
Open Access Dissertations
An important task in network analysis is the detection of anomalous events in a network time series. These events could merely be times of interest in the network timeline or they could be examples of malicious activity or network malfunction. Hypothesis testing using network statistics to summarize the behavior of the network provides a robust framework for the anomaly detection decision process. Unfortunately, choosing network statistics that are dependent on confounding factors like the total number of nodes or edges can lead to incorrect conclusions (e.g., false positives and false negatives). In this dissertation we describe the challenges that face …
Detection Of Cyberbullying In Sms Messaging, Bryan W. Bradley
Computer Science Summer Fellows
Cyberbullying is a type of bullying that uses technology such as cell phones to harass or malign another person. To detect acts of cyberbullying, we are developing an algorithm that will detect cyberbullying in SMS (text) messages. Over 80,000 text messages have been collected by software installed on cell phones carried by participants in our study. This paper describes the development of the algorithm to detect cyberbullying messages, using the cell phone data collected previously. The algorithm works by first separating the messages into conversations in an automated way. The algorithm then analyzes the conversations and scores the severity and …
Machine Learning Methods For Medical And Biological Image Computing, Rongjian Li
Computer Science Theses & Dissertations
Medical and biological imaging technologies provide valuable visualizations of the structure and function of an organ, from the level of individual molecules to the whole object. The brain is the most complex organ in the body, and it attracts increasingly intense research attention with the rapid development of medical and biological imaging technologies. The massive amount of high-dimensional brain imaging data being generated creates strong demand for computational methods that can analyze those images efficiently. Current computational methods based on hand-crafted features do not scale with the increasing number of brain images, hindering the pace of scientific discoveries …
Robust Median Reversion Strategy For Online Portfolio Selection, Dingjiang Huang, Junlong Zhou, Bin Li, Steven C. H. Hoi, Shuigeng Zhou
Research Collection School Of Computing and Information Systems
On-line portfolio selection has attracted increasing interest from the artificial intelligence community in recent decades. Mean reversion, as one of the most frequent patterns in financial markets, plays an important role in some state-of-the-art strategies. Though successful on certain datasets, existing mean reversion strategies do not fully consider noise and outliers in the data, leading to estimation errors and thus non-optimal portfolios, which results in poor performance in practice. To overcome this limitation, we propose to exploit the reversion phenomenon with a robust L1-median estimator, and design a novel on-line portfolio selection strategy named "Robust Median Reversion" (RMR), which makes optimal portfolios based …
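To illustrate why an L1-median resists outliers better than a mean, the sketch below approximates the L1-median (geometric median) of a set of points with Weiszfeld's iteration, a standard method for this problem. Whether the paper uses exactly this procedure is not stated in the excerpt above; the function name, iteration count, and tolerance are assumptions chosen for illustration.

```python
import math


def l1_median(points, iterations=200, tol=1e-9):
    """Approximate the L1-median (geometric median) by Weiszfeld iteration."""
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    estimate = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    for _ in range(iterations):
        weights_sum = 0.0
        numerator = [0.0] * dim
        for p in points:
            dist = math.dist(p, estimate)
            if dist < tol:
                # Skip points that coincide with the estimate to avoid dividing by zero.
                continue
            w = 1.0 / dist
            weights_sum += w
            for d in range(dim):
                numerator[d] += w * p[d]
        if weights_sum == 0.0:
            # Every point coincides with the estimate; nothing left to update.
            break
        new_estimate = [n / weights_sum for n in numerator]
        converged = math.dist(new_estimate, estimate) < tol
        estimate = new_estimate
        if converged:
            break
    return estimate
```

With four points clustered near (1, 1) and one outlier at (10, 10), the coordinate-wise mean is dragged well away from the cluster, while the value returned by `l1_median` stays close to it.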
Exploring Data Mining Techniques For Tree Species Classification Using Co-Registered Lidar And Hyperspectral Data, Julia K. Marrs
Theses and Dissertations
NASA Goddard’s LiDAR, Hyperspectral, and Thermal imager provides co-registered remote sensing data on experimental forests. Data mining methods were used to achieve a final tree species classification accuracy of 68% using a combined LiDAR and hyperspectral dataset, and show promise for addressing deforestation and carbon sequestration on a species-specific level.
Analyzing Proactive Fraud Detection Software Tools And The Push For Quicker Solutions, Kerri Aiken
Economic Crime Forensics Capstones
This paper focuses on proactive fraud detection software tools and how these tools can help detect and prevent possible fraudulent schemes. In addition to relying on routine audits, companies are designing proactive methods that involve the inclusion of software tools to detect and deter instances of fraud and abuse. This paper discusses examples of companies using ACL and SAS software programs and how the software tools have positively changed their auditing systems.
Novelis Inc., an aluminum and recycling company, implemented ACL into their internal audit software system. Competitive Health Analytics (Division of Humana) implemented SAS in order to improve their …
Mining And Clustering Mobility Evolution Patterns From Social Media For Urban Informatics, Chien-Cheng Chen, Meng-Fen Chiang, Wen-Chih Peng
Research Collection School Of Computing and Information Systems
In this paper, given a set of check-in data, we aim to discover the representative daily movement behavior of users in a city. For example, daily movement behavior on a weekday may show users moving from one spatial region to another, associated with time information. Since check-in data contain both spatial and temporal information, we propose a mobility evolution pattern to capture the daily movement behavior of users in a city. Furthermore, given a set of daily mobility evolution patterns, we formulate their similarity distances and then discover representative mobility evolution patterns via the clustering process. Representative mobility evolution patterns are …
Euclidean Co-Embedding Of Ordinal Data For Multi-Type Visualization, Le, Hady W. Lauw
Research Collection School Of Computing and Information Systems
Embedding deals with reducing the high-dimensional representation of data into a low-dimensional representation. Previous work mostly focuses on preserving similarities among objects. Here, not only do we explicitly recognize multiple types of objects, but we also focus on the ordinal relationships across types. Collaborative Ordinal Embedding or COE is based on generative modelling of ordinal triples. Experiments show that COE outperforms the baselines on objective metrics, revealing its capacity for information preservation for ordinal data.
Identifying Terrorist Affiliations Through Social Network Analysis Using Data Mining Techniques, Govand A. Ali
Information Technology Master Theses
In a technologically enabled world, local ideologically inspired warfare becomes global all too quickly. Specifically, terrorist groups like Al Qaeda and ISIS (Daesh) have successfully used modern computing technology and social networking environments to broadcast their message, recruit new members, and plot attacks. This is especially true for such platforms as Twitter and encrypted mobile apps like Telegram or the clandestine Alrawi. As early detection of such activity is crucial to attack prevention, data mining techniques have become increasingly important in the fight against the spread of global terrorist activity. This study employs data mining tools to mine Twitter for …
Unsupervised Learning Framework For Large-Scale Flight Data Analysis Of Cockpit Human Machine Interaction Issues, Abhishek B. Vaidya
Open Access Theses
As the level of automation within an aircraft increases, the interactions between the pilot and autopilot play a crucial role in its proper operation. Issues with human-machine interactions (HMI) have been cited as one of the main causes behind many aviation accidents. Due to the complexity of such interactions, it is challenging to identify all possible situations and develop the necessary contingencies. In this thesis, we propose a data-driven analysis tool to identify potential HMI issues in a large-scale Flight Operational Quality Assurance (FOQA) dataset. The proposed tool is developed using a multi-level clustering framework, where a set of basic …
A Cloud-Based Framework For Smart Permit System For Buildings, Magdalini Eirinaki, Subhankar Dhar, Shishir Mathur
Faculty Publications
In this paper, we propose a novel cloud-based platform for a building permit system that is efficient, user-friendly, and transparent, and that offers a quick turnaround time for homeowners. Compared to existing permit systems, the proposed smart city permit framework provides a pre-permitting decision workflow and incorporates a data analytics and mining module that enables the continuous improvement of a) the end user experience, by analyzing explicit and implicit user feedback, and b) the permitting and urban planning process, allowing a gleaning of key insights for real estate development and city planning purposes, by analyzing how users interact with the system depending on …
Novel Machine Learning Methods For Modeling Time-To-Event Data, Bhanukiran Vinzamuri
Wayne State University Dissertations
Predicting time-to-event from longitudinal data where different events occur at different time points is an extremely important problem in several domains such as healthcare, economics, social networks and seismology, to name a few. A unique challenge in this problem involves building predictive models from right censored data (also called survival data). Instances whose event of interest is not observed within a given observation time window are considered to be right censored. Effective models for predicting time-to-event labels from such right censored data with good accuracy can have a significant impact in these …
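To make the notion of right censoring concrete, the sketch below implements the classical Kaplan-Meier survival estimator, which uses censored instances as "still at risk" up to their last observed time instead of discarding them. This is a textbook baseline, not one of the dissertation's methods; the function name and the input layout (a list of durations plus a parallel list of event flags) are assumptions for illustration.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct observed event time.

    durations[i] is the follow-up time of instance i; observed[i] is True if the
    event occurred at that time and False if the instance was right censored.
    """
    # Distinct times at which at least one event was actually observed.
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Instances still under observation just before time t, censored or not.
        at_risk = sum(1 for d in durations if d >= t)
        # Events (not censorings) occurring exactly at time t.
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve
```

For durations `[2, 3, 3, 5, 8]` with event flags `[True, False, True, True, False]`, the estimated survival probabilities at times 2, 3, and 5 are approximately 0.8, 0.6, and 0.3. The two censored instances never trigger a drop in the curve, but they do enlarge the at-risk count for earlier event times.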
Novel Dynamic Partial Reconfiguration Implementations Of The Support Vector Machine Classifier On Fpga, Hanaa Hussain, Khaled Benkrid, Hüseyin Şeker
Turkish Journal of Electrical Engineering and Computer Sciences
The support vector machine (SVM) is a highly powerful classifier that has been shown to be capable of dealing with high-dimensional data. However, its complexity increases its computational power requirements. Recent technologies, including postgenome data of a high-dimensional nature, add further complexity to the construction of SVM classifiers. In order to overcome this problem, hardware implementations of the SVM classifier have been proposed to benefit from parallelism to accelerate the SVM. On the other hand, those implementations offer limited flexibility in terms of changing parameters and require the reconfiguration of the whole device. The latter interrupts the operation …
Iot+Small Data: Transforming In-Store Shopping Analytics And Services, Meera Radhakrishnan, Sougata Sen, Vigneshwaran Subbaraju, Archan Misra, Rajesh Balan
Research Collection School Of Computing and Information Systems
We espouse a vision of small data-based immersive retail analytics, where a combination of sensor data, from personal wearable-devices and store-deployed sensors & IoT devices, is used to create real-time, individualized services for in-store shoppers. Key challenges include (a) appropriate joint mining of sensor & wearable data to capture a shopper’s product level interactions, and (b) judicious triggering of power-hungry wearable sensors (e.g., camera) to capture only relevant portions of a shopper’s in-store activities. To explore the feasibility of our vision, we conducted experiments with 5 smartwatch-wearing users who interacted with objects placed on cupboard racks in our lab (to …