Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Theory and Algorithms

2016

Articles 1 - 30 of 42

Full-Text Articles in Databases and Information Systems

Spatial Data Mining Analytical Environment For Large Scale Geospatial Data, Zhao Yang Dec 2016

University of New Orleans Theses and Dissertations

Nowadays, many applications are continuously generating large-scale geospatial data. Vehicle GPS tracking data, aerial surveillance drones, LiDAR (Light Detection and Ranging), world-wide spatial networks, and high resolution optical or Synthetic Aperture Radar imagery data all generate a huge amount of geospatial data. However, as data collection increases, our ability to process this large-scale geospatial data in a flexible fashion is still limited. We propose a framework for processing and analyzing large-scale geospatial and environmental data using a “Big Data” infrastructure. Existing Big Data solutions do not include a specific mechanism to analyze large-scale geospatial data. In this work, we extend …


Databrarianship: The Academic Data Librarian In Theory And Practice, Darren Sweeper Dec 2016

Sprague Library Scholarship and Creative Works

No abstract provided.


Answering Why-Not And Why Questions On Reverse Top-K Queries, Qing Liu, Yunjun Gao, Gang Chen, Baihua Zheng, Linlin Zhou Dec 2016

Research Collection School Of Computing and Information Systems

Why-not and why questions can be posed by database users to seek clarifications on unexpected query results. Specifically, why-not questions aim to explain why certain expected tuples are absent from the query results, while why questions try to clarify why certain unexpected tuples are present in the query results. This paper systematically explores the why-not and why questions on reverse top-k queries, owing to its importance in multi-criteria decision making. We first formalize why-not questions on reverse top-k queries, which try to include the missing objects in the reverse top-k query results, and then, we propose a unified framework called …
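For readers unfamiliar with the query type, the following is a minimal brute-force sketch of a reverse top-k query, not the framework proposed in the paper. It assumes linear scoring with smaller scores ranking higher; the product and user names, the data, and the function names are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def score(weights: Sequence[float], point: Sequence[float]) -> float:
    """Linear score of a point under one user's weight vector (assumed: smaller is better)."""
    return sum(w * p for w, p in zip(weights, point))


def top_k(weights: Sequence[float], points: Dict[str, Point], k: int) -> List[str]:
    """Names of the k best-scoring points for one weight vector (ties broken by name)."""
    ranked = sorted(points, key=lambda name: (score(weights, points[name]), name))
    return ranked[:k]


def reverse_top_k(query: str, points: Dict[str, Point],
                  users: Dict[str, Point], k: int) -> List[str]:
    """Users whose top-k result contains the query point."""
    return [u for u, w in users.items() if query in top_k(w, points, k)]


# Hypothetical data: three products with two attributes, three user preference vectors.
products = {"p1": (1.0, 9.0), "p2": (4.0, 4.0), "p3": (9.0, 1.0)}
users = {"alice": (0.9, 0.1), "bob": (0.5, 0.5), "carol": (0.1, 0.9)}

print(reverse_top_k("p2", products, users, k=1))
print(reverse_top_k("p2", products, users, k=2))
```

In this toy setting, a why-not question would ask why a particular user is missing from the reverse top-1 result for `p2`; raising k to 2 is one way to bring missing users into the result.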


Large Scale Data Mining For IT Service Management, Chunqiu Zeng Nov 2016

FIU Electronic Theses and Dissertations

More than ever, businesses heavily rely on IT service delivery to meet their current and frequently changing business requirements. Optimizing the quality of service delivery improves customer satisfaction and continues to be a critical driver for business growth. The routine maintenance procedure plays a key function in IT service management, which typically involves problem detection, determination and resolution for the service infrastructure.

Many IT Service Providers adopt partial automation for incident diagnosis and resolution where the operation of the system administrators and automation operation are intertwined. Often the system administrators' roles are limited to helping triage tickets to the processing …


State Preserving Extreme Learning Machine For Face Recognition, Md. Zahangir Alom, Paheding Sidike, Vijayan K. Asari, Tarek M. Taha Oct 2016

Vijayan K. Asari

Extreme Learning Machine (ELM) has been introduced as a new algorithm for training single hidden layer feed-forward neural networks (SLFNs) instead of the classical gradient-based algorithms. Based on the consistency property of data, which enforce similar samples to share similar properties, ELM is a biologically inspired learning algorithm with SLFNs that learns much faster with good generalization and performs well in classification applications. However, the random generation of the weight matrix in current ELM based techniques leads to the possibility of unstable outputs in the learning and testing phases. Therefore, we present a novel approach for computing the weight matrix …


Online Adaptive Passive-Aggressive Methods For Non-Negative Matrix Factorization And Its Applications, Chenghao Liu, Steven C. H. Hoi, Peilin Zhao, Jianling Sun, Ee-Peng Lim Oct 2016

Research Collection School Of Computing and Information Systems

This paper aims to investigate efficient and scalable machine learning algorithms for resolving Non-negative Matrix Factorization (NMF), which is important for many real-world applications, particularly for collaborative filtering and recommender systems. Unlike traditional batch learning methods, a recently proposed online learning technique named "NN-PA" tackles NMF by applying the popular Passive-Aggressive (PA) online learning, and found promising results. Despite its simplicity and high efficiency, NN-PA falls short in at least two critical limitations: (i) it only exploits the first-order information and thus may converge slowly especially at the beginning of online learning tasks; (ii) it is sensitive to some key …
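As background, the classic first-order Passive-Aggressive update for binary classification can be sketched as follows. This is the textbook PA rule with hinge loss, not NN-PA or the method proposed in the paper; the function names are hypothetical.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def pa_update(w: List[float], x: Sequence[float], y: int) -> List[float]:
    """One Passive-Aggressive step for a label y in {-1, +1}.

    Passive: if the hinge loss is zero, the weights are returned unchanged.
    Aggressive: otherwise, move just far enough that the margin on (x, y) becomes 1.
    """
    loss = max(0.0, 1.0 - y * dot(w, x))
    norm_sq = dot(x, x)
    if loss == 0.0 or norm_sq == 0.0:
        return list(w)
    tau = loss / norm_sq  # step size of the basic PA variant
    return [wi + tau * y * xi for wi, xi in zip(w, x)]


w = pa_update([0.0, 0.0], [1.0, 2.0], +1)
print(w, dot(w, [1.0, 2.0]))
```

The update uses only first-order information (the current example and its loss), which is the limitation the abstract points to.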


Representation Learning For Homophilic Preferences, Trong T. Nguyen, Hady W. Lauw Sep 2016

Research Collection School Of Computing and Information Systems

Users express their personal preferences through ratings, adoptions, and other consumption behaviors. We seek to learn latent representations for user preferences from such behavioral data. One representation learning model that has been shown to be effective for large preference datasets is Restricted Boltzmann Machine (RBM). While homophily, or the tendency of friends to share their preferences at some level, is an established notion in sociology, thus far it has not yet been clearly demonstrated on RBM-based preference models. The question lies in how to appropriately incorporate social network into the architecture of RBM-based models for learning representations of preferences. In this …


Probabilistic Models For Contextual Agreement In Preferences, Loc Do, Hady W. Lauw Sep 2016

Research Collection School Of Computing and Information Systems

The long-tail theory for consumer demand implies the need for more accurate personalization technologies to target items to the users who most desire them. A key tenet of personalization is the capacity to model user preferences. Most of the previous work on recommendation and personalization has focused primarily on individual preferences. While some focus on shared preferences between pairs of users, they assume that the same similarity value applies to all items. Here we investigate the notion of "context," hypothesizing that while two users may agree on their preferences on some items, they may also disagree on other items. To …


Autoquery: Automatic Construction Of Dependency Queries For Code Search, Shaowei Wang, David Lo, Lingxiao Jiang Sep 2016

Research Collection School Of Computing and Information Systems

Many code search techniques have been proposed to return relevant code for a user query expressed as textual descriptions. However, source code is not mere text. It contains dependency relations among various program elements. To leverage these dependencies for more accurate code search results, techniques have been proposed to allow user queries to be expressed as control and data dependency relationships among program elements. Although such techniques have been shown to be effective for finding relevant code, it remains a question whether appropriate queries can be generated by average users. In this work, we address this concern by proposing a …


Modeling Sequential Preferences With Dynamic User And Context Factors, Duc Trong Le, Yuan Fang, Hady W. Lauw Sep 2016

Research Collection School Of Computing and Information Systems

Users express their preferences for items in diverse forms, through their liking for items, as well as through the sequence in which they consume items. The latter, referred to as “sequential preference”, manifests itself in scenarios such as song or video playlists, topics one reads or writes about in social media, etc. The current approach to modeling sequential preferences relies primarily on the sequence information, i.e., which item follows another item. However, there are other important factors, due to either the user or the context, which may dynamically affect the way a sequence unfolds. In this work, we develop generative …


Control Flow Integrity Enforcement With Dynamic Code Optimization, Yan Lin, Xiaoxiao Tang, Debin Gao, Jianming Fu Sep 2016

Research Collection School Of Computing and Information Systems

Control Flow Integrity (CFI) is an attractive security property with which most injected and code reuse attacks can be defeated, including advanced attacking techniques like Return-Oriented Programming (ROP). However, comprehensive enforcement of CFI is expensive due to additional supports needed (e.g., compiler support and presence of relocation or debug information) and performance overhead. Recent research has been trying to strike the balance among reasonable approximation of the CFI properties, minimal additional supports needed, and acceptable performance. We investigate existing dynamic code optimization techniques and find that they provide an architecture on which CFI can be enforced effectively and efficiently. In …


Soft Confidence-Weighted Learning, Jialei Wang, Peilin Zhao, Steven C. H. Hoi Sep 2016

Research Collection School Of Computing and Information Systems

Online learning plays an important role in many big data mining problems because of its high efficiency and scalability. In the literature, many online learning algorithms using gradient information have been applied to solve online classification problems. Recently, more effective second-order algorithms have been proposed, where the correlation between the features is utilized to improve the learning efficiency. Among them, Confidence-Weighted (CW) learning algorithms are very effective, which assume that the classification model is drawn from a Gaussian distribution, which enables the model to be effectively updated with the second-order information of the data stream. Despite being studied actively, these CW algorithms cannot handle nonseparable datasets and noisy datasets very …


A Novel Digital Image Classification Algorithm Via Low-Rank Sparse Bag-Of-Features Model, Xiu-Ming Zou, Huai-Jiang Sun, Sai Yang, Yan Zhu Aug 2016

Research Collection School of Computing and Information Systems

Bag-of-features (BoF) is one of the most well-known methods used to represent digital image features because of its simplicity and efficiency. A variety of improved algorithms have been employed to enhance the performance of BoF in characterization. However, challenges in the application of BoF in the field still exist. This study focused on BoF by decomposing local features and presented a novel framework for BoF on the basis of low-rank and sparse matrix decomposition to obtain a more robust and discriminative digital image classification. First, the local feature matrix of a digital image is decomposed into a low-rank matrix and …


User Identity Linkage By Latent User Space Modelling, Xin Mu, Feida Zhu, Ee-Peng Lim, Jing Xiao, Jianzong Wang, Zhi-Hua Zhou Aug 2016

Research Collection School Of Computing and Information Systems

User identity linkage across social platforms is an important problem of great research challenge and practical value. In real applications, the task often assumes an extra degree of difficulty by requiring linkage across multiple platforms. While pair-wise user linkage between two platforms, which has been the focus of most existing solutions, provides reasonably convincing linkage, the result depends by nature on the order of platform pairs in execution with no theoretical guarantee on its stability. In this paper, we explore a new concept of “Latent User Space” to more naturally model the relationship between the underlying real users and their …


Latent Semantic Indexing In The Discovery Of Cyber-Bullying In Online Text, Jacob L. Bigelow Jul 2016

Computer Science Summer Fellows

The rise in the use of social media and particularly the rise of adolescent use has led to a new means of bullying. Cyber-bullying has proven consequential to youth internet users causing a need for a response. In order to effectively stop this problem we need a verified method of detecting cyber-bullying in online text; we aim to find that method. For this project we look at thirteen thousand labeled posts from Formspring and create a bank of words used in the posts. First the posts are cleaned up by taking out punctuation, normalizing emoticons, and removing high and low …


Detection Of Cyberbullying In SMS Messaging, Bryan W. Bradley Jul 2016

Computer Science Summer Fellows

Cyberbullying is a type of bullying that uses technology such as cell phones to harass or malign another person. To detect acts of cyberbullying, we are developing an algorithm that will detect cyberbullying in SMS (text) messages. Over 80,000 text messages have been collected by software installed on cell phones carried by participants in our study. This paper describes the development of the algorithm to detect cyberbullying messages, using the cell phone data collected previously. The algorithm works by first separating the messages into conversations in an automated way. The algorithm then analyzes the conversations and scores the severity and …


On Understanding Preference For Agile Methods Among Software Developers, David Brian Bishop, Amit V. Deokar, Surendra Sarnikar Jul 2016

Research & Publications

Agile methods are gaining widespread use in industry. Although management is keen on adopting agile, not all developers exhibit preference for agile methods. The literature is sparse in regard to why developers may show preference for agile. Understanding the factors informing the preference for agile can lead to more effective formation of teams, better training approaches, and optimizing software development efforts by focusing on key desirable components of agile. This study, using a grounded theory methodology, finds a variety of categories of factors that influence software developer preference for agile methods including self-efficacy, affective response, interpersonal response, external contingencies, and …


Real-Time Salient Object Detection With A Minimum Spanning Tree, Wei-Chih Tu, Shengfeng He, Qingxiong Yang, Shao-Yi Chien Jul 2016

Research Collection School Of Computing and Information Systems

In this paper, we present a real-time salient object detection system based on the minimum spanning tree. Due to the fact that background regions are typically connected to the image boundaries, salient objects can be extracted by computing the distances to the boundaries. However, measuring the image boundary connectivity efficiently is a challenging problem. Existing methods either rely on superpixel representation to reduce the processing units or approximate the distance transform. Instead, we propose an exact and iteration free solution on a minimum spanning tree. The minimum spanning tree representation of an image inherently reveals the object geometry information in …


Online Sparse Passive Aggressive Learning With Kernels, Jing Lu, Peilin Zhao, Steven C. H. Hoi May 2016

Research Collection School Of Computing and Information Systems

Conventional online kernel methods often yield an unbounded large number of support vectors, making them inefficient and non-scalable for large-scale applications. Recent studies on bounded kernel-based online learning have attempted to overcome this shortcoming. Although they can bound the number of support vectors at each iteration, most of them fail to bound the number of support vectors for the final output solution which is often obtained by averaging the series of solutions over all the iterations. In this paper, we propose a novel kernel-based online learning method, Sparse Passive Aggressive learning (SPA), which can output a final solution with a bounded number of support vectors. The key idea of …


User Interface Design, Moritz Stefaner, Sebastien Ferre, Saverio Perugini, Jonathan Koren, Yi Zhang Apr 2016

Saverio Perugini

As detailed in Chap. 1, system implementations for dynamic taxonomies and faceted search allow a wide range of query possibilities on the data. Only when these are made accessible by appropriate user interfaces can the resulting applications support a variety of search, browsing and analysis tasks. User interface design in this area is confronted with specific challenges. This chapter presents an overview of both established and novel principles and solutions.


Program Transformations For Information Personalization, Saverio Perugini, Naren Ramakrishnan Apr 2016

Saverio Perugini

Personalization constitutes the mechanisms necessary to automatically customize information content, structure, and presentation to the end user to reduce information overload. Unlike traditional approaches to personalization, the central theme of our approach is to model a website as a program and conduct website transformation for personalization by program transformation (e.g., partial evaluation, program slicing). The goal of this paper is to study personalization through a program transformation lens and develop a formal model, based on program transformations, for personalized interaction with hierarchical hypermedia. The specific research issues addressed involve identifying and developing program representations and transformations suitable for classes of hierarchical …


A Tool For Staging Mixed-Initiative Dialogs, Joshua W. Buck, Saverio Perugini Apr 2016

Saverio Perugini

We discuss and demonstrate a tool for prototyping dialog-based systems that, given a high-level specification of a human-computer dialog, stages the dialog for interactive use. The tool enables a dialog designer to evaluate a variety of dialogs without having to program each individual dialog, and serves as a proof-of-concept for our approach to mixed-initiative dialog modeling and implementation from a programming language-based perspective.


A Study Of Android Malware Detection Techniques And Machine Learning, Balaji Baskaran, Anca Ralescu Apr 2016

MAICS: The Modern Artificial Intelligence and Cognitive Science Conference

Android OS is one of the most widely used mobile operating systems. The number of malicious applications and adware is increasing constantly, on par with the number of mobile devices. A great number of commercial signature-based tools are available on the market, which prevent to an extent the penetration and distribution of malicious applications. Numerous studies have claimed that traditional signature-based detection systems work well up to a certain level and that malware authors use numerous techniques to evade these tools. So given this state of affairs, there is an increasing need for an alternative, really tough malware …


Extended Pixel Representation For Image Segmentation, Deeptha Girish, Vineeta Singh, Anca Ralescu Apr 2016

MAICS: The Modern Artificial Intelligence and Cognitive Science Conference

We explore the use of extended pixel representation for color based image segmentation using the K-means clustering algorithm. Various extended pixel representations have been implemented in this paper and their results have been compared. By extending the representation of pixels an image is mapped to a higher dimensional space. Unlike other approaches, where data is mapped into an implicit features space of higher dimension (kernel methods), in the approach considered here, the higher dimensions are defined explicitly. Preliminary experimental results which illustrate the proposed approach are promising.
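The idea of explicitly extending each pixel before clustering can be illustrated with a small sketch. The abstract does not say which extended representations were used, so the extra dimensions below (two channel differences and the mean intensity) are purely an assumption, as are the function names and the naive deterministic K-means initialization.

```python
from typing import List, Tuple

Vector = Tuple[float, ...]


def extend_pixel(rgb: Tuple[float, float, float]) -> Vector:
    """Map an RGB pixel to an explicit higher-dimensional vector (hypothetical choice of features)."""
    r, g, b = rgb
    return (r, g, b, r - g, g - b, (r + g + b) / 3.0)


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Vector], k: int, iterations: int = 10) -> List[int]:
    """Plain K-means; the first k points serve as initial centroids (a simplification)."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


# A tiny hypothetical "image": reddish and bluish pixels, flattened to a list.
image = [(250, 10, 10), (10, 10, 250), (240, 20, 15), (20, 15, 240)]
labels = kmeans([extend_pixel(p) for p in image], k=2)
print(labels)
```

Because the extra dimensions are computed directly, the clustering runs in the extended space without any kernel function, which is the contrast with kernel methods that the abstract draws.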


An Autonomic Computing System Based On A Rule-Based Policy Engine And Artificial Immune Systems, Rahmira Rufus, William Nick, Joseph Shelton, Albert Esterline Apr 2016

MAICS: The Modern Artificial Intelligence and Cognitive Science Conference

Autonomic computing systems arose from the notion that complex computing systems should have properties like those of the autonomic nervous system, which coordinates bodily functions and allows attention to be directed to more pressing needs. An autonomic system allows the system administrator to specify high-level policies, which the system maintains without administrator assistance. Policy enforcement can be done with a rule-based system such as Jess (a Java expert system shell). An autonomic system must be able to monitor itself, and this is often a limiting factor. We are developing an automatic system that has a policy engine and uses …


Towards The Development Of A Cyber Analysis & Advisement Tool (CAAT) For Mitigating De-Anonymization Attacks, Siobahn Day, Henry Williams, Joseph Shelton, Gerry Dozier Apr 2016

MAICS: The Modern Artificial Intelligence and Cognitive Science Conference

We are seeing a rise in the number of Anonymous Social Networks (ASN) that claim to provide a sense of user anonymity. However, what many users of ASNs do not know is that a person can be identified by their writing style.

In this paper, we provide an overview of a number of author concealment techniques, their impact on the semantic meaning of an author's original text, and introduce AuthorCAAT, an application for mitigating de-anonymization attacks. Our results show that iterative paraphrasing performs the best in terms of author concealment and performs well with respect to Latent Semantic Analysis.


Situations And Evidence For Identity Using Dempster-Shafer Theory, William Nick, Yenny Dominguez, Albert Esterline Apr 2016

MAICS: The Modern Artificial Intelligence and Cognitive Science Conference

We present a computational framework for identity based on Barwise and Devlin’s situation theory. We present an example with constellations of situations identifying an individual to create what we call id-situations, where id-actions are performed, along with supporting situations. We use Semantic Web standards to represent and reason about the situations in our example. We show how to represent the strength of the evidence, within the situations, as a measure of the support for judgments reached in the id-situation. To measure evidence of an identity from the supporting situations, we use the Dempster-Shafer theory of evidence. We enhance Dempster-Shafer …
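As a reminder of how evidence is pooled in Dempster-Shafer theory, here is a minimal sketch of Dempster's rule of combination. It shows the standard rule only, not the enhancement the abstract alludes to; the two-person frame of discernment and the mass values are hypothetical.

```python
from itertools import product
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule: multiply masses of intersecting focal sets, then renormalize
    by the mass that did not fall on the empty set (1 minus the conflict)."""
    combined: Mass = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


# Hypothetical evidence about who an observed individual is.
ALICE, BOB = frozenset({"alice"}), frozenset({"bob"})
EITHER = ALICE | BOB
source_1 = {ALICE: 0.6, EITHER: 0.4}             # e.g. one supporting situation
source_2 = {ALICE: 0.5, BOB: 0.3, EITHER: 0.2}   # e.g. another supporting situation

pooled = combine(source_1, source_2)
print(pooled[ALICE], pooled[BOB], pooled[EITHER])
```

Mass assigned to the whole frame (`EITHER`) represents ignorance, which is what lets a weak source leave the other source's judgment largely intact.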


Student Understanding And Engagement In A Class Employing COMPS Computer Mediated Problem Solving: A First Look, Jung Hee Kim, Michael Glass, Taehee Kim, Kelvin Bryant, Angelica Willis, Ebonie Mcneil, Zachery Thomas Apr 2016

MAICS: The Modern Artificial Intelligence and Cognitive Science Conference

COMPS computer-mediated group discussion exercises are being added to a second-semester computer programming class. The class is a gateway for computer science and computer engineering students, where many students have difficulty succeeding well enough to proceed in their major. This paper reports on first results of surveys on student experience with the exercises. It also reports on the affective states observed in the discussions that are candidates for analysis of group functioning. As a step toward computer monitoring of the discussions, an experiment in using dialogue features to identify the gender of the participants is described.


A Tool For Staging Mixed-Initiative Dialogs, Joshua W. Buck, Saverio Perugini Apr 2016

MAICS: The Modern Artificial Intelligence and Cognitive Science Conference

We discuss and demonstrate a tool for prototyping dialog-based systems that, given a high-level specification of a human-computer dialog, stages the dialog for interactive use. The tool enables a dialog designer to evaluate a variety of dialogs without having to program each individual dialog, and serves as a proof-of-concept for our approach to mixed-initiative dialog modeling and implementation from a programming language-based perspective.


Keynote Talk 2: Social And Perceptual Fidelity Of Avatars And Autonomous Agents In Virtual Reality, Benjamin Kunz Apr 2016

MAICS: The Modern Artificial Intelligence and Cognitive Science Conference

Advances in display, computing and sensor technologies have led to a revival of interest and excitement surrounding immersive virtual reality. Here, on the cusp of the arrival of practical and affordable virtual reality technology, are open questions regarding the factors that contribute to compelling and immersive virtual worlds.

In order for virtual reality to be useful as a tool for use in training, education, communication, research, content-creation and entertainment, we must understand the degree to which the perception of the virtual environment and virtual characters resembles perception of the real world.

Relatedly, virtual reality's utility in these contexts demands evidence …