Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems

2017

Articles 1 - 28 of 28

Full-Text Articles in Artificial Intelligence and Robotics

Ethics And Bias In Machine Learning: A Technical Study Of What Makes Us “Good”, Ashley Nicole Shadowen Dec 2017

Student Theses

The topic of machine ethics is growing in recognition and energy, but bias in machine learning algorithms outpaces it to date. Bias is a complicated term with good and bad connotations in the field of algorithmic prediction making. Especially in circumstances with legal and ethical consequences, we must study the results of these machines to ensure fairness. This paper attempts to address ethics at the algorithmic level of autonomous machines. There is no single solution to machine bias; it depends on the context of the given system and the most reasonable way to avoid biased decisions while maintaining the …


Leveraging The Trade-Off Between Accuracy And Interpretability In A Hybrid Intelligent System, Di Wang, Chai Quek, Ah-Hwee Tan, Chunyan Miao, Geok See Ng, You Zhou Dec 2017

Research Collection School Of Computing and Information Systems

The Neural Fuzzy Inference System (NFIS) is a widely adopted paradigm for developing data-driven learning systems. This hybrid system has been widely adopted due to its accurate reasoning procedure and comprehensible inference rules. Although most NFISs primarily focus on accuracy, we have observed an ever-increasing demand for improving the interpretability of NFISs and other types of machine learning systems. In this paper, we illustrate how we leverage the trade-off between accuracy and interpretability in an NFIS called Genetic Algorithm and Rough Set Incorporated Neural Fuzzy Inference System (GARSINFIS). In a nutshell, GARSINFIS self-organizes its network structure with a small …


NBPMF: Novel Network-Based Inference Methods For Peptide Mass Fingerprinting, Zhewei Liang Nov 2017

Electronic Thesis and Dissertation Repository

Proteins are large, complex molecules that perform a vast array of functions in every living cell. A proteome is a set of proteins produced in an organism, and proteomics is the large-scale study of proteomes. Several high-throughput technologies have been developed in proteomics, where the most commonly applied are mass spectrometry (MS) based approaches. MS is an analytical technique for determining the composition of a sample. Recently it has become a primary tool for protein identification, quantification, and post translational modification (PTM) characterization in proteomics research. There are usually two different ways to identify proteins: top-down and bottom-up. Top-down approaches …


Controversy Analysis And Detection, Shiri Dori-Hacohen Nov 2017

Doctoral Dissertations

Seeking information on a controversial topic is often a complex task. Alerting users about controversial search results can encourage critical literacy, promote healthy civic discourse and counteract the "filter bubble" effect, and therefore would be a useful feature in a search engine or browser extension. Additionally, presenting information to the user about the different stances or sides of the debate can help her navigate the landscape of search results beyond a simple "list of 10 links". This thesis has made strides in the emerging niche of controversy detection and analysis. The body of work in this thesis revolves around two …


An Integrated Framework For Modeling And Predicting Spatiotemporal Phenomena In Urban Environments, Tuc Viet Le Nov 2017

Dissertations and Theses Collection (Open Access)

This thesis proposes a general solution framework that integrates methods in machine learning in creative ways to solve a diverse set of problems arising in urban environments. It particularly focuses on modeling spatiotemporal data for the purpose of predicting urban phenomena. Concretely, the framework is applied to solve three specific real-world problems: human mobility prediction, traffic speed prediction and incident prediction. For human mobility prediction, I use visitor trajectories collected at a large theme park in Singapore as a simplified microcosm of an urban area. A trajectory is an ordered sequence of attraction visits and corresponding timestamps produced by a visitor. …


Leveraging Social Analytics Data For Identifying Customer Segments For Online News Media, Jansen, Bernard J, Soon-Gyo Jung, Jisun An, Haewoon Kwak, Haewoon Kwak Nov 2017

Research Collection School Of Computing and Information Systems

In this work, we describe a methodology for leveraging large amounts of customer interaction data with online content from major social media platforms in order to isolate meaningful customer segments. The methodology is robust in that it can rapidly identify diverse customer segments using solely online behaviors and then associate these behavioral customer segments with the related distinct demographic segments, presenting a holistic picture of the customer base of an organization. We validate our methodology via the implementation of a working system that rapidly and in near real-time processes tens of millions of online customer interactions with content posted on …


Interactive Social Recommendation, Xin Wang, Steven C. H. Hoi, Chenghao Liu, Martin Ester Nov 2017

Research Collection School Of Computing and Information Systems

Social recommendation has been an active research topic over the last decade, based on the assumption that social information from friendship networks is beneficial for improving recommendation accuracy, especially when dealing with cold-start users who lack sufficient past behavior information for accurate recommendation. However, it is nontrivial to use such information, since some of a person's friends may share similar preferences in certain aspects, but others may be totally irrelevant for recommendations. Thus one challenge is to explore and exploit the extent to which a user trusts his/her friends when utilizing social information to improve recommendations. On the other hand, …


Machine Learning Based Protein Sequence To (Un)Structure Mapping And Interaction Prediction, Sumaiya Iqbal Aug 2017

University of New Orleans Theses and Dissertations

Proteins are the fundamental macromolecules within a cell that carry out most of the biological functions. The computational study of protein structure and function, using machine learning and data analytics, is fundamental to advancing life-science research, given the fast-growing volume of biological data and the extensive complexity involved in analyzing it to discover meaningful insights. Mapping of a protein’s primary sequence is not limited to its structure; we extend it to the disordered component, known as Intrinsically Disordered Proteins or Regions (IDPs/IDRs), and hence to the dynamics involved, which help us explain complex interaction within a cell that …


Ancr—An Adaptive Network Coding Routing Scheme For Wsns With Different-Success-Rate Links †, Xiang Ji, Anwen Wang, Chunyu Li, Chun Ma, Yao Peng, Dajin Wang, Qingyi Hua, Feng Chen, Dingyi Fang Aug 2017

Department of Computer Science Faculty Scholarship and Creative Works

As the underlying infrastructure of the Internet of Things (IoT), wireless sensor networks (WSNs) have been widely used in many applications. Network coding is a technique in WSNs that combines multiple channels of data in one transmission, wherever possible, to save nodes’ energy as well as increase network throughput. So far, most works on network coding rely on two assumptions to determine coding opportunities: (1) all the links in the network have the same transmission success rate; (2) each link is bidirectional and has the same transmission success rate in both directions. However, these assumptions may not be …


Online Multitask Relative Similarity Learning, Shuji Hao, Peilin Zhao, Yong Liu, Steven C. H. Hoi, Chunyan Miao Aug 2017

Research Collection School Of Computing and Information Systems

Relative similarity learning (RSL) aims to learn similarity functions from data with relative constraints. Most previous algorithms developed for RSL are batch-based learning approaches which suffer from poor scalability when dealing with real world data arriving sequentially. These methods are often designed to learn a single similarity function for a specific task. Therefore, they may be sub-optimal to solve multiple task learning problems. To overcome these limitations, we propose a scalable RSL framework named OMTRSL (Online Multi-Task Relative Similarity Learning). Specifically, we first develop a simple yet effective online learning algorithm for multi-task relative similarity learning. Then, we also propose …


DeepFacade: A Deep Learning Approach To Facade Parsing, Hantang Liu, Jialiang Zhang, Jianke Zhu, Steven C. H. Hoi Aug 2017

Research Collection School Of Computing and Information Systems

The parsing of building facades is a key component of the problem of 3D street scene reconstruction, which has long been desired in computer vision. In this paper, we propose a deep learning based method for segmenting a facade into semantic categories. Man-made structures often exhibit symmetry. Based on this observation, we propose a symmetric regularizer for training the neural network. Our proposed method can make use of both the power of deep neural networks and the structure of man-made architectures. We also propose a method to refine the segmentation results using bounding boxes generated by the Region …


Dynamic Adversarial Mining - Effectively Applying Machine Learning In Adversarial Non-Stationary Environments., Tegjyot Singh Sethi Aug 2017

Electronic Theses and Dissertations

While the understanding of machine learning and data mining is still in its budding stages, their engineering applications have found immense acceptance and success. Cybersecurity applications such as intrusion detection systems, spam filtering, and CAPTCHA authentication have all begun adopting machine learning as a viable technique to deal with large-scale adversarial activity. However, the naive usage of machine learning in an adversarial setting is prone to reverse engineering and evasion attacks, as most of these techniques were designed primarily for a static setting. The security domain is a dynamic landscape, with an ongoing, never-ending arms race …


Incentivizing The Use Of Bike Trailers For Dynamic Repositioning In Bike Sharing Systems, Supriyo Ghosh, Pradeep Varakantham Jul 2017

Research Collection School Of Computing and Information Systems

A Bike Sharing System (BSS) is a green mode of transportation that is employed extensively for short-distance travel in major cities of the world. Unfortunately, users' behaviour, driven by their personal needs, can often result in empty or full base stations, thereby resulting in loss of customer demand. To counter this loss in customer demand, BSS operators typically utilize a fleet of carrier vehicles for repositioning the bikes between stations. However, this fuel-burning mode of repositioning incurs significant routing and labor costs and further increases carbon emissions. Therefore, we propose a potentially self-sustaining and environment friendly …


Question Type Recognition Using Natural Language Input, Aishwarya Soni Jun 2017

Master's Projects

Recently, numerous specialists have concentrated on applying Natural Language Processing (NLP) systems in various domains, for example, data extraction and content mining. One of the difficulties with these technologies is building a precise Question and Answering (QA) system. Question type recognition is the most significant task in a QA system such as a chatbot. Organizations such as the National Institute of Standards and Technology (NIST) host the Text REtrieval Conference (TREC) series, which holds a competition every year to encourage and improve techniques for information retrieval from a large corpus of text. When a user …


Improving Text Classification With Word Embedding, Lihao Ge Jun 2017

Master's Projects

One challenge in text classification is that it is hard to perform feature reduction based on the meaning of the features. An improper feature reduction may even worsen the classification accuracy. Word2Vec, a word embedding method, has recently been gaining popularity due to its high precision in analyzing the semantic similarity between words at relatively low computational cost. However, only a limited number of researchers have focused on feature reduction using Word2Vec. In this project, we developed a Word2Vec based method to reduce the feature size while increasing the classification accuracy. The feature reduction is achieved by loosely …


An Open Source Discussion Group Recommendation System, Sarika Padmashali May 2017

Master's Projects

A recommendation system analyzes user behavior on a website to make suggestions about what a user should do in the future on the website. It basically tries to predict the “rating” or “preference” a user would have for an action. Yioop is an open source search engine, wiki system, and user discussion group system managed by Dr. Christopher Pollett at SJSU. In this project, we have developed a recommendation system for Yioop where users are given suggestions about the threads and groups they could join based on their user history. We have used collaborative filtering techniques to make recommendations and …


Document Classification Using Machine Learning, Ankit Basarkar May 2017

Master's Projects

To perform document classification algorithmically, documents need to be represented in a form that is understandable to the machine learning classifier. The report discusses the different types of feature vectors through which documents can be represented and later classified. The project aims at comparing the Binary, Count and TfIdf feature vectors and their impact on document classification. To test how well each of the three feature vectors performs, we used the 20-newsgroup dataset and converted the documents to all three feature vectors. For each feature vector representation, we trained the Naïve Bayes classifier and then tested the generated classifier …
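To make the three representations concrete, the sketch below builds Binary, Count and TfIdf vectors for already-tokenized documents. It is illustrative only and is not taken from the project: the function names are hypothetical, tokenization is assumed to have been done beforehand, and the TfIdf weighting (raw count times log(N / document frequency)) is one common variant among several.

```python
import math
from collections import Counter
from typing import List


def build_vocabulary(docs: List[List[str]]) -> List[str]:
    """Return the sorted list of distinct tokens across all documents."""
    return sorted({token for doc in docs for token in doc})


def count_vector(doc: List[str], vocab: List[str]) -> List[int]:
    """Count feature vector: number of occurrences of each vocabulary term."""
    counts = Counter(doc)
    return [counts[term] for term in vocab]


def binary_vector(doc: List[str], vocab: List[str]) -> List[int]:
    """Binary feature vector: 1 if the term occurs in the document, else 0."""
    present = set(doc)
    return [1 if term in present else 0 for term in vocab]


def tfidf_vectors(docs: List[List[str]], vocab: List[str]) -> List[List[float]]:
    """TfIdf feature vectors using count * log(N / df).

    Assumption: vocab was built from docs, so every term has df >= 1.
    """
    n_docs = len(docs)
    doc_sets = [set(doc) for doc in docs]
    idf = {
        term: math.log(n_docs / sum(1 for s in doc_sets if term in s))
        for term in vocab
    }
    return [
        [count * idf[term] for count, term in zip(count_vector(doc, vocab), vocab)]
        for doc in docs
    ]
```

With this weighting, a term that appears in every document receives a TfIdf weight of zero, which is one reason the three representations can lead to different classifier behaviour.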


A Chatbot Framework For Yioop, Harika Nukala May 2017

Master's Projects

Over the past few years, messaging applications have become more popular than social networking sites. Instead of requiring users to use a specific application or website to access a service, chatbots are created on messaging platforms so that users can interact with companies’ products and get assistance as needed. In this project, we designed and implemented a chatbot framework for Yioop. The goal of the Chatbot Framework for Yioop project is to provide a platform for developers in Yioop to build and deploy chatbot applications. A chatbot is a web service that can converse with users using artificial intelligence in messaging platforms. …


Named Entity Recognition And Classification For Natural Language Inputs At Scale, Shreeraj Dabholkar May 2017

Master's Projects

Natural language processing (NLP) is a technique by which computers can analyze, understand, and derive meaning from human language. Phrases in a body of natural text that represent names, such as those of persons, organizations or locations, are referred to as named entities. Identifying and categorizing these named entities is still a challenging task on which research has been carried out for many years. In this project, we build a supervised learning based classifier which can perform named entity recognition and classification (NERC) on input text and implement it as part of a chatbot application. The implementation is then scaled …


Headline Generation Using Deep Neural Networks, Dhruven Vora May 2017

Master's Projects

News headline generation is an important text summarization task. Human-generated news headlines are generally intended to catch the eye rather than provide useful information. Many approaches generate meaningful headlines using either neural networks or linguistic features. In this report, we propose a novel approach that integrates Hedge Trimmer, a grammar-based extractive summarization system, with a deep neural network abstractive summarization system to generate meaningful headlines. We compare the results against a current recurrent neural network based headline generation system.


Real-Time Prediction Of Length Of Stay Using Passive Wi-Fi Sensing, Truc Viet Le, Baoyang Song, Laura Wynter May 2017

Research Collection School Of Computing and Information Systems

The proliferation of wireless technologies in today's everyday life is one of the key drivers of the Internet of Things (IoT). In addition to being an enabler of connectivity, the vast penetration of wireless devices today gives rise to a secondary functionality as a means of tracking and localization of the devices themselves. Indeed, in order to discover and automatically connect to known Wi-Fi networks, mobile devices have to scan and broadcast the so-called probe requests on all available channels, which can be captured and analyzed in a non-intrusive manner. Thus, one of the key applications of this feature is …


A Compare-Aggregate Model For Matching Text Sequences, Shuohang Wang, Jing Jiang Apr 2017

Research Collection School Of Computing and Information Systems

Many NLP tasks, including machine comprehension, answer selection and text entailment, require the comparison between sequences. Matching the important units between sequences is key to solving these problems. In this paper, we present a general "compare-aggregate" framework that performs word-level matching followed by aggregation using Convolutional Neural Networks. We particularly focus on the different comparison functions we can use to match two vectors. We use four different datasets to evaluate the model. We find that some simple comparison functions based on element-wise operations can work better than standard neural networks and neural tensor networks.
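As a rough illustration of what an element-wise comparison function between two vectors can look like, the sketch below defines two simple candidates. The abstract does not specify the exact forms, so the squared difference and the product used here, along with the function names, are assumptions chosen for illustration rather than the paper's definitions.

```python
from typing import List

Vector = List[float]


def _check_same_length(a: Vector, h: Vector) -> None:
    """Element-wise operations require vectors of equal dimension."""
    if len(a) != len(h):
        raise ValueError("vectors must have the same length")


def compare_sub(a: Vector, h: Vector) -> Vector:
    """Element-wise squared difference: (a_i - h_i) ** 2 for each dimension."""
    _check_same_length(a, h)
    return [(x - y) ** 2 for x, y in zip(a, h)]


def compare_mult(a: Vector, h: Vector) -> Vector:
    """Element-wise product: a_i * h_i for each dimension."""
    _check_same_length(a, h)
    return [x * y for x, y in zip(a, h)]
```

Both functions return a vector of the same dimension as their inputs and have no trainable parameters, which is what distinguishes them from a neural network layer applied to the concatenated pair.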


Discovering Anomalous Events From Urban Informatics Data, Kasthuri Jayarajah, Vigneshwaran Subbaraju, Dulanga Kaveesha Weerakoon Mudiyanselage, Archan Misra, La Thanh Tam, Noel Athaide Apr 2017

Research Collection School Of Computing and Information Systems

Singapore's "smart city" agenda is driving the government to provide public access to a broader variety of urban informatics sources, such as images from traffic cameras and information about buses servicing different bus stops. Such informatics data serves as probes of evolving conditions at different spatiotemporal scales. This paper explores how such multi-modal informatics data can be used to establish the normal operating conditions at different city locations, and then apply appropriate outlier-based analysis techniques to identify anomalous events at these selected locations. We will introduce the overall architecture of sociophysical analytics, where such infrastructural data sources can be combined …


Machine Comprehension Using Match-LSTM And Answer Pointer, Shuohang Wang, Jing Jiang Apr 2017

Research Collection School Of Computing and Information Systems

Machine comprehension of text is an important problem in natural language processing. A recently released dataset, the Stanford Question Answering Dataset (SQuAD), offers a large number of real questions and their answers created by humans through crowdsourcing. SQuAD provides a challenging testbed for evaluating machine comprehension algorithms, partly because compared with previous datasets, in SQuAD the answers do not come from a small set of candidate answers and they have variable lengths. We propose an end-to-end neural architecture for the task. The architecture is based on match-LSTM, a model we proposed previously for textual entailment, and Pointer Net, a sequence-to-sequence …


Recurrent Neural Networks With Auxiliary Labels For Cross-Domain Opinion Target Extraction, Ying Ding, Jianfei Yu, Jing Jiang Feb 2017

Research Collection School Of Computing and Information Systems

Opinion target extraction is a fundamental task in opinion mining. In recent years, neural network based supervised learning methods have achieved competitive performance on this task. However, as with any supervised learning method, neural network based methods for this task cannot work well when the training data comes from a different domain than the test data. On the other hand, some rule-based unsupervised methods have been shown to be robust when applied to different domains. In this work, we use rule-based unsupervised methods to create auxiliary labels and use neural network models to learn a hidden representation that works well for …


2D Vector Map And Database Design For Indoor Assisted Navigation, Luciano Caraciolo Albuquerque Jan 2017

Dissertations and Theses

In this paper we implemented a 2D vector map, map editor, and database design intended to provide an efficient way to convert CAD files of indoor environments into a set of vectors representing hallways, doors, exits, elevators, and other entities embedded in a floor plan, and to save them in a database for use by other applications, such as assisted navigation for blind people.

A graphical application was developed in C++ to allow the user to input a CAD DXF file, process the file to automatically obtain nodes and edges, and save the nodes and edges to a database for posterior …


Xic Clustering By Baseyian Network, Kyle J. Handy Jan 2017

Graduate Student Theses, Dissertations, & Professional Papers

No abstract provided.