Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Articles 1 - 12 of 12

Full-Text Articles in Computer Sciences

Alpha Insurance: A Predictive Analytics Case To Analyze Automobile Insurance Fraud Using SAS Enterprise Miner (TM), Richard McCarthy, Wendy Ceccucci, Mary McCarthy, Leila Halawi Apr 2019

Publications

Automobile insurance fraud costs the insurance industry billions of dollars annually. This case study addresses claim fraud based on data extracted from Alpha Insurance’s automobile claim database. Students are provided with the business problem and data sets. Initially, the students are required to develop their hypotheses and analyze the data. This includes identifying any missing or inaccurate data values and outliers, as well as evaluating the 22 variables. Next, students will develop and optimize their predictive models using five techniques: regression, decision tree, neural network, gradient boosting, and ensemble. Then students will determine which model is the best fit …


Mining Capstone Project Wikis For Knowledge Discovery, Swapna Gottipati, Venky Shankararaman, Melvrivk Goh Jul 2017

Research Collection School Of Computing and Information Systems

Wikis are widely used collaborative environments that serve as sources of information and knowledge. They help students collaborate, share information among team members, and learn collaboratively. In particular, Wikis play an important role in capstone projects, where they support various project-related tasks and help organize and share information. Mining project Wikis is critical to understanding students' learning and the latest trends in industry. It is also useful to educators and academics deciding how to modify the educational environment to improve students' learning. The main challenge is that the content or data in project Wikis …


Who Will Leave The Company?: A Large-Scale Industry Study Of Developer Turnover By Mining Monthly Work Report, Lingfeng Bao, Zhenchang Xing, Xin Xia, David Lo, Shanping Li May 2017

Research Collection School Of Computing and Information Systems

Software developer turnover has become a major challenge for information technology (IT) companies. The departure of key software developers can cause significant losses to an IT company, since these developers take important business knowledge and critical technical skills with them. Understanding developer turnover is very important for IT companies that want to retain talented developers and reduce the losses caused by their departure. Previous studies mainly perform qualitative observations or simple statistical analysis of developers' activity data to understand developer turnover. In this paper, we investigate whether we can predict the turnover of software developers in non-open source companies by automatically analyzing monthly …


Robust Median Reversion Strategy For Online Portfolio Selection, Dingjiang Huang, Junlong Zhou, Bin Li, Steven C. H. Hoi, Shuigeng Zhou Jul 2016

Research Collection School Of Computing and Information Systems

On-line portfolio selection has attracted increasing interest from the artificial intelligence community in recent decades. Mean reversion, as one of the most frequent patterns in financial markets, plays an important role in some state-of-the-art strategies. Though successful on certain datasets, existing mean reversion strategies do not fully consider noise and outliers in the data, leading to estimation errors and thus non-optimal portfolios, which results in poor performance in practice. To overcome this limitation, we propose to exploit the reversion phenomenon using a robust L1-median estimator, and design a novel on-line portfolio selection strategy named "Robust Median Reversion" (RMR), which makes optimal portfolios based …
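
As a rough illustration of why an L1-median can be more robust than a mean, the sketch below approximates the L1-median (geometric median) of a few price vectors with the classical Weiszfeld iteration. This is not the authors' RMR implementation: the function name `l1_median`, the sample `prices`, and the stopping parameters are assumptions made for illustration only.

```python
import math

def l1_median(points, iterations=200, tol=1e-9):
    """Approximate the L1-median (geometric median) of equal-length vectors
    using the Weiszfeld iteration. Illustrative sketch, not the RMR method."""
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    estimate = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    for _ in range(iterations):
        weight_sum = 0.0
        weighted = [0.0] * dim
        for p in points:
            # Clamp the distance so a point that coincides with the
            # current estimate does not cause a division by zero.
            dist = max(math.dist(p, estimate), tol)
            w = 1.0 / dist
            weight_sum += w
            for i in range(dim):
                weighted[i] += w * p[i]
        new_estimate = [v / weight_sum for v in weighted]
        shift = math.dist(new_estimate, estimate)
        estimate = new_estimate
        if shift < tol:
            break
    return estimate

# Hypothetical two-asset price vectors; the last one is an outlier.
prices = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.0], [5.0, 5.0]]
mean = [sum(p[i] for p in prices) / len(prices) for i in range(2)]
median = l1_median(prices)
print("mean:     ", mean)
print("L1-median:", median)
```

In this made-up sample the single outlier pulls the mean well away from the cluster of typical prices, while the L1-median stays close to it.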


Intelligshop: Enabling Intelligent Shopping In Malls Through Location-Based Augmented Reality, Aditi Adhikari, Vincent W. Zheng, Hong Cao, Miao Lin, Yuan Fang, Kevin Chen-Chuan Chang Nov 2015

Research Collection School Of Computing and Information Systems

The shopping experience is important for both citizens and tourists. We present IntelligShop, a novel location-based augmented reality application that supports an intelligent shopping experience in malls. As its key functionality, IntelligShop provides an augmented reality interface: people can simply point their smartphones at mall retailers, and IntelligShop will automatically recognize the retailers and fetch their online reviews from various sources (including blogs, forums and publicly accessible social media) to display on the phones. Technically, IntelligShop addresses two challenging data mining problems, namely robust feature learning to support heterogeneous smartphones in localization and learning to query for automatically gathering the retailer content …


Author Topic Model-Based Collaborative Filtering For Personalized POI Recommendations, Shuhui Jiang, Xueming Qian, Jialie Shen, Yun Fu, Tao Mei Jun 2015

Research Collection School Of Computing and Information Systems

Social media has created a continuous need for automatic travel recommendations. Collaborative filtering (CF) is the most well-known approach. However, existing approaches generally suffer from various weaknesses. For example, sparsity can significantly degrade the performance of traditional CF. If a user visits only a few locations, accurately identifying similar users becomes very challenging because there is too little information for effective inference. Moreover, existing recommendation approaches often ignore rich user information, such as textual descriptions of photos, which can reflect users' travel preferences. The topic model (TM) method is an effective way to solve the "sparsity problem," but is still far …


Drip - Data Rich, Information Poor: A Concise Synopsis Of Data Mining, Muhammad Obeidat, Max North, Lloyd Burgess, Sarah North Dec 2014

Faculty and Research Publications

As data production grows exponentially at drastically lower cost, the data mining required to extract and discover valuable information is becoming ever more important. To be functional in any business or industry, data must be capable of supporting sound decision-making and plausible prediction. The purpose of this paper is to provide a concise but broad synopsis of the technology and theory of data mining, giving an enhanced understanding of the methods by which massive data can be transformed into meaningful information.


How Can Consumer Preferences Be Leveraged For Targeted Upselling In Cable TV Services?, Bing Tian Dai Jan 2014

Research Collection School Of Computing and Information Systems

By providing customized TV programs at preferred time slots, Internet TV has attracted significant attention from conventional cable TV service providers. The cable TV service providers are seeking to retain their customers by giving them a better experience: understanding their preferences and upselling them the right products to cater to their interests. Understanding customer preferences is not easy, though, since customers cannot watch channels to which they have not subscribed, which makes it difficult to predict what they would like to watch. In this paper, I …


Detecting Click Fraud In Online Advertising: A Data Mining Approach, Richard Oentaryo, Ee Peng Lim, Michael Finegold, David Lo, Feida Zhu, Clifton Phua, Eng-Yeow Cheu, Ghim-Eng Yap, Kelvin Sim, Kasun Perera, Bijay Neupane, Mustafa Faisal, Zeyar Aung, Wei Lee Woon, Wei Chen, Dhaval Patel, Daniel Berrar Jan 2014

Research Collection School Of Computing and Information Systems

Click fraud - the deliberate clicking on advertisements with no real interest in the product or service offered - is one of the most daunting problems in online advertising. Building an effective fraud detection method is thus pivotal for online advertising businesses. We organized a Fraud Detection in Mobile Advertising (FDMA) 2012 Competition, opening the opportunity for participants to work on real-world fraud data from BuzzCity Pte. Ltd., a global mobile advertising company based in Singapore. In particular, the task is to identify fraudulent publishers who generate illegitimate clicks, and distinguish them from normal publishers. The competition was held from …


From Clickstreams To Searchstreams: Search Network Graph Evidence From A B2B E-Market, Mei Lin, M. F. Lin, Robert J. Kauffman Aug 2012

Research Collection School Of Computing and Information Systems

Consumers in e-commerce acquire information through search engines, yet to date there has been little empirical study on how users interact with the results produced by search engines. This is analogous to, but different from, the ever-expanding research on clickstreams, where users interact with static web pages. We propose a new network approach to analyzing search engine server log data. We call this searchstream data. We create graph representations based on the web pages that users traverse as they explore the search results that their use of search engines generates. We then analyze the graph-level properties of these search network …


The Impact Of Directionality In Predications On Text Mining, Gondy Leroy, Marcelo Fiszman, Thomas C. Rindflesch Jan 2008

CGU Faculty Publications and Research

The number of publications in biomedicine is increasing enormously each year. To help researchers digest the information in these documents, text mining tools are being developed that present co-occurrence relations between concepts. Statistical measures are used to mine interesting subsets of relations. We demonstrate how directionality of these relations affects interestingness. Support and confidence, simple data mining statistics, are used as proxies for interestingness metrics. We first built a test bed of 126,404 directional relations extracted from biomedical abstracts, which we represent as graphs containing a central starting concept and 2 rings of associated relations. We manipulated directionality in four …
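
For readers unfamiliar with the two statistics named above, the sketch below computes textbook support and confidence for a rule "antecedent implies consequent" over sets of co-occurring concepts. It is only a minimal illustration of the standard definitions, not the study's test bed or its treatment of directional relations; the function name and the toy `abstracts` data are hypothetical.

```python
def support_and_confidence(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    support    = fraction of transactions containing both item sets
    confidence = fraction of antecedent-containing transactions that
                 also contain the consequent
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n_total = len(transactions)
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    n_both = sum(1 for t in transactions
                 if (antecedent | consequent) <= set(t))
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence

# Hypothetical concept sets, one per abstract (made-up data).
abstracts = [
    {"aspirin", "headache"},
    {"aspirin", "headache", "fever"},
    {"aspirin", "bleeding"},
    {"ibuprofen", "headache"},
    {"headache", "migraine"},
]
forward = support_and_confidence(abstracts, {"aspirin"}, {"headache"})
reverse = support_and_confidence(abstracts, {"headache"}, {"aspirin"})
print("aspirin -> headache:", forward)
print("headache -> aspirin:", reverse)
```

Support is symmetric in the two concepts, whereas confidence depends on which concept is taken as the starting point, which is one simple way the direction of a rule can change how interesting it looks.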


Genescene: Biomedical Text And Data Mining, Gondy Leroy, Hsinchun Chen, Jesse D. Martinez, Shauna Eggers, Ryan R. Falsey, Kerri L. Kislin, Zan Huang, Jiexun Li, Jie Xu, Daniel M. McDonald, Gavin Ng May 2003

CGU Faculty Publications and Research

To access the content of digital texts efficiently, it is necessary to provide more sophisticated access than keyword-based searching. GeneScene provides biomedical researchers with research findings and background relations automatically extracted from text and experimental data. These provide a more detailed overview of the information available. The extracted relations were evaluated by qualified researchers and are precise. An ongoing qualitative evaluation of the current online interface indicates that this method of searching the literature is more useful and efficient than keyword-based searching.