Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 11 of 11

Full-Text Articles in Physical Sciences and Mathematics

Distributed Knowledge Discovery For Diverse Data, Hossein Hamooni Jul 2017

Computer Science ETDs

In the era of new technologies, computer scientists deal with massive datasets that are hundreds of terabytes in size. Smart cities, social networks, health care systems, large sensor networks, and similar sources are constantly generating new data. Extracting knowledge from big datasets is non-trivial because traditional data mining algorithms are impractical to run on datasets of this size. Distributed systems help with this problem, but they introduce new challenges in designing scalable algorithms. The transition from traditional algorithms to ones that can run on a distributed platform should be done carefully. Researchers should design the modern distributed algorithms based on the …


Mining Diverse Consumer Preferences For Bundling And Recommendation, Ha Loc Do Jul 2017

Dissertations and Theses Collection

That consumers share similar tastes on some products does not guarantee their agreement on other products. Therefore, both similarity and difference should be taken into account for a more rounded view of consumer preferences. This manuscript focuses on mining this diversity of consumer preferences from two perspectives, namely 1) between consumers and 2) between products. Diversity of preferences between consumers is studied in the context of recommendation systems. In some preference models, measuring similarities in preferences between two consumers plays the key role. These approaches assume two consumers would share a certain degree of similarity on any product, ignoring the fact …


Multi-Agent Simulation Of The Battle Of Ankara, 1402, Ruili Tang Jun 2017

Honors Theses

In 1402, north of the city of Ankara, Turkey, a battle between the Ottoman Empire and Tamerlane's empire decided the fate of Europe and Asia. Although historians largely agree on the general course of the battle, the details are still open to dispute. Several factors may have contributed to the Ottoman defeat, such as the overwhelming size of Tamerlane's army, poisoned water, the tactical formations of the military units, and betrayal by the Tartar cavalry in the Ottoman left wing. The approach is divided into two stages: the simulation stage, which provides data to analyze the complex interactions of autonomous agents, and the …


Mining Of Primary Healthcare Patient Data With Selective Multimorbid Diseases, Annette Megerdichian Azad May 2017

Electronic Thesis and Dissertation Repository

Despite a large volume of research on the prognosis, diagnosis and overall burden of multimorbidity, very little is known about socio-demographic characteristics of multimorbid patients. This thesis aims to analyze the socio-demographic characteristics of patients with multiple chronic conditions (multimorbidity), focusing on patient groups sharing the same combination of diseases. Several methods were explored to analyze the co-occurrence of multiple chronic diseases as well as the associations between socio-demographics and chronic conditions. These methods include disease pair distributions over gender, age groups and income level quintiles, Multimorbidity Coefficients for measuring the concurrence of disease pairs and triples, and k-modes clustering …


Aspect Discovery From Product Reviews, Ying Ding May 2017

Dissertations and Theses Collection

With the rapid development of online shopping sites and social media, product reviews are accumulating. These reviews contain information that is valuable to both businesses and customers. Businesses can easily obtain a large amount of feedback on their products, which is difficult to achieve through traditional customer surveys. Customers can learn more about the products they are interested in by reading reviews, which would be difficult without online reviews. However, the sheer volume of reviews makes it impossible to read them all. It is necessary to develop automated techniques to process them efficiently. One of the most …


Dpweka: Achieving Differential Privacy In Weka, Srinidhi Katla May 2017

Graduate Theses and Dissertations

Organizations belonging to the government, commercial, and non-profit industries collect and store large amounts of sensitive data, which include medical, financial, and personal information. They use data mining methods to formulate business strategies that yield high long-term and short-term financial benefits. While analyzing such data, the private information of the individuals present in the data must be protected for moral and legal reasons. Current practices such as redacting sensitive attributes, releasing only the aggregate values, and query auditing do not provide sufficient protection against an adversary armed with auxiliary information. In the presence of additional background information, the privacy protection …


Peeking Into The Other Half Of The Glass: Handling Polarization In Recommender Systems, Mahsa Badami May 2017

Electronic Theses and Dissertations

This dissertation is about filtering and discovering information online while using recommender systems. In the first part of our research, we study the phenomenon of polarization and its impact on filtering and discovering information. Polarization is a social phenomenon with serious real-life consequences, particularly on social media. Thus, it is important to understand how machine learning algorithms, especially recommender systems, behave in polarized environments. We study polarization within the context of the users' interactions with a space of items and how this affects recommender systems. We first formalize the concept of polarization based on item ratings and then relate …


High Utility Itemsets Identification In Big Data, Ashish Tamrakar May 2017

UNLV Theses, Dissertations, Professional Papers, and Capstones

High utility itemset mining is an important data mining problem that considers profit factors in addition to the quantities recorded in a transactional database. It helps find the most valuable products or items, which are difficult to track using frequent itemset mining alone. An item that has a high profit value might be rare in the transactional database despite its tremendous importance. While many existing algorithms generate comparatively large candidate sets when finding high utility itemsets, the major focus is to reduce the computational time significantly with the introduction of pruning strategies. Another aspect of high utility itemset mining is to compute …


Development And Evaluation Of Machine Learning Algorithms For Biomedical Applications, Turki Talal Turki Apr 2017

Dissertations

Gene network inference and drug response prediction are two important problems in computational biomedicine. The former helps scientists better understand the functional elements and regulatory circuits of cells. The latter helps physicians gain a fuller understanding of which treatments are effective for their patients. Both problems have been widely studied, though current solutions are far from perfect. More research is needed to improve the accuracy of existing approaches.

This dissertation develops machine learning and data mining algorithms, and applies these algorithms to solve the two important biomedical problems. Specifically, to tackle the gene network inference problem, the dissertation proposes (i) new techniques …


Big Data Analytics In Computational Biology And Bioinformatics, Kevin Byron Apr 2017

Dissertations

Big data analytics in computational biology and bioinformatics refers to an array of operations including biological pattern discovery, classification, prediction, inference, clustering as well as data mining in the cloud, among others. This dissertation addresses big data analytics by investigating two important operations, namely pattern discovery and network inference.

The dissertation starts by focusing on biological pattern discovery at a genomic scale. Research reveals that the secondary structure in non-coding RNA (ncRNA) is more conserved during evolution than its primary nucleotide sequence. Using a covariance model approach, the stems and loops of an ncRNA secondary structure are represented as a …


Statistical Learning Methods For Mining Marketing And Biological Data, Jie Zhang Apr 2017

Dissertations

Nowadays, the value of data has been broadly recognized and emphasized. More and more decisions are made based on data and analysis rather than solely on experience and intuition. With the fast development of networking, data storage, and data collection capacity, data have increased dramatically in industry, science and engineering domains, which brings both great opportunities and challenges. To take advantage of the data flood, new computational methods are in demand to process, analyze and understand these datasets.

This dissertation focuses on the development of statistical learning methods for online advertising and bioinformatics to model real world data with temporal …