Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Data mining

University of Massachusetts Amherst

Theses/Dissertations

Full-Text Articles in Computer Sciences

Detecting Anomalously Similar Entities In Unlabeled Data, Lisa D. Friedland, Nov 2016

Doctoral Dissertations

In this work, the goal is to detect closely linked entities within a data set. The entities of interest have a tie that causes them to be similar, such as a shared origin or a channel of influence. Given a collection of people or other entities with their attributes or behavior, we identify unusually similar pairs, and we pose the question: Are these two people linked, or can their similarity be explained by chance? Computing similarities is a core operation in many domains, but two constraints differentiate our version of the problem. First, the score assigned to a pair should account for …