Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Data mining

Air Force Institute of Technology

Articles 1 - 3 of 3

Full-Text Articles in Computer Sciences

Applicability Of Latent Dirichlet Allocation To Multi-Disk Search, George E. Noel, Gilbert L. Peterson Mar 2014

Faculty Publications

Digital forensics practitioners face a continual increase in the volume of data they must analyze, which exacerbates the problem of finding relevant information in a noisy domain. Current technologies use keyword-based search to isolate relevant documents and minimize false positives with respect to investigative goals. Unfortunately, selecting appropriate keywords is a complex and challenging task. Latent Dirichlet Allocation (LDA) offers a possible way to relax keyword selection by returning topically similar documents. This research compares regular expression search techniques and LDA using the Real Data Corpus (RDC). The RDC, a set of over 2400 disks from real …


Using PLSI-U To Detect Insider Threats By Datamining Email, James S. Okolica, Gilbert L. Peterson, Robert F. Mills Feb 2008

Faculty Publications

Despite a technology bias that focuses on external electronic threats, insiders pose the greatest threat to an organisation. This paper discusses an approach to assist investigators in identifying potential insider threats. We discern employees' interests from e-mail using an extended version of PLSI. These interests are transformed into implicit and explicit social network graphs, which are used to locate potential insiders by identifying individuals who feel alienated from the organisation or have a hidden interest in a sensitive topic. By applying this technique to the Enron e-mail corpus, a small number of employees appear as potential insider threats.


Multi-Class Classification Averaging Fusion For Detecting Steganography, Benjamin M. Rodriguez, Gilbert L. Peterson, Sos S. Agaian Apr 2007

Faculty Publications

Multiple classifier fusion can increase classification accuracy over individual classifier systems. This paper focuses on the development of a multi-class classification fusion based on weighted averaging of posterior class probabilities. This fusion system is applied to the steganography fingerprint domain, in which the classifier identifies the statistical patterns in an image that distinguish one steganography algorithm from another. Specifically, we focus on algorithms in which JPEG images provide the cover in order to communicate covertly. The embedding methods targeted are F5, JSteg, Model Based, OutGuess, and StegHide. The developed multi-class steganalysis system consists of three levels: (1) …
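
As a rough illustration of the general idea named in the abstract above (weighted averaging of posterior class probabilities), the following minimal Python sketch fuses the outputs of several classifiers. It is not the authors' system: the function names, the weights, and the example probabilities are all hypothetical, and only the five class labels are taken from the abstract.

```python
from typing import List, Sequence

# Class labels taken from the abstract; everything else here is illustrative.
LABELS = ["F5", "JSteg", "Model Based", "OutGuess", "StegHide"]


def fuse_posteriors(posteriors: Sequence[Sequence[float]],
                    weights: Sequence[float]) -> List[float]:
    """Return the weighted average of per-classifier posterior vectors.

    posteriors[i][k] is classifier i's probability for class k.
    weights[i] is a non-negative weight for classifier i (hypothetical values).
    """
    if not posteriors or len(posteriors) != len(weights):
        raise ValueError("need one weight per classifier")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(posteriors[0])
    fused = [0.0] * n_classes
    for probs, w in zip(posteriors, weights):
        if len(probs) != n_classes:
            raise ValueError("all posterior vectors must have the same length")
        for k, p in enumerate(probs):
            # Normalizing by the weight total keeps the fused vector a distribution.
            fused[k] += (w / total) * p
    return fused


def predict(posteriors: Sequence[Sequence[float]],
            weights: Sequence[float],
            labels: Sequence[str] = LABELS) -> str:
    """Pick the label with the highest fused posterior probability."""
    fused = fuse_posteriors(posteriors, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best]


# Hypothetical outputs of three classifiers for one image.
example_posteriors = [
    [0.6, 0.1, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.1, 0.1, 0.1],
    [0.3, 0.4, 0.1, 0.1, 0.1],
]
example_weights = [2.0, 1.0, 1.0]  # assumed weights, e.g. from validation accuracy

fused_example = fuse_posteriors(example_posteriors, example_weights)
predicted_example = predict(example_posteriors, example_weights)
```

In this sketch the first classifier is trusted twice as much as the others, so its vote dominates even though the other two favor a different class. How the weights are actually chosen in the paper is not stated in the truncated abstract.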