Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 13 of 13

Full-Text Articles in Physical Sciences and Mathematics

Wikipedia And Medicine: Quantifying Readership, Editors, And The Significance Of Natural Language, James M. Heilman, Andrew G. West Mar 2015

Andrew G. West

BACKGROUND: Wikipedia is a collaboratively edited encyclopedia. One of the most popular websites on the Internet, it is known to be a frequently used source of healthcare information by both professionals and the lay public.

OBJECTIVE: This document quantifies: (1) the amount of medical content on Wikipedia, (2) the citations supporting Wikipedia’s medical content, (3) the readership of medical content, and (4) the quantity/characteristics of Wikipedia’s medical contributors.

METHODS: Using a well-defined categorization infrastructure, we identify medically pertinent English Wikipedia articles and links to their foreign-language equivalents (Objective 1). With these, Wikipedia’s API can be queried to produce metadata …
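
As an illustration of the kind of API query the methods describe, the sketch below builds a MediaWiki API request for an article's interlanguage links. The abstract does not say which metadata the authors requested or how, so the choice of `prop=langlinks`, the helper name `langlinks_query_url`, and the sample title are assumptions made for this example only.

```python
from urllib.parse import urlencode

# English Wikipedia's MediaWiki API endpoint.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def langlinks_query_url(title: str) -> str:
    """Build a URL asking for the interlanguage links of one article.

    Hypothetical helper: the paper's actual queries are not given
    in the abstract.
    """
    params = {
        "action": "query",    # standard read-only query module
        "prop": "langlinks",  # links to foreign-language equivalents
        "titles": title,
        "lllimit": "max",     # request as many links as allowed per call
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # Only builds and prints the URL; no network request is made here.
    print(langlinks_query_url("Heart failure"))
```

The URL can then be fetched with any HTTP client; the JSON response lists one entry per language edition that has an equivalent article.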


Measuring Privacy Disclosures In URL Query Strings, Andrew G. West, Adam J. Aviv Nov 2014

Andrew G. West

Publicly posted URLs may contain a wealth of information about the identities and activities of the users who share them. URLs often utilize query strings (i.e., key-value pairs appended to the URL path) as a means to pass session parameters and form data. While often benign and necessary to render the web page, query strings sometimes contain tracking mechanisms, user names, email addresses, and other information that users may not wish to publicly reveal. In isolation this is not particularly problematic, but the growth of Web 2.0 platforms such as social networks and micro-blogging means URLs (often copy-pasted from web …
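
To make the idea of key-value pairs in a query string concrete, here is a minimal sketch that parses a URL and flags parameters whose key names suggest personal information. The hint list `SENSITIVE_KEY_HINTS`, the function name, and the sample URL are hypothetical; the abstract does not describe the authors' actual detection method.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical substrings that hint at a privacy-relevant key.
# This list is an assumption for illustration, not taken from the paper.
SENSITIVE_KEY_HINTS = ("email", "user", "name", "session", "token")


def flag_sensitive_params(url: str) -> list[tuple[str, str]]:
    """Return (key, value) pairs whose key contains a sensitive hint."""
    query = urlsplit(url).query
    flagged = []
    # parse_qsl splits "k1=v1&k2=v2" into pairs and percent-decodes them.
    for key, value in parse_qsl(query, keep_blank_values=True):
        if any(hint in key.lower() for hint in SENSITIVE_KEY_HINTS):
            flagged.append((key, value))
    return flagged


if __name__ == "__main__":
    sample = "http://example.com/search?q=shoes&user_email=a%40b.com&page=2"
    print(flag_sensitive_params(sample))
```

Matching on key names alone is a deliberately simple heuristic; it would miss sensitive values stored under opaque keys.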


Chatter: Classifying Malware Families Using System Event Ordering, Aziz Mohaisen, Andrew G. West, Allison Mankin, Omar Alrawi Oct 2014

Andrew G. West

Using runtime execution artifacts to identify malware and its associated "family" is an established technique in the security domain. Many papers in the literature rely on explicit features derived from network, file system, or registry interaction. While effective, use of these fine-granularity data points makes these techniques computationally expensive. Moreover, the signatures and heuristics this analysis produces are often circumvented by subsequent malware authors.

To this end, we propose CHATTER, a system that is concerned only with the order in which high-level system events take place. Individual events are mapped onto an alphabet and execution traces are captured via terse …
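
The mapping of events onto an alphabet can be sketched as follows. The abstract is cut off before it says how traces are represented, so the event names, the letter assignments, and the use of n-gram counts to summarize ordering are all assumptions made for this example, not a description of CHATTER itself.

```python
from collections import Counter

# Hypothetical high-level events and their one-letter codes.
EVENT_ALPHABET = {
    "file_create": "F",
    "registry_write": "R",
    "network_connect": "N",
    "process_spawn": "P",
}


def encode_trace(events: list[str]) -> str:
    """Map an ordered list of event names onto a string over the alphabet."""
    return "".join(EVENT_ALPHABET[event] for event in events)


def ngram_counts(trace: str, n: int = 2) -> Counter:
    """Count every length-n window in the trace, capturing local ordering."""
    return Counter(trace[i:i + n] for i in range(len(trace) - n + 1))


if __name__ == "__main__":
    events = ["file_create", "registry_write", "file_create",
              "registry_write", "network_connect"]
    trace = encode_trace(events)
    print(trace, dict(ngram_counts(trace)))
```

Counts like these could feed a standard classifier as features; only event order is used, with no file paths, registry keys, or network payloads.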


Adam: Automated Detection And Attribution Of Malicious Webpages, Ahmed E. Kosba, Aziz Mohaisen, Andrew G. West, Trevor Tonn, Huy Kang Kim Aug 2014

Andrew G. West

Malicious webpages are a prevalent and severe threat in the Internet security landscape. This fact has motivated numerous static and dynamic techniques to alleviate such threats. Building on this existing literature, this work introduces the design and evaluation of ADAM, a system that uses machine-learning over network metadata derived from the sandboxed execution of webpage content. ADAM aims to detect malicious webpages and identify the nature of those vulnerabilities using a simple set of features. Machine-trained models are not novel in this problem space. Instead, it is the dynamic network artifacts (and their subsequent feature representations) collected during rendering that …


Metadata-Driven Threat Classification Of Network Endpoints Appearing In Malware, Andrew G. West, Aziz Mohaisen Jul 2014

Andrew G. West

Networked machines serving as binary distribution points, C&C channels, or drop sites are a ubiquitous aspect of malware infrastructure. By sandboxing malcode one can extract the network endpoints (i.e., domains and URL paths) contacted during execution. Some endpoints are benign, e.g., connectivity tests. Exclusively malicious destinations, however, can serve as signatures enabling network alarms. Often these behavioral distinctions are drawn by expert analysts, resulting in considerable cost and labeling latency.

Leveraging 28,000 expert-labeled endpoints derived from ~100k malware binaries, this paper characterizes those domains/URLs towards prioritizing manual efforts and automatic signature generation. Our analysis focuses on endpoints' static metadata properties …
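
A minimal sketch of what static properties of an endpoint might look like is given below. The abstract is truncated before it names the properties the authors used, so every feature here (TLD, label count, path depth, query presence) and the function name `endpoint_features` are hypothetical choices for illustration.

```python
from urllib.parse import urlsplit


def endpoint_features(endpoint: str) -> dict:
    """Derive simple static properties from a domain or domain/path endpoint.

    All features are illustrative assumptions, not the paper's feature set.
    """
    # urlsplit needs a scheme to recognize the host, so add one if missing.
    url = endpoint if "://" in endpoint else "http://" + endpoint
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".") if host else []
    path_segments = [seg for seg in parts.path.split("/") if seg]
    return {
        "domain": host,
        "tld": labels[-1] if labels else "",
        "num_labels": len(labels),         # e.g. 3 for a.example.com
        "path_depth": len(path_segments),  # number of non-empty path parts
        "has_query": bool(parts.query),
    }


if __name__ == "__main__":
    print(endpoint_features("update.example.com/bin/payload.exe"))
```

Because these properties need no execution or content fetch, they are cheap to compute for every endpoint a sandbox reports.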


On The Privacy Concerns Of URL Query Strings, Andrew G. West, Adam J. Aviv May 2014

Andrew G. West

URLs often utilize query strings (i.e., key-value pairs appended to the URL path) as a means to pass session parameters and form data. Often these arguments are not privacy sensitive but are necessary to render the web page. However, query strings may also contain tracking mechanisms, user names, email addresses, and other information that users may not wish to reveal. In isolation such URLs are not particularly problematic, but the growth of Web 2.0 platforms such as social networks and micro-blogging means URLs (often copy-pasted from web browsers) are increasingly being publicly broadcast.

This position paper argues that the …


Adam: Automated Detection And Attribution Of Malicious Webpages, Ahmed E. Kosba, Aziz Mohaisen, Andrew G. West, Trevor Tonn Oct 2013

Andrew G. West

Malicious webpages are a prevalent and severe threat in the Internet security landscape. This fact has motivated numerous static and dynamic techniques for their accurate and efficient detection. Building on this existing literature, this work introduces ADAM, a system that uses machine-learning over network metadata derived from the sandboxed execution of webpage content. Machine-trained models are not novel in this problem space. Instead, it is the dynamic network artifacts (and their subsequent feature representations) collected during rendering that are the greatest contribution of this work.

There were two primary motivations in exploring this line of research. First, iDetermine, VeriSign’s status …


Babble: Identifying Malware By Its Dialects, Aziz Mohaisen, Omar Alrawi, Andrew G. West, Allison Mankin Oct 2013

Andrew G. West

Using runtime execution to identify whether code is malware, and to which malware family it belongs, is an established technique in the security domain. Traditionally, literature has relied on explicit features derived from network, file system, or registry interaction. While effective, the collection and analysis of these fine-granularity data points makes the technique quite computationally expensive. Moreover, the signatures/heuristics this analysis produces are often easily circumvented by subsequent malware authors.

To this end, we propose "Babble", a system that is concerned only with the *order* in which high-level system events take place. Individual events are mapped onto an alphabet and …


Damage Detection And Mitigation In Open Collaboration Applications, Andrew G. West May 2013

Andrew G. West

Collaborative functionality is changing the way information is amassed, refined, and disseminated in online environments. A subclass of these systems characterized by "open collaboration" uniquely allows participants to modify content with low barriers-to-entry. A prominent example and our case study, English Wikipedia, exemplifies the vulnerabilities: 7%+ of its edits are blatantly unconstructive. Our measurement studies show this damage manifests in novel socio-technical forms, limiting the effectiveness of computational detection strategies from related domains. In turn, this has made much mitigation the responsibility of a poorly organized and ill-routed human workforce. We aim to improve all facets of this incident response …


Open Wikis And The Protection Of Institutional Welfare, Andrew G. West, Insup Lee Feb 2012

Andrew G. West

Much has been written about wikis’ reliability and use in the classroom. This research bulletin addresses the negative impacts on institutional welfare that can arise from participating in and supporting wikis. The open nature of the platform, which is fundamental to wiki operation and success, enables these negative consequences. A finite user base that can be determined a priori (e.g., a course roster) minimizes the security implications, hence our discussion in this bulletin primarily concerns open or public wikis that accept contributions from a broad and unknown set of Internet users.


Detecting Wikipedia Vandalism Via Spatio-Temporal Analysis Of Revision Metadata, Andrew G. West, Sampath Kannan, Insup Lee Jan 2010

Andrew G. West

Blatantly unproductive edits undermine the quality of the collaboratively-edited encyclopedia, Wikipedia. They not only disseminate dishonest and offensive content, but force editors to waste time undoing such acts of vandalism. Language processing has been applied to combat these malicious edits, but as with email spam, these filters are evadable and computationally complex. Meanwhile, recent research has shown spatial and temporal features effective in mitigating email spam, while being lightweight and robust.

In this paper, we leverage the spatio-temporal properties of revision metadata to detect vandalism on Wikipedia. An administrative form of reversion called rollback enables the tagging of malicious edits, …


Bound Optimization For Parallel Quadratic Sieving Using Large Prime Variations, Andrew G. West May 2007

Andrew G. West

The Quadratic Sieve (QS) factorization algorithm is a powerful means to perform prime decompositions that combines number theory, linear algebra, and brute processing power. Created by Carl Pomerance in 1985, it is the second-fastest general-purpose factorization method as of this writing, behind only the Number Field Sieve.

We describe an efficient QS implementation which is accessible to an undergraduate audience. The majority of papers on this topic rely on complex mathematical notation as their primary means of explanation. Instead, we attempt to combine math, discussion, and examples to promote understanding. Additionally, few authors ever present implementation-level detail. …
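
For readers unfamiliar with QS, the relation-collection step at its core can be sketched in a few lines. This is not the paper's implementation: it uses plain trial division instead of a true sieve, and the function names, the sample number 15347, and the factor base {2, 17, 23, 29} are assumptions chosen to keep the example small.

```python
from math import gcd, isqrt


def factor_over_base(value: int, factor_base: list[int]):
    """Return the exponent vector of value over factor_base, or None
    if value does not factor completely (i.e., is not smooth)."""
    exponents = []
    for p in factor_base:
        e = 0
        while value % p == 0:
            value //= p
            e += 1
        exponents.append(e)
    return exponents if value == 1 else None


def smooth_relations(n: int, factor_base: list[int], interval: int):
    """Collect (x, exponents) pairs where x*x - n is smooth.

    Trial division stands in for the sieving step of a real QS.
    """
    start = isqrt(n) + 1  # smallest x with x*x > n (n not a perfect square)
    relations = []
    for x in range(start, start + interval):
        exps = factor_over_base(x * x - n, factor_base)
        if exps is not None:
            relations.append((x, exps))
    return relations


if __name__ == "__main__":
    n = 15347
    for x, exps in smooth_relations(n, [2, 17, 23, 29], interval=4):
        print(x, x * x - n, exps)
    # 126*126 - n = 529 = 23*23, so 126^2 is congruent to 23^2 (mod n)
    # and gcd(126 - 23, n) is a nontrivial factor of n.
    print(gcd(126 - 23, n))
```

In a full QS, the linear algebra step combines several exponent vectors modulo 2 to produce such a congruence of squares; here one relation happens to be a perfect square on its own.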


Optimized Parallel Implementation Of The Quadratic Sieve Factorization Algorithm, Andrew G. West Jan 2007

Andrew G. West

No abstract provided.