Open Access. Powered by Scholars. Published by Universities.®

Bioinformatics Commons

Physical Sciences and Mathematics

2014

Articles 1 - 30 of 59

Full-Text Articles in Bioinformatics

Anonymized Video Analysis Methods And Systems, Marjorie Skubic, James M. Keller, Fang Wang, Derek T. Anderson, Erik Stone, Robert H. Luke III, Tanvi Banerjee, Marilyn J. Rantz Nov 2014

Kno.e.sis Publications

Methods and systems for anonymized video analysis are described. In one embodiment, a first silhouette image of a person in a living unit may be accessed. The first silhouette image may be based on a first video signal recorded by a first video camera. A second silhouette image of the person in the living unit may be accessed. The second silhouette image may be of a different view of the person than the first silhouette image. The second silhouette image may be based on a second video signal recorded by a second video camera. A three-dimensional model of the person …


Protecting Web Servers From Web Robot Traffic, Derek Doran Nov 2014

Kno.e.sis Publications

No abstract provided.


Triad-Based Role Discovery For Large Social Systems, Derek Doran Nov 2014

Kno.e.sis Publications

The social role of a participant in a social system conceptualizes the circumstances under which she chooses to interact with others, making the discovery and analysis of social roles important for theoretical and practical purposes. In this paper, we propose a methodology to detect such roles by utilizing the conditional triad censuses of ego-networks. These censuses are a promising tool for social role extraction because they capture the degree to which basic social forces push upon a user to interact with others in a system. Clusters of triad censuses, inferred from network samples that preserve local structural properties, define the social roles. The …


An Analysis Of Mayo Clinic Search Query Logs For Cardiovascular Diseases, Ashutosh Sopan Jadhav, Amit P. Sheth, Jyotishman Pathak Nov 2014

Kno.e.sis Publications

Increasingly, individuals are actively participating in learning about and managing their health by leveraging online resources. Understanding online health information searching behavior can help us to study what health topics users search for and how search queries are formulated. In this work, we analyzed 10 million cardiovascular disease (CVD)-related search queries from MayoClinic.com. We performed semantic analysis on the queries using UMLS MetaMap and analyzed structural and textual properties as well as linguistic characteristics of the queries.


Discovering Perceptions In Online Social Media: A Probabilistic Approach, Derek Doran, Swapna S. Gokhale, Aldo Dagnino Nov 2014

Kno.e.sis Publications

People across the world habitually turn to online social media to share their experiences, thoughts, ideas, and opinions as they go about their daily lives. These posts collectively contain a wealth of insights into how masses perceive their surroundings. Therefore, extracting people’s perceptions from social media posts can provide valuable information about pertinent issues such as public transportation, emergency conditions, and even reactions to political actions or other activities. This paper proposes a novel approach to extract such perceptions from a corpus of social media posts originating from a given broad geographical region. The approach divides the broad region into …


Online Information Searching For Cardiovascular Diseases: An Analysis Of Mayo Clinic Search Query Logs, Ashutosh Sopan Jadhav, Amit P. Sheth, Jyotishman Pathak Nov 2014

Kno.e.sis Publications

Since the early 2000s, Internet usage for health information searching has increased significantly. Studying search queries can help us to understand users’ “information need” and how they formulate search queries (“expression of information need”). Although cardiovascular diseases (CVD) affect a large percentage of the population, few studies have investigated how and what users search for regarding CVD. We address this knowledge gap in the community by analyzing a large corpus of 10 million CVD-related search queries from MayoClinic.com. Using UMLS MetaMap and UMLS semantic types/concepts, we developed a rule-based approach to categorize the queries into 14 health categories. We …


TAL Effector-Nucleotide Targeter (TALE-NT) 2.0: Tools For TAL Effector Design And Target Prediction, Erin L. Doyle, Nicholas J. Booher, Daniel S. Standage, Daniel F. Voytas, Volker P. Brendel, John K. VanDyk, Adam J. Bogdanove Oct 2014

John K. VanDyk

Transcription activator-like (TAL) effectors are repeat-containing proteins used by plant pathogenic bacteria to manipulate host gene expression. Repeats are polymorphic and individually specify single nucleotides in the DNA target, with some degeneracy. A TAL effector-nucleotide binding code that links repeat type to specified nucleotide enables prediction of genomic binding sites for TAL effectors and customization of TAL effectors for use in DNA targeting, in particular as custom transcription factors for engineered gene regulation and as site-specific nucleases for genome editing. We have developed a suite of web-based tools called TAL Effector-Nucleotide Targeter 2.0 (TALE-NT 2.0; https://boglab.plp.iastate.edu/) that enables design of custom …


Data Analytics For Power Utility Storm Planning, Lan Lin, Aldo Dagnino, Derek Doran, Swapna S. Gokhale Oct 2014

Kno.e.sis Publications

As the world population grows, recent climatic changes seem to bring powerful storms to populated areas. The impact of these storms on utility services is devastating. Hurricane Sandy is a recent example of the enormous damage that storms can inflict on infrastructure, society, and the economy. Quick response to these emergencies represents a big challenge to electric power utilities. Traditionally, utilities develop preparedness plans for storm emergency situations based on the experience of utility experts and with limited use of historical data. With the advent of the Smart Grid, utilities are incorporating automation and sensing technologies in their grids and …


Evaluation Of Microarray-Based DNA Methylation Measurement Using Technical Replicates: The Atherosclerosis Risk In Communities (ARIC) Study, Maitreyee Bose, Chong Wu, James S. Pankow, Ellen W. Demerath, Jan Bressler, Myriam Fornage, Megan L. Grove, Thomas H. Mosley, Chindo Hicks, Kari North, Wen Hong Kao, Yu Zhang, Eric Boerwinkle, Weihua Guan Sep 2014

Computer Science Faculty Publications

Background: DNA methylation is a widely studied epigenetic phenomenon; alterations in methylation patterns influence human phenotypes and risk of disease. As part of the Atherosclerosis Risk in Communities (ARIC) study, the Illumina Infinium HumanMethylation450 (HM450) BeadChip was used to measure DNA methylation in peripheral blood obtained from ~3000 African American study participants. Over 480,000 cytosine-guanine (CpG) dinucleotide sites were surveyed on the HM450 BeadChip. To evaluate the impact of technical variation, 265 technical replicates from 130 participants were included in the study.

Results: For each CpG site, we calculated the intraclass correlation coefficient (ICC) to compare variation of methylation levels …
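As a rough illustration of the per-site statistic mentioned above, the sketch below computes a one-way random-effects ICC for a single CpG site. It is only a minimal sketch: the abstract does not say which ICC form the authors used, and the function name `icc_oneway` and the balanced-design assumption (the same number of replicates for every participant) are assumptions made here for illustration.

```python
from statistics import mean


def icc_oneway(groups):
    """One-way random-effects ICC for one site (assumed form, balanced design).

    groups: list of lists; each inner list holds the replicate
    measurements for one participant (hypothetical input layout).
    """
    n = len(groups)          # number of participants
    k = len(groups[0])       # replicates per participant
    if n < 2 or k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need >= 2 participants with the same number (>= 2) of replicates")

    grand = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Between-participant and within-participant mean squares.
    msb = k * sum((m - grand) ** 2 for m in group_means) / (n - 1)
    msw = sum((x - m) ** 2
              for g, m in zip(groups, group_means)
              for x in g) / (n * (k - 1))

    return (msb - msw) / (msb + (k - 1) * msw)
```

Under this form, values near 1 mean that most of the variation lies between participants rather than between technical replicates.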


A Keyword Sense Disambiguation Based Approach For Noise Filtering In Twitter, Sanjaya Wijeratne, Bahareh R. Heravi Sep 2014

Kno.e.sis Publications

In this paper, we describe an approach to filter out noisy data generated by keyword-based tweet filtering methods by performing Word Sense Disambiguation on the keywords used to collect tweets. We present the noise filtering problem as a binary classification problem and discuss our evaluation strategy, which is to be carried out in the future.


RASP-QS: Efficient And Confidential Query Services In The Cloud, Zohreh S. Alavi, Lu Zhou, James L. Powers, Keke Chen Sep 2014

Kno.e.sis Publications

Hosting data query services in public clouds is an attractive solution for its great scalability and significant cost savings. However, data owners also have concerns about data privacy due to the loss of control over the infrastructure. This demonstration shows a prototype for efficient and confidential range/kNN query services built on top of the random space perturbation (RASP) method. The RASP approach provides a privacy guarantee practical to the setting of cloud-based computing, while enabling much faster query processing compared to the encryption-based approach. This demonstration will allow users to more intuitively understand the technical merits of the RASP approach via …


Document Retrieval Using Predication Similarity, Kalpa Gunaratna Aug 2014

Kno.e.sis Publications

Document retrieval has been an important research problem over many years in the information retrieval community. State-of-the-art techniques utilize various methods in matching documents to a given document including keywords, phrases, and annotations. In this paper, we propose a new approach for document retrieval that utilizes predications (subject-predicate-object triples) extracted from the documents. We represent documents as sets of predications. We measure the similarity between predications to compute the similarity between documents. Our approach utilizes the hierarchical information available in ontologies in computing concept-concept similarity, making the approach flexible. Predication-based document similarity is more precise and forms the basis for …


A Novel Web-Based Depth Video Rewind Approach Toward Fall Preventive Interventions In Hospitals, Moein Enayati, Tanvi Banerjee, Mihail Popescu, Marjorie Skubic, Marilyn J. Rantz Aug 2014

Kno.e.sis Publications

Falls in hospital rooms are considered a huge burden on healthcare costs. They can lead to injuries, extended length of stay, and increase in cost for both the patients and the hospital. They can also lead to emotional trauma for the patients and their families [1]. Having Microsoft Kinects installed in the hospital rooms to capture and process every movement in the room, we deployed our previously developed fall-detection system to detect naturally occurring falls, generate a real-time fall alarm and broadcast it to hospital nurses for immediate intervention. These systems also store a processed and reduced version …


Genetic Predictors Of Metabolic Side Effects Of Diuretic Therapy, Jorge L. Del Aguila Aug 2014

Dissertations & Theses (Open Access)

Thiazide diuretics are a recommended first-line monotherapy for hypertension (i.e., SBP > 140 mmHg or DBP > 90 mmHg). Even so, diuretics are associated with adverse metabolic side effects, such as hyperlipidemia, hyperglycemia, and hypokalemia, which increase the risk of developing type II diabetes. This thesis used three analytical strategies to identify and quantify genetic factors that contribute to the development of adverse metabolic effects due to thiazide diuretic treatment. I performed a genome-wide association study (GWAS) and meta-analysis of the change in fasting plasma glucose and triglycerides in response to HCTZ from two different clinical trials: the Pharmacogenomic Evaluation of Antihypertensive Responses …


Improvements On Segment Based Contours Method For DNA Microarray Image Segmentation, Yang Li Jul 2014

Doctoral Dissertations

The DNA microarray is an efficient biotechnology tool for scientists to measure the expression levels of large numbers of genes simultaneously. To obtain the gene expression, microarray image analysis needs to be conducted. Microarray image segmentation is a fundamental step in the microarray analysis process. Segmentation gives the intensities of each probe spot in the array image, and those intensities are used to calculate the gene expression in subsequent analysis procedures. Therefore, more accurate and efficient microarray image segmentation methods are being pursued all the time.

In this dissertation, we are making efforts to obtain more accurate image segmentation results. We …


Improving Structural Features Prediction In Protein Structure Modeling, Ashraf Yaseen Jul 2014

Computer Science Theses & Dissertations

Proteins play a vital role in the biological activities of all living species. In nature, a protein folds into a specific and energetically favorable three-dimensional structure which is critical to its biological function. Hence, there has been a great effort by researchers in both experimentally determining and computationally predicting the structures of proteins.

The current experimental methods of protein structure determination are complicated, time-consuming, and expensive. On the other hand, the sequencing of proteins is fast, simple, and relatively less expensive. Thus, the gap between the number of known sequences and the determined structures is growing, and is expected to …


Assisting Coordination During Crisis: A Domain Ontology Based Approach To Infer Resource Needs From Tweets, Shreyansh Bhatt, Hemant Purohit, Andrew J. Hampton, Valerie L. Shalin, Amit P. Sheth, John Flach Jun 2014

Kno.e.sis Publications

Ubiquitous social media during crises provides citizen reports on the situation, needs and supplies. Previous research extracts resource needs directly from the text (e.g. "Power cut to Coney Island and Brighton beach" indicates a power need). This approach assumes that citizens derive and write about specific needs from their observations, properly specified for the emergency response system, an assumption that is not consistent with general conversational behavior. In our study, Twitter messages (tweets) from Hurricane Sandy in 2012 clearly indicate power blackouts, but not their probable implications (e.g. loss of power to hospital life support systems). We use a domain …


Semantics-Enhanced Geoscience Interoperability, Analytics, And Applications, Krishnaprasad Thirunarayan, Amit P. Sheth Jun 2014

Kno.e.sis Publications

We present our research ideas for developing cyberinfrastructure for Geoscience applications developed in the context of the EarthCube initiative, and our NSF-sponsored work on incorporating spatial-temporal-thematic semantics for enhanced querying and feature extraction from sensor data streams.


Semantic Modelling Of Smart City Data, Stefan Bischof, Athanasios Karapantelakis, Cosmin-Septimiu Nechifor, Amit P. Sheth, Alessandra Mileo, Payam Barnaghi Jun 2014

Kno.e.sis Publications

Cities present an opportunity for rendering Web of Things-enabled services. According to the World Health Organization, population in cities will double by the middle of this century, while cities deal with increasingly pressing issues such as environmental sustainability, economic growth and citizen mobility. In this paper, we propose a discussion around the need for common semantic descriptions for smart city data to facilitate future services in "smart cities". We present examples of data that can be collected from cities, discuss issues around this data and put forward some preliminary thoughts for creating a semantic description model to describe and help …


Active Learning With Efficient Feature Weighting Methods For Improving Data Quality And Classification Accuracy, Justin Martineau, Lu Chen, Doreen Cheng, Amit P. Sheth Jun 2014

Kno.e.sis Publications

Many machine learning datasets are noisy with a substantial number of mislabeled instances. This noise yields sub-optimal classification performance. In this paper we study a large, low quality annotated dataset, created quickly and cheaply using Amazon Mechanical Turk to crowdsource annotations. We describe computationally cheap feature weighting techniques and a novel non-linear distribution spreading algorithm that can be used to iteratively and interactively correct mislabeled instances to significantly improve annotation quality at low cost. Eight different emotion extraction experiments on Twitter data demonstrate that our approach is just as effective as more computationally expensive techniques. Our techniques save a considerable …


Risk Prediction With Genomic Data, Bharati Jadhav May 2014

Theses

Genome wide association study (GWAS) is widely used with various machine learning algorithms to predict disease risk. This thesis investigates this widely used approach of GWAS using Single Nucleotide Polymorphism (SNP) genotype data and a novel approach of disease risk prediction with whole exome sequencing data, namely Whole Exome Wide Association Study (WEWAS). It further applies a discriminating machine learning algorithm, namely a Support Vector Machine (SVM) with different Kernel functions. For this study, only SNPs generated using genotyping technology, which focuses more on common variants, are used initially for disease prediction. Later, the whole exome data generated using Next …


Statistical Methods For Assessing Treatment Effects For Observational Studies, Kristopher C. Gardner 1984- May 2014

Electronic Theses and Dissertations

Though randomized clinical trials (RCTs) are the gold standard for comparing treatments, they are often infeasible or exclude clinically important subjects, or generally represent an idealized medical setting rather than real practice. Observational data provide an opportunity to study practice-based evidence, but also present challenges for analysis. Traditional statistical methods which are suitable for RCTs may be inadequate for observational studies. In this project, four of the most popular statistical methods for observational studies, namely ANCOVA, propensity score matching, regression with the propensity score as a covariate, and instrumental variables (IV), are investigated through application to MarketScan insurance claims data. …


Disease Name Extraction From Clinical Text Using Conditional Random Fields, Omid Ghiasvand May 2014

Theses and Dissertations

The aim of the research done in this thesis was to extract disease and disorder names from clinical texts. We utilized Conditional Random Fields (CRF) as the main method to label diseases and disorders in clinical sentences. We used other tools, such as MetaMap and the Stanford CoreNLP tool, to extract some crucial features. The MetaMap tool was used to identify names of diseases/disorders that are already in the UMLS Metathesaurus. Some other important features, such as lemmatized versions of words and POS tags, were extracted using the Stanford CoreNLP tool. Some more features were extracted directly from the UMLS Metathesaurus, …


Mining Contrast Subspaces, Lei Duan, Guanting Tang, Jian Pei, James Bailey, Guozhu Dong, Akiko Campbell, Changjie Tang May 2014

Kno.e.sis Publications

In this paper, we tackle a novel problem of mining contrast subspaces. Given a set of multidimensional objects in two classes C+ and C− and a query object o, we want to find top-k subspaces S that maximize the ratio of likelihood of o in C+ against that in C−. We demonstrate that this problem has important applications, and at the same time, is very challenging. It does not even allow polynomial time approximation. We present CSMiner, a mining method with various pruning techniques. CSMiner is substantially faster than the baseline method. Our …
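The objective described above can be written compactly. The notation below is assumed here for illustration and is not taken from the paper: $L_S(o \mid C)$ stands for the likelihood of the query object $o$ in class $C$ when only the dimensions in subspace $S$ are considered, and $LC_S(o)$ is a hypothetical name for the resulting ratio.

```latex
% Likelihood contrast of query object o in subspace S (assumed notation)
LC_S(o) = \frac{L_S(o \mid C_{+})}{L_S(o \mid C_{-})}

% The task is to return the k subspaces with the largest contrast:
\operatorname*{top\text{-}k}_{S \subseteq D,\; S \neq \emptyset} \; LC_S(o)
```

Here $D$ denotes the full set of dimensions, so the search space covers every non-empty subset of $D$.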


With Whom To Coordinate, Why And How In Ad-Hoc Social Media Communications During Crisis Response, Hemant Purohit, Shreyansh Bhatt, Andrew Hampton, Valerie L. Shalin, Amit P. Sheth, John M. Flach May 2014

Kno.e.sis Publications

During crises, affected people, well-wishers, and observers join social media communities to discuss the event. They often share useful information relevant to response coordination, for example, specific resource needs. However, responders face the challenge of massive data overload and lack the time to monitor social media traffic for important information. Analysis shows that only a small number of event-related conversations are actionable. Moreover, responders do not know which sources are trustworthy. To address these challenges, response teams may apply manual filtering methods, resulting in limited coverage and quality. We propose a framework and interface for extracting specific resource-related information …


The Association Between The IL-1 Pathway, Isaac C. Wun May 2014

Dissertations & Theses (Open Access)

Cutaneous malignant melanoma (CMM) is a potentially lethal malignancy that warrants attention and further research: its incidence is known to be increasing in the United States, exposure to UV light is known to be its most crucial risk factor, and family history of melanoma is also an important risk factor. Melanoma is an aggressive and lethal cancer in humans. There are an estimated 132,000 new melanoma cases annually worldwide, and the trend has doubled in the past 20 years. However, attempts to treat melanoma have encountered considerable resistance and remained ineffective. The …


Statistical Analysis Of Enhanced CTL Killing Activity Against Irradiated Tumor Cells, Catannian Sanogo Apr 2014

Georgia State Undergraduate Research Conference

No abstract provided.


Hierarchical Interest Graph From Tweets, Pavan Kapanipathi, Prateek Jain, Chitra Venkataramani, Amit P. Sheth Apr 2014

Kno.e.sis Publications

Industry and researchers have identified numerous ways to monetize microblogs for personalization and recommendation. A common challenge across these different works is the identification of user interests. Although techniques have been developed to address this challenge, a flexible approach that spans multiple levels of granularity in user interests has not been forthcoming. In this work, we focus on exploiting hierarchical semantics of concepts to infer richer user interests expressed as a Hierarchical Interest Graph. To create such graphs, we utilize users' tweets to first ground potential user interests to structured background knowledge such as the Wikipedia Category Graph. We then adapt …


Evaluating The Impact Of Genotype Errors On Rare Variant Tests Of Association, Kaitlyn Cook, Alejandra Benitez, Casey Fu, Nathan L. Tintle Apr 2014

Faculty Work Comprehensive List

The new class of rare variant tests has usually been evaluated assuming perfect genotype information. In reality, rare variant genotypes may be incorrect, and so rare variant tests should be robust to imperfect data. Errors and uncertainty in SNP genotyping are already known to dramatically impact statistical power for single marker tests on common variants and, in some cases, inflate the type I error rate. Recent results show that uncertainty in genotype calls derived from sequencing reads is dependent on several factors, including read depth, calling algorithm, number of alleles present in the sample, and the frequency at which an …


Stream Crossing Barrier Prioritization Methods For Increasing Eastern Brook Trout Habitat In The Little Androscoggin River Watershed, Michele Windsor Apr 2014

Thinking Matters Symposium Archive

Eastern Brook Trout (Salvelinus fontinalis) are an important cold water fishery in the state of Maine. While populations in Maine are relatively abundant, there has been a decline in some parts of the species' range due in part to loss of habitat connectivity. Brook trout require access to specific types of stream habitat for spawning, feeding, and seasonal thermal refuges. Stream crossing structures such as undersized, poorly installed, or blocked culverts, as well as small remnant dams, can create barriers to accessing important stream habitat for brook trout. A recent Fish Barrier/Culvert Survey in the Little Androscoggin River Watershed provided data about …