Open Access. Powered by Scholars. Published by Universities.®

Life Sciences Commons

2015

Bioinformatics

Articles 1 - 30 of 46

Full-Text Articles in Life Sciences

The Importance Of Physicochemical Characteristics And Nonlinear Classifiers In Determining HIV-1 Protease Specificity, Timmy Manning, Paul Walsh Dec 2015

Department of Biological Sciences Publications

This paper reviews recent research relating to the application of bioinformatics approaches to determining HIV-1 protease specificity, outlines outstanding issues, and presents a new approach to addressing these issues. Leading machine learning theory for the problem currently suggests that the direct encoding of the physicochemical properties of the amino acid substrates is not required for optimal performance. A number of amino acid encoding approaches which incorporate potentially relevant physicochemical properties of the substrate are identified, and are evaluated using a nonlinear task decomposition based neuroevolution algorithm. The results are evaluated, and compared against a recent benchmark set on a nonlinear …


A Novel Approach To Identify Shared Fragments In Drugs And Natural Products, Ashkay Balasubramanya, Ishwor Thapa, Dhundy Raj Bastola, Dario Ghersi Nov 2015

Interdisciplinary Informatics Faculty Proceedings & Presentations

Fragment-based approaches have now become an important component of the drug discovery process. At the same time, pharmaceutical chemists are more often turning to the natural world and its extremely large and diverse collection of natural compounds to discover new leads that can potentially be turned into drugs. In this study we introduce and discuss a computational pipeline to automatically extract statistically overrepresented chemical fragments in therapeutic classes, and search for similar fragments in a large database of natural products. By systematically identifying enriched fragments in therapeutic groups, we are able to extract and focus on few fragments that are …


Sequencing Techniques: A Comparison Assignment, Sarah O'Leary-Driscoll Oct 2015

Sequencing & Genome Mining

With your partner, create some sort of visual (table, map, chart, other, ask me!) that compares the main types of sequencing that we discussed, as well as two of the techniques considered 'next generation'.


Discussion Questions: Genome Mining, Sarah O'Leary-Driscoll Oct 2015

Sequencing & Genome Mining

No abstract provided.


Alignment Information, Sarah O'Leary-Driscoll Oct 2015

Sequence Alignments

Pairwise DNA alignment is frequently used to identify similar regions that may indicate functional or structural similarities between two sequences. It can also be used to show how exons and introns differ between sequences and whether those differences affect the final structure of the RNA after the DNA is processed within a cell.
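As a rough illustration of how a pairwise alignment can be scored, the sketch below computes a global alignment score with the standard Needleman-Wunsch dynamic programming recurrence. The function name and the scoring values (match, mismatch, gap) are assumptions chosen for the example and are not taken from the course materials.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the Needleman-Wunsch global alignment score of sequences a and b.

    The scoring values are illustrative assumptions, not recommended defaults.
    """
    # prev[j] is the best score for aligning the prefix of a processed so far
    # with the first j characters of b. Row 0 aligns "" with b[:j] using gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] with "" costs i gaps
        for j in range(1, len(b) + 1):
            # Three choices: align a[i-1] with b[j-1], or place a gap in b or in a.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("ACGT", "AGT"))
```

Only the score is returned here; recovering the aligned sequences themselves would require keeping the full table and tracing back through it.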


Alignment Outline, Sarah O'Leary-Driscoll Oct 2015

Sequence Alignments

No abstract provided.


2: Sequence Alignment Practice Activity, Sarah O'Leary-Driscoll Oct 2015

Sequence Alignments

Now that you have learned how to do the four basic sequence alignments (pairwise and multiple, for both nucleotide and protein sequences), select a gene or protein (it may be one that you've used before) and run each of these alignments.


Pt. 2: Presentation / Paper Guidelines, Sarah O'Leary-Driscoll Oct 2015

Research Project

The presentations for your project should follow the same format that the paper would, but in a much more abbreviated form; aim for 5-7 minutes.


Project Guidelines, Sarah O'Leary-Driscoll Oct 2015

Research Project

No abstract provided.


Pt. 1: Research Question & Background, Sarah O'Leary-Driscoll Oct 2015

Research Project

No abstract provided.


Primer Design Activity, Sarah O'Leary-Driscoll Oct 2015

Primer Design

No abstract provided.


Obtaining Genomic Sequence Practice, Sarah O'Leary-Driscoll Oct 2015

Introduction to NCBI

No abstract provided.


DNA Timeline And Poster Project, Sarah O'Leary-Driscoll Oct 2015

Genomics: Past & Future

The DNA timeline goes through many of the major discoveries that have driven our understanding of genetics since Mendel. Pick two scientists and create a PowerPoint slide poster (to be printed out on regular printer sized paper) that covers the following:


3: Genomics: Past & Future Bibliography, Sarah O'Leary-Driscoll Oct 2015

Genomics: Past & Future

No abstract provided.


Future Of Genomics: Presentations, Sarah O'Leary-Driscoll Oct 2015

Genomics: Past & Future

In his testimony to a House of Representatives sub-committee on health, director of the National Human Genome Research Institute, Francis S. Collins, said that the future of genomics had three main focal points:

"Genomics to Biology: The human genome sequence provides foundational information that now will allow development of a comprehensive catalog of all of the genome's components, determination of the function of all human genes, and deciphering of how genes and proteins work together in pathways and networks.

Genomics to Health: Completion of the human genome sequence offers a unique opportunity to understand the role of genetic factors in …


Database/Resource Acronyms, Sarah O'Leary-Driscoll Oct 2015

Course Information

No abstract provided.


What Is Bioinformatics?, Sarah O'Leary-Driscoll Oct 2015

Course Information

Bioinformatics has evolved into a full-fledged multidisciplinary subject that integrates developments in information and computer technology as applied to biotechnology and the biological sciences. Bioinformatics uses computer software tools for database creation, data management, data warehousing, data mining and global communication networking. Bioinformatics is the recording, annotation, storage, analysis, and searching/retrieval of nucleic acid sequence (genes and RNAs), protein sequence and structural information. This includes databases of the sequences and structural information as well as methods to access, search, visualize and retrieve the information. Bioinformatics concerns the creation and maintenance of databases of biological information whereby researchers can both access existing information …


Comprehensive Course Syllabus, Sarah O'Leary-Driscoll Oct 2015

Course Information

The bioinformatics seminar is focused on developing an understanding of the principles behind genomic analyses, developing skills using the different available bioinformatics programs, and becoming aware of the past developments and current research avenues that benefit from these types of analyses.


Glossary Of Bioinformatics Terms, National Human Genome Research Institute Oct 2015

Course Information

No abstract provided.


An Exploration Of The Phylogenetic Placement Of Recently Discovered Ultrasmall Archaeal Lineages, Jeffrey M. O'Brien Aug 2015

Honors Scholar Theses

In recent years, several new clades within the domain Archaea have been discovered. This is due in part to microbiological sampling of novel environments, and the increasing ability to detect and sequence uncultivable organisms through metagenomic analysis. These organisms share certain features, such as small cell size and streamlined genomes. Reduction in genome size can present difficulties to phylogenetic reconstruction programs. Since there is less genetic data to work with, these organisms often have missing genes in concatenated multiple sequence alignments. Evolutionary biologists have not reached a consensus on the placement of these lineages in the archaeal evolutionary tree. There …


The Role Of Visualization And 3-D Printing In Biological Data Mining, Talia L. Weiss, Amanda Zieselman, Douglas P. Hill, Solomon G. Diamond, Li Shen, Andrew J. Saykin, Jason H. Moore Aug 2015

Dartmouth Scholarship

Background:

Biological data mining is a powerful tool that can provide a wealth of information about patterns of genetic and genomic biomarkers of health and disease. A potential disadvantage of data mining is the volume and complexity of the results, which can often be overwhelming. It is our working hypothesis that visualization methods can greatly enhance our ability to make sense of data mining results. More specifically, we propose that 3-D printing has an important role to play as a visualization technology in biological data mining. We provide here a brief review of 3-D printing along with a case study to …


Investigating The Interaction Of AURKA And UBE2C In Colorectal Cancer Cells, Apurva M. Hegde Aug 2015

Dissertations & Theses (Open Access)

Colorectal cancer (CRC) is the third leading cause of cancer-related deaths in the US. Among the many genomic aberrations previously implicated in colorectal cancer, recurrent amplification of chromosome 20q is frequently associated with liver metastasis. Previous research in our lab identified a gene signature on chromosome 20q associated with colorectal cancer progression. In this study, one of the genes in the signature, the ubiquitin conjugating enzyme UBE2C, was identified through preliminary bioinformatics analysis as a candidate for further examination of its role in CRC progression. Co-expression analysis of UBE2C in tumor-normal datasets from the public database Oncomine revealed all the …


BayesMotif: De Novo Protein Sorting Motif Discovery From Impure Datasets, Jianjun Hu, F. Zhang Jun 2015

Jianjun Hu

Background

Protein sorting is the process by which newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals is amino acid sub-sequences usually located at the N-termini or C-termini of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals are needed to improve the understanding of protein sorting mechanisms.

Methods

We formulated the protein sorting motif discovery problem as a classification problem …


HemeBIND: A Novel Method For Heme Binding Residue Prediction By Combining Structural And Sequence Information, R. Liu, Jianjun Hu Jun 2015

Jianjun Hu

Background

Accurate prediction of binding residues involved in the interactions between proteins and small ligands is one of the major challenges in structural bioinformatics. Heme is an essential and commonly used ligand that plays critical roles in electron transfer, catalysis, signal transduction and gene expression. Although much effort has been devoted to the development of various generic algorithms for ligand binding site prediction over the last decade, no algorithm has been specifically designed to complement experimental techniques for identification of heme binding residues. Consequently, an urgent need is to develop a computational method for recognizing these important residues.

Results

Here …


Integrative Disease Classification Based On Cross-Platform Microarray Data, C.-C. Liu, Jianjun Hu, M. Kalakrishnan, H. Huang, X. Zhou Jun 2015

Jianjun Hu

Background

Disease classification has been an important application of microarray technology. However, most microarray-based classifiers can only handle data generated within the same study, since microarray data generated by different laboratories or with different platforms cannot be compared directly due to systematic variations. This issue has severely limited the practical use of microarray-based disease classification.

Results

In this study, we tested the feasibility of disease classification by integrating the large amount of heterogeneous microarray datasets from the public microarray repositories. Cross-platform data compatibility is created by deriving expression log-rank ratios within datasets. One may then compare vectors of log-rank …


Integrative Missing Value Estimation For Microarray Data, Jianjun Hu, H. Li, M. Waterman, X. Zhou Jun 2015

Jianjun Hu

Background

Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, their performance is unsatisfactory for datasets with high rates of missing data, high measurement noise, or limited numbers of samples. In fact, more than 80% of the time-series datasets in the Stanford Microarray Database contain fewer than eight samples.

Results

We present the integrative Missing Value Estimation method (iMISS) by incorporating information from multiple reference microarray datasets to improve missing value estimation. For each gene with missing data, we derive a consistent neighbor-gene list by taking reference data sets …


Bioinformatics For The Comparative Genomic Analysis Of The Cotton (Gossypium) Polyploid Complex, Justin Thomas Page Jun 2015

Theses and Dissertations

Understanding the composition, evolution, and function of the cotton (Gossypium) genome is complicated by the joint presence of two genomes in its nucleus (AT and DT genomes). Specifically, read-mapping (a fundamental part of next-generation sequence analysis) cannot adequately differentiate reads as belonging to one genome or the other. These two genomes were derived from progenitor A-genome and D-genome diploids involved in ancestral allopolyploidization. To better understand the allopolyploid genome, we developed PolyCat to categorize reads according to their genome of origin based on homoeo-SNPs that differentiate the two genomes. We re-sequenced the genomes of extant diploid relatives of tetraploid cotton …
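The idea of assigning reads to a genome of origin from homoeo-SNPs can be sketched as follows. This is a simplified illustration only, not PolyCat's actual implementation: the function name, the SNP table layout, the category labels, and the assumption of ungapped reads with a known 0-based start position are all hypothetical.

```python
def categorize_read(read_start, read_seq, homoeo_snps):
    """Assign a read to genome "A" or "D" from the homoeo-SNPs it overlaps.

    homoeo_snps is a hypothetical dict mapping a reference position to a
    tuple (A-genome base, D-genome base). The read is assumed to align
    without gaps, starting at read_start.
    """
    a_votes = 0
    d_votes = 0
    for offset, base in enumerate(read_seq):
        snp = homoeo_snps.get(read_start + offset)
        if snp is None:
            continue  # this position does not distinguish the two genomes
        a_base, d_base = snp
        if base == a_base:
            a_votes += 1
        elif base == d_base:
            d_votes += 1
    if a_votes and not d_votes:
        return "A"
    if d_votes and not a_votes:
        return "D"
    if a_votes and d_votes:
        return "conflict"  # evidence for both genomes within one read
    return "unassigned"  # no informative homoeo-SNP was covered


if __name__ == "__main__":
    snps = {10: ("A", "G"), 15: ("C", "T")}
    print(categorize_read(8, "TTATTTTC", snps))
```

Reads that cover no homoeo-SNP stay unassigned in this sketch, which reflects why read-mapping alone cannot separate the two genomes in regions where they are identical.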


Library Support For Biomedical Research In The Omics Era: 2014-2015 Report, Rolando Garcia-Milian May 2015

Rolando Garcia-Milian

The decreased cost of high-throughput technologies has enabled their use as the main research methods for studying biological processes and disorders. In order to understand the relevance of the data generated by these methods, the researcher needs to mine and integrate the enormous amount of biomedical information and knowledge contained in the text of the scientific literature and biomedical databases. Accordingly, the ability to access and examine molecular data should not be restricted to bioinformaticians or those with exceptional computer skills. In May 2014, the Cushing/Whitney Medical Library began to provide end-user bioinformatics support to the biomedical researchers of the Yale …


Can Collection Specimen Data Reveal Temporal Shifts Due To Climate Change?, Julie Maurer May 2015

Scholars Week

Climate change is altering the distribution, behavior, and migration patterns of many species. Typically, these responses are documented in studies in which standardized methods are used to collect population or behavioral data over several years. Multi-decade studies are rare and few predate the recent dramatic increase in global temperatures, limiting our ability to understand long-term consequences of climate change. Natural history (NH) collections offer a potential solution; they hold a wealth of species occurrence documentation spanning from decades to centuries. However, because the sampling of natural history collectors is spatially and temporally haphazard, it remains unclear whether NH data is useful …


Draft Genome Sequences Of Six Different Staphylococcus Epidermidis Clones, Isolated Individually From Preterm Neonates Presenting With Sepsis At Edinburgh's Royal Infirmary, Paul Walsh, M. Bekaert, J. Carroll, T. Manning, B. Kelly, A. O'Driscoll, X. Lu, C. Smith, P. Dickinson, K. Templeton, P. Ghazal, Roy D. Sleator May 2015

Department of Biological Sciences Publications

Herein, we report the draft genome sequences of six individual Staphylococcus epidermidis clones, cultivated from blood taken from different preterm neonatal sepsis patients at the Royal Infirmary, Edinburgh, Scotland, United Kingdom.