Open Access. Powered by Scholars. Published by Universities.®

Other Computer Sciences Commons

Life Sciences

Articles 1 - 30 of 72

Full-Text Articles in Other Computer Sciences

Self-Supervised Pretraining And Transfer Learning On fMRI Data With Transformers, Sean Paulsen Aug 2023

Dartmouth College Ph.D Dissertations

Transfer learning is a machine learning technique founded on the idea that knowledge acquired by a model during “pretraining” on a source task can be transferred to the learning of a target task. Successful transfer learning can result in improved performance, faster convergence, and reduced demand for data. This technique is particularly desirable for the task of brain decoding in the domain of functional magnetic resonance imaging (fMRI), wherein even the most modern machine learning methods can struggle to decode labelled features of brain images. This challenge is due to the highly complex underlying signal, physical and neurological differences between …


Novel Approach For Non-Invasive Prediction Of Body Shape And Habitus, Emma Young Jun 2023

Electronic Theses and Dissertations

While marker-based motion capture remains the gold standard in measuring human movement, accuracy is influenced by soft-tissue artifacts, particularly for subjects with high body mass index (BMI) where markers are not placed close to the underlying bone. Obesity influences joint loads and motion patterns, and BMI may not be sufficient to capture the distribution of a subject’s weight or to differentiate between subjects. Subjects in need of a joint replacement are more likely to have mobility issues or pain, which prevents exercise. Obesity also increases the likelihood of needing a total joint replacement. Accurate movement data for subjects with …


Creating Project Contrast: A Video Game Exploring Consciousness And Qualia, Pierce Papke May 2023

Honors Projects

Project Contrast is a video game that explores how the unique traits inherent to video games might engage reflective player responses to qualitative experience. Project Contrast does this through suspension of disbelief, avatar projection, presence, player agency in storytelling, visual perception, functional gameplay, and art. Considering the difficulty in researching qualitative experience due to its subjectivity and circular explanations, I created Project Contrast not to analyze qualia, though that was my original hope. I instead created Project Contrast as an avenue for player self-reflection and learning about qualitative experience. While video games might be just code and art on a …


DU Undergraduate Showcase: Research, Scholarship, And Creative Works, Caitlyn Aldersea, Justin Bravo, Sam Allen, Anna Block, Connor Block, Emma Buechler, Maria De Los Angeles Bustillos, Arianna Carlson, William Christensen, Olivia Kachulis, Noah Craver, Kate Dillon, Muskan Fatima, Angel Fernandes, Emma Finch, Colleen Cassidy, Amy Fishman, Andrea Francis, Stacia Fritz, Simran Gill, Emma Gries, Rylie Hansen, Shannon Powers, Jacqueline Martinez, Zachary Harker, Ashley Hasty, Mykaela Tanino-Springsteen, Kathleen Hopps, Adelaide Kerenick, Colin Kleckner, Ci Koehring, Elijah Kruger, Braden Krumholz, Maddie Leake, Lyneé Alves, Seraphina Loukas, Yatzari Lozano Vazquez, Haley Maki, Emily Martinez, Sierra Mckinney, Mykaela Tanino-Springsteen, Audrey Mitchell, Kipling Newman, Audrey Ng, Megan Lucyshyn, Andrew Nguyen, Stevie Ostman, Casandra Pearson, Alexandra Penney, Julia Gielczynski, Tyler Ball, Anna Rini, Christina Rorres, Simon Ruland, Helayna Schafer, Emma Sellers, Sarah Schuller, Claire Shaver, Kevin Summers, Isabella Shaw, Madison Sinar, Claudia Pena, Apshara Siwakoti, Carter Sorensen, Madi Sousa, Anna Sparling, Alexandra Revier, Brandon Thierry, Dylan Tyree, Maggie Williams, Lauren Wols May 2023

DU Undergraduate Research Journal Archive

DU Undergraduate Showcase: Research, Scholarship, and Creative Works


A Programmatic Geographic Information Systems Analysis Of Plant Hardiness Zones, Andrew Bowen May 2023

Electronic Theses and Dissertations

The Plant Hardiness Zone Map consists of thirteen geographical zones that describe whether a plant can survive based on average annual minimum temperatures. As climate change progresses, minimum temperatures in all regions are expected to change. This work programmatically evaluates predicted future climate projection data and converts it to United States Department of Agriculture-defined hardiness zones. Through the next 80 years, hardiness zones are projected to move poleward; in effect, colder zones will lose area and warmer zones will gain area globally. Some implications include changes in crop growing degree days, which could alter crop productivity, migration and settlement of …
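
The conversion from a minimum temperature to a hardiness zone can be sketched as follows. This is a minimal illustration and not the thesis's code: it assumes the standard USDA scheme, in which zone 1 begins at -60 °F, each of the thirteen zones spans 10 °F, and each zone is split into 5 °F half zones "a" and "b". The function names are hypothetical, and the Celsius helper assumes the projection data is given in degrees Celsius.

```python
import math


def celsius_to_fahrenheit(temp_c: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


def hardiness_zone(avg_annual_min_f: float) -> str:
    """Map an average annual minimum temperature (deg F) to a zone label.

    Assumption: zone 1a starts at -60 F and every half zone covers 5 F,
    so there are 26 half zones in total (1a through 13b).
    """
    # Count the 5 F steps above the lower bound of zone 1a.
    half_steps = math.floor((avg_annual_min_f + 60.0) / 5.0)
    # Temperatures outside the defined range are clamped to 1a or 13b.
    half_steps = max(0, min(25, half_steps))
    zone = half_steps // 2 + 1
    half = "a" if half_steps % 2 == 0 else "b"
    return f"{zone}{half}"


if __name__ == "__main__":
    for temp_c in (-40.0, -17.0, 5.0):
        temp_f = celsius_to_fahrenheit(temp_c)
        print(temp_c, "C ->", hardiness_zone(temp_f))
```

Applying a function like this to every grid cell of a projected minimum-temperature raster, once per future period, would show how zone boundaries shift over time.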


From Deep Mutational Mapping Of Allosteric Protein Landscapes To Deep Learning Of Allostery And Hidden Allosteric Sites: Zooming In On “Allosteric Intersection” Of Biochemical And Big Data Approaches, Gennady M. Verkhivker, Mohammed Alshahrani, Grace Gupta, Sian Xiao, Peng Tao Apr 2023

Mathematics, Physics, and Computer Science Faculty Articles and Research

The recent advances in artificial intelligence (AI) and machine learning have driven the design of new expert systems and automated workflows that are able to model complex chemical and biological phenomena. In recent years, machine learning approaches have been developed and actively deployed to facilitate computational and experimental studies of protein dynamics and allosteric mechanisms. In this review, we discuss in detail new developments along two major directions of allosteric research through the lens of data-intensive biochemical approaches and AI-based computational methods. Despite considerable progress in applications of AI methods for protein structure and dynamics studies, the intersection between allosteric …


Invasive Buckthorn Mapping: A UAV-Based Approach Utilizing Machine Learning, GIS, And Remote Sensing Techniques In The Upper Peninsula Of Michigan, Vikranth Madeppa Jan 2023

Dissertations, Master's Theses and Master's Reports

An invasive species is a species that is alien or non-native to the ecosystem and causes harm to economic, environmental, or human health (E.O. 13112 of Feb 3, 1999). Invasive species have posed a serious threat to ecosystems across the globe. These invasive species have impacts on the biodiversity and productivity of invaded forests. Remotely sensed data is a valuable resource for understanding and addressing issues related to invasive species. This study presents a novel approach for mapping the distribution of two invasive plant species, Common and Glossy Buckthorn, using unmanned aerial vehicles (UAVs), machine learning algorithms, geographic information systems …


The Significance Of Sonic Branding To Strategically Stimulate Consumer Behavior: Content Analysis Of Four Interviews From Jeanna Isham’s “Sound In Marketing” Podcast, Ina Beilina May 2022

Student Theses and Dissertations

Purpose:
Sonic branding is not just about composing jingles like McDonald’s “I’m Lovin’ It.” Sonic branding is an industry that strategically designs a cohesive auditory component of a brand’s corporate identity. This paper examines the psychological impact of music and sound on consumer behavior reviewing studies from the past 40 years and investigates the significance of stimulating auditory perception by infusing sound in consumer experience in the modern 2020s.

Design/methodology/approach:
Qualitative content analysis of audio media was used to test two hypotheses. Four archival oral interview recordings from Jeanna Isham’s podcast “Sound in Marketing” featuring the sonic branding experts …


Subjective Information And Survival In A Simulated Biological System, Tyler S. Barker, Massimiliano Pierobon, Peter J. Thomas Apr 2022

School of Computing: Faculty Publications

Information transmission and storage have gained traction as unifying concepts to characterize biological systems and their chances of survival and evolution at multiple scales. Despite the potential for an information-based mathematical framework to offer new insights into life processes and ways to interact with and control them, the main legacy is that of Shannon’s, where a purely syntactic characterization of information scores systems on the basis of their maximum information efficiency. The latter metrics seem not entirely suitable for biological systems, where transmission and storage of different pieces of information (carrying different semantics) can result in different chances of survival. …


Dissecting Mutational Allosteric Effects In Alkaline Phosphatases Associated With Different Hypophosphatasia Phenotypes: An Integrative Computational Investigation, Fei Xiao, Ziyun Zhou, Xingyu Song, Mi Gan, Jie Long, Gennady M. Verkhivker, Guang Hu Mar 2022

Mathematics, Physics, and Computer Science Faculty Articles and Research

Hypophosphatasia (HPP) is a rare inherited disorder characterized by defective bone mineralization and is highly variable in its clinical phenotype. The disease occurs due to various loss-of-function mutations in ALPL, the gene encoding tissue-nonspecific alkaline phosphatase (TNSALP). In this work, a data-driven and biophysics-based approach is proposed for the large-scale analysis of ALPL mutations, from nonpathogenic to severe HPPs. By using a pipeline of synergistic approaches including sequence-structure analysis, network modeling, elastic network models and atomistic simulations, we characterized allosteric signatures and effects of the ALPL mutations on protein dynamics and function. Statistical analysis of molecular features computed for the …


Automated Parsing Of Flexible Molecular Systems Using Principal Component Analysis And K-Means Clustering Techniques, Matthew J. Nwerem Aug 2021

Computational and Data Sciences (MS) Theses

Computational investigation of molecular structures and reactions of biological and pharmaceutical interest remains a grand scientific challenge due to the size and conformational flexibility of these systems. The work requires parsing and analyzing thousands of conformations in each molecular state for meaningful chemical information and subjecting the ensemble to costly quantum chemical calculations. The status quo typically involves a manual process where the investigator must look at each conformation, separating each into structural families. This process is time-intensive and tedious, making it infeasible in some cases and limiting the ability of theoreticians to study these systems. However, the …


Analysis Of The SLO Bay Microbiome From A Network Perspective, Lien Viet Nguyen Jul 2021

Master's Theses

Microorganisms are key players in ecosystem functioning. In this thesis, we developed a framework to preprocess raw microbiome data, build a correlation network, and analyze co-occurrence patterns between microbes. We then applied this framework to a marine microbiome dataset. The dataset used in this study comes from a year-long time-series to characterize the microbial communities in our coastal waters off the Cal Poly Pier. In analyzing this dataset, we were able to observe and confirm previously discovered patterns of interactions and generate hypotheses about new patterns. The analysis of co-occurrences between prokaryotic and eukaryotic taxa is relatively novel and …
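
Building a correlation network from abundance data can be sketched as below. This is only an illustration under stated assumptions: the abstract does not say which correlation measure or cutoff the thesis uses, so Pearson correlation, the 0.8 threshold, the taxon names, and the toy counts are all hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no co-occurrence signal
    return cov / math.sqrt(var_x * var_y)


def correlation_network(abundance, threshold=0.8):
    """Return {(taxon_a, taxon_b): r} for pairs with |r| >= threshold.

    `abundance` maps each taxon name to its abundance at every time point.
    """
    edges = {}
    for a, b in combinations(sorted(abundance), 2):
        r = pearson(abundance[a], abundance[b])
        if abs(r) >= threshold:
            edges[(a, b)] = r
    return edges


# Hypothetical abundances over five sampling dates.
toy = {
    "taxon_A": [1, 2, 3, 4, 5],
    "taxon_B": [2, 4, 6, 8, 10],
    "taxon_C": [5, 4, 3, 2, 1],
    "taxon_D": [3, 1, 4, 1, 5],
}
network = correlation_network(toy)

if __name__ == "__main__":
    for pair, r in network.items():
        print(pair, round(r, 3))
```

Positive edges suggest taxa that rise and fall together, and negative edges suggest taxa that alternate. Real microbiome counts are compositional and sparse, so a production pipeline would normally normalize the data before this step.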


Knowing What We Know: Leveraging Community Knowledge Through Automated Text-Mining, Justin Gardner, Jonathan Tory Toole, Hemant Kalia, Garry Spink Jr., Gordon Broderick May 2021

Advances in Clinical Medical Research and Healthcare Delivery

No abstract provided.


Trunctrimmer: A First Step Towards Automating Standard Bioinformatic Analysis, Z. Gunner Lawless, Dana Dittoe, Dale R. Thompson, Steven C. Ricke May 2021

Computer Science and Computer Engineering Undergraduate Honors Theses

Bioinformatic analysis is a time-consuming process for labs performing research on various microbiomes. Researchers use tools like Qiime2 to help standardize the bioinformatic analysis methods, but even large, extensible platforms like Qiime2 have drawbacks due to the attention required by researchers. In this project, we propose to automate additional standard lab bioinformatic procedures by eliminating the existing manual process of determining the trim and truncate locations for paired end 2 sequences. We introduce a new Qiime2 plugin called TruncTrimmer to automate the process that usually requires the researcher to make a decision on where to trim and truncate manually after …


IoT Based Agriculture 4.0: Challenges And Opportunities, Halimjon Khujamatov, Temur Toshtemirov Mr., Doston Turayevich Khasanov Mr., Nasiba Saburova Ms., Ilhom Ikromovich Xamroyev Mr. Apr 2021

Bulletin of TUIT: Management and Communication Technologies

In recent years, the world's population growth has been intensifying, resulting in specific problems related to the depletion of natural resources, food shortages, declining fertile lands, and changing weather conditions. This paper discusses the use of IoT technology as a solution to such problems.

At the same time, the emergence of IoT technology has given rise to a new research direction in agriculture. Soil analysis and monitoring using Zigbee wireless sensor network technology, which is part of the IoT, will enable the creation of an IoT ecosystem as well as the development of smart agriculture. In addition, entrepreneurship, …


Ensemble Protein Inference Evaluation, Kyle Lee Lucke Jan 2021

Graduate Student Theses, Dissertations, & Professional Papers

Protein inference is becoming an increasingly important tool that aids in the characterization of complex proteomes and analysis of complex protein samples. In bottom-up shotgun proteomics experiments the metrics for evaluation (like AUC and calibration error) are based on an often imperfect target-decoy database. These metrics make the inherent assumption that all of the proteins in the target set are present in the sample being analyzed. In general, this is not the case; the target set is typically a mix of present and absent proteins. To objectively evaluate inference methods, protein standard datasets are used. These datasets are special in …


Functional Morphology Of Gliding Flight Ii. Morphology Follows Predictions Of Gliding Performance, Jonathan Rader, Tyson L. Hedrick, Yanyan He, Lindsay D. Waldrop Sep 2020

Biology, Chemistry, and Environmental Sciences Faculty Articles and Research

The evolution of wing morphology among birds, and its functional consequences, remains an open question, despite much attention. This is in part because the connection between form and function is difficult to test directly. To address this deficit, in prior work we used computational modeling and sensitivity analysis to interrogate the impact of altering wing aspect ratio, camber, and Reynolds number on aerodynamic performance, revealing the performance landscapes that avian evolution has explored. In the present work, we used a dataset of three-dimensionally scanned bird wings coupled with the performance landscapes to test two hypotheses regarding the evolutionary diversification of …


Computational Methods For Predicting Protein-Protein Interactions And Binding Sites, Yiwei Li Aug 2020

Electronic Thesis and Dissertation Repository

Proteins are essential to organisms and participate in virtually every process within cells. Quite often, they keep the cells functioning by interacting with other proteins. This process is called protein-protein interaction (PPI). The bonding amino acid residues during the process of protein-protein interactions are called PPI binding sites. Identifying PPIs and PPI binding sites are fundamental problems in systems biology.

Experimental methods for solving these two problems are slow and expensive. Therefore, great efforts are being made towards increasing the performance of computational methods.

We present DELPHI, a deep learning based program for PPI site prediction and SPRINT, an algorithmic …


Functional Morphology Of Gliding Flight I. Modeling Reveals Distinct Performance Landscapes Based On Soaring Strategies, Lindsay D. Waldrop, Yanyan He, Tyson L. Hedrick, Jonathan Rader Aug 2020

Biology, Chemistry, and Environmental Sciences Faculty Articles and Research

The physics of flight influences the morphology of bird wings through natural selection on flight performance. The connection between wing morphology and performance is unclear due to the complex relationships between various parameters of flight. In order to better understand this connection, we present a holistic analysis of gliding flight that preserves complex relationships between parameters. We use a computational model of gliding flight, along with analysis by uncertainty quantification, to 1) create performance landscapes of gliding based on output metrics (maximum lift-to-drag ratio, minimum gliding angle, minimum sinking speed, lift coefficient at minimum sinking speed); and 2) predict what …


Causality In Microbiomes, Md Musfiqur Rahman Sazal Jul 2020

FIU Electronic Theses and Dissertations

No abstract provided.


Allosteric Regulation At The Crossroads Of New Technologies: Multiscale Modeling, Networks, And Machine Learning, Gennady M. Verkhivker, Steve Agajanian, Guang Hu, Peng Tao Jul 2020

Mathematics, Physics, and Computer Science Faculty Articles and Research

Allosteric regulation is a common mechanism employed by complex biomolecular systems for regulation of activity and adaptability in the cellular environment, serving as an effective molecular tool for cellular communication. As an intrinsic but elusive property, allostery is a ubiquitous phenomenon where binding or disturbing of a distal site in a protein can functionally control its activity and is considered as the “second secret of life.” The fundamental biological importance and complexity of these processes require a multi-faceted platform of synergistically integrated approaches for prediction and characterization of allosteric functional states, atomistic reconstruction of allosteric regulatory mechanisms and discovery of …


Multi-Label Model For Toxicity Prediction, Xiu Huan Yap, Michael L. Raymer Apr 2020

Symposium of Student Research, Scholarship, and Creative Activities Materials

Most computational predictive models are specifically trained for a single toxicity endpoint. Since more than 1300 toxicity assays have been reported in the TOXCAST dashboard, achieving high coverage over this growing number of toxicity endpoints remains challenging. Furthermore, single-endpoint models lack the ability to learn dependencies between endpoints, such as those targeting similar biological pathways, which may be used to boost model performance. In this study, we characterize the performance of 3 multi-label classification (MLC) models, namely Classifier Chains (CC), Label Powersets (LP) and Stacking (SBR), on Tox21 challenge data. These MLC models employ the Problem Transformation approach, which is …


De Novo Sequencing And Analysis Of Salvia Hispanica Tissue-Specific Transcriptome And Identification Of Genes Involved In Terpenoid Biosynthesis, James Wimberley, Joseph Cahill, Hagop S. Atamian Mar 2020

Biology, Chemistry, and Environmental Sciences Faculty Articles and Research

Salvia hispanica (commonly known as chia) is gaining popularity worldwide as a healthy food supplement due to its low saturated fatty acid and high polyunsaturated fatty acid content, in addition to being rich in protein, fiber, and antioxidants. Chia leaves contain a plethora of secondary metabolites with medicinal properties. In this study, we sequenced chia leaf and root transcriptomes using the Illumina platform. The short reads were assembled into contigs using the Trinity software and annotated against the Uniprot database. The reads were de novo assembled into 103,367 contigs, which represented 92.8% transcriptome completeness and a diverse set of Gene Ontology …


Function And Dissipation In Finite State Automata - From Computing To Intelligence And Back, Natesh Ganesh Oct 2019

Doctoral Dissertations

Society has benefited from the technological revolution and the tremendous growth in computing powered by Moore's law. However, we are fast approaching the ultimate physical limits in terms of both device sizes and the associated energy dissipation. It is important to characterize these limits in a physically grounded and implementation-agnostic manner, in order to capture the fundamental energy dissipation costs associated with performing computing operations with classical information in nano-scale quantum systems. It is also necessary to identify and understand the effect of quantum indistinguishability, noise, and device variability on these dissipation limits. Identifying these parameters is crucial to designing …


Effective Statistical Energy Function Based Protein Un/Structure Prediction, Avdesh Mishra Aug 2019

University of New Orleans Theses and Dissertations

Proteins are an important component of living organisms, composed of one or more polypeptide chains, each containing hundreds or even thousands of amino acids of 20 standard types. The structure of a protein from the sequence determines crucial functions of proteins such as initiating metabolic reactions, DNA replication, cell signaling, and transporting molecules. In the past, proteins were considered to always have a well-defined stable shape (structured proteins); however, it has recently been shown that there exist intrinsically disordered proteins (IDPs), which lack a fixed or ordered 3D structure, have dynamic characteristics and therefore exist in multiple states. Based on …


Integration Of Random Forest Classifiers And Deep Convolutional Neural Networks For Classification And Biomolecular Modeling Of Cancer Driver Mutations, Steve Agajanian, Odeyemi Oluyemi, Gennady M. Verkhivker Jun 2019

Mathematics, Physics, and Computer Science Faculty Articles and Research

Development of machine learning solutions for prediction of functional and clinical significance of cancer driver genes and mutations is paramount in modern biomedical research and has gained significant momentum in the past decade. In this work, we integrate different machine learning approaches, including tree based methods, random forest and gradient boosted tree (GBT) classifiers along with deep convolutional neural networks (CNN) for prediction of cancer driver mutations in the genomic datasets. The feasibility of CNN in using raw nucleotide sequences for classification of cancer driver mutations was initially explored by employing label encoding, one hot encoding, and embedding to …


High-Performance Computing Frameworks For Large-Scale Genome Assembly, Sayan Goswami Jun 2019

LSU Doctoral Dissertations

Genome sequencing technology has witnessed tremendous progress in terms of throughput and cost per base pair, resulting in an explosion in the size of data. Typical de Bruijn graph-based assembly tools demand a lot of processing power and memory and cannot assemble big datasets unless running on a scaled-up server with terabytes of RAM or a scaled-out cluster with several dozens of nodes. In the first part of this work, we present a distributed next-generation sequence (NGS) assembler called Lazer that achieves both scalability and memory efficiency by using partitioned de Bruijn graphs. By enhancing the memory-to-disk swapping and reducing the …
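
The idea of a partitioned de Bruijn graph can be sketched in a few lines. This is a generic illustration and not Lazer's implementation: the abstract does not describe its partitioning rule, so assigning each k-mer to a partition by a CRC32 checksum is an assumption, as are all function names. The sketch also ignores reverse complements and sequencing errors, which real assemblers must handle.

```python
import zlib
from collections import defaultdict


def kmers(read, k):
    """Yield every length-k substring of a read, left to right."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def partition_of(kmer, num_partitions):
    """Assign a k-mer to a partition (assumed rule: CRC32 modulo count)."""
    return zlib.crc32(kmer.encode("ascii")) % num_partitions


def build_partitioned_graph(reads, k, num_partitions):
    """Build one adjacency map per partition.

    Each k-mer contributes an edge from its (k-1)-mer prefix to its
    (k-1)-mer suffix, stored in the partition that owns the k-mer.
    """
    partitions = [defaultdict(set) for _ in range(num_partitions)]
    for read in reads:
        for kmer in kmers(read, k):
            prefix, suffix = kmer[:-1], kmer[1:]
            owner = partition_of(kmer, num_partitions)
            partitions[owner][prefix].add(suffix)
    return partitions


def merge(partitions):
    """Combine all partitions into a single adjacency map."""
    graph = defaultdict(set)
    for part in partitions:
        for node, successors in part.items():
            graph[node] |= successors
    return dict(graph)


if __name__ == "__main__":
    reads = ["ACGTAC", "CGTACG"]
    parts = build_partitioned_graph(reads, k=3, num_partitions=4)
    print(merge(parts))
```

Because each partition can be built and stored independently, no single machine needs to hold the whole graph in memory, which is the property that makes partitioning attractive for large datasets.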


Designing Computational Biology Workflows With Perl - Part 1, Esma Yildirim May 2019

Open Educational Resources

This material introduces Linux file system structures and demonstrates how to use commands to communicate with the operating system through a Terminal program. Basic program structures and the system() function of Perl are discussed. A brief introduction to gene-sequencing terminology and file formats is given.


Designing Computational Biology Workflows With Perl - Part 1, Esma Yildirim May 2019

Open Educational Resources

This material introduces the AWS console interface, describes how to create an instance on AWS with the VMI provided, and shows how to connect to that machine instance using the SSH protocol. Once connected, it requires the students to write a script to enter the data folder, which includes gene-sequencing input files, and print the first five lines of each file remotely. The same exercise can be applied if the VMI is installed on a local machine using virtualization software (e.g. Oracle VirtualBox). In this case, the Terminal program of the VMI can be used to do the exercise.


Designing Computational Biology Workflows With Perl - Part 2, Esma Yildirim May 2019

Open Educational Resources

This material introduces the AWS console interface, describes how to create an instance on AWS with the VMI provided and connect to that machine instance using the SSH protocol. Once connected, it requires the students to write a script to automate the tasks to create VCF files from two different sample genomes belonging to E.coli microorganisms by using the FASTA and FASTQ files in the input folder of the virtual machine. The same exercise can be applied if the VMI is installed on a local machine using virtualization software (e.g. Oracle VirtualBox). In this case, the Terminal program of the …