Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Articles 1 - 18 of 18

Full-Text Articles in Engineering

Novel Natural Language Processing Models For Medical Terms And Symptoms Detection In Twitter, Farahnaz Golrooy Motlagh Jan 2022

Browse all Theses and Dissertations

This dissertation focuses on disambiguating language use on Twitter about drug use, types of drug consumption, and drug legalization, using ontology-enhanced approaches and data-driven prediction analysis built on novel NLP models. Three technical aims comprise this work: (a) leveraging pattern recognition techniques to improve the quality and quantity of crawled Twitter posts related to drug abuse; (b) using an expert-curated, domain-specific DsOn ontology model that improves knowledge extraction in the form of drug-to-symptom and drug-to-side-effect relations; and (c) modeling the prediction of public perception of drug legalization and the sentiment analysis of drug consumption on Twitter. We collected …


Using Deep Learning To Analyze Materials In Medical Images, Carson Molder May 2021

Computer Science and Computer Engineering Undergraduate Honors Theses

Modern deep learning architectures have become increasingly popular in medicine, especially for analyzing medical images. In some medical applications, deep learning image analysis models have been more accurate at predicting medical conditions than experts. Deep learning has also been effective for material analysis on photographs. We aim to leverage deep learning to perform material analysis on medical images. Because material datasets for medicine are scarce, we first introduce a texture dataset generation algorithm that automatically samples desired textures from annotated or unannotated medical images. Second, we use a novel Siamese neural network called D-CNN to predict patch similarity and build …


Formal Concept Analysis Applications In Bioinformatics, Sarah Roscoe Nov 2020

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Bioinformatics is an important field that seeks to solve biological problems with the help of computation. One subfield of bioinformatics is genomics, the study of genes and their functions. Genomics can provide valuable analysis of how genes interact with their environment. One way to measure this interaction is through gene expression data, which indicates whether (and how much) a certain gene activates in a given situation. Analyzing this data can be critical for predicting diseases or other biological reactions. One method used for analysis is Formal Concept Analysis (FCA), a computing technique based …


Application Of Cosine Similarity In Bioinformatics, Srikanth Maturu May 2018

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Finding similar sequences to an input query sequence (DNA or proteins) from a sequence data set is an important problem in bioinformatics. It provides researchers an intuition of what could be related or how the search space can be reduced for further tasks. An exact brute-force nearest-neighbor algorithm used for this task has complexity O(m * n) where n is the database size and m is the query size. Such an algorithm faces time-complexity issues as the database and query sizes increase. Furthermore, the use of alignment-based similarity measures such as minimum edit distance adds an additional complexity to the …
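The brute-force search described above can be sketched as follows. This is a minimal illustration, not the thesis's method: the excerpt does not say how sequences are turned into vectors, so the k-mer count representation and the helper names `kmer_counts`, `cosine_similarity`, and `nearest` are assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count overlapping k-mers in a sequence (assumed vectorization)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[kmer] for kmer, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction
    return dot / (norm_a * norm_b)


def nearest(query, database, k=3):
    """Brute-force scan: score the query against every database entry."""
    q = kmer_counts(query, k)
    return max(database, key=lambda s: cosine_similarity(q, kmer_counts(s, k)))
```

Every query is still compared with every database entry, so the scan keeps the cost that grows with database size; the sketch only replaces an alignment-based score with a cheaper vector comparison.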


Integrative Pathway Analysis Pipeline For miRNA And mRNA Data, Diana Mabel Diaz Herrera Jan 2017

Wayne State University Theses

The identification of pathways that are involved in a particular phenotype helps us understand the underlying biological processes. Traditional pathway analysis techniques aim to infer the impact on individual pathways using only mRNA levels. However, recent studies have shown that gene expression alone is unable to capture the whole picture of biological phenomena. At the same time, microRNAs (miRNAs) are newly discovered gene regulators that have been shown to play an important role in the diagnosis and prognosis of different types of diseases. Current pathway analysis techniques do not take miRNAs into consideration. In this project, we investigate the effect of integrating miRNA …


Using Latent Semantic Analysis For Automated Keyword Extraction From Large Document Corpora, Tuğba Önal Süzek Jan 2017

Turkish Journal of Electrical Engineering and Computer Sciences

In this study, we describe a keyword extraction technique that uses latent semantic analysis (LSA) to identify semantically important single topic words or keywords. We compare our method against two other automated keyword extractors, Tf-idf (term frequency-inverse document frequency) and Metamap, using human-annotated keywords as a reference. Our results suggest that the LSA-based keyword extraction method performs comparably to the other techniques. Therefore, in an incremental update setting, the LSA-based method may be preferable to existing keyword extraction methods for extracting keywords from text descriptions in big data.


Protein Fold Classification With Grow-And-Learn Network, Özlem Polat, Zümray Dokur Jan 2017

Turkish Journal of Electrical Engineering and Computer Sciences

Protein fold classification is an important subject in computational biology and a compelling problem from the standpoint of machine learning. To deal with this challenging problem, we propose a method for classifying protein folds using a Grow-and-Learn (GAL) neural network together with the one-versus-others (OvO) method. To classify the 27 most common protein folds, 125-dimensional data, constituted by the physicochemical properties of amino acids, are used. The study is conducted on a database of 694 proteins: 311 of these proteins are used for training and 383 for testing. Overall, the classification system …


Novel Dynamic Partial Reconfiguration Implementations Of The Support Vector Machine Classifier On FPGA, Hanaa Hussain, Khaled Benkrid, Hüseyin Şeker Jan 2016

Turkish Journal of Electrical Engineering and Computer Sciences

The support vector machine (SVM) is one of the highly powerful classifiers that have been shown to be capable of dealing with high-dimensional data. However, its complexity increases requirements of computational power. Recent technologies including the postgenome data of high-dimensional nature add further complexity to the construction of SVM classifiers. In order to overcome this problem, hardware implementations of the SVM classifier have been proposed to benefit from parallelism to accelerate the SVM. On the other hand, those implementations offer limited flexibility in terms of changing parameters and require the reconfiguration of the whole device. The latter interrupts the operation …


Evaluation Of The Signature Molecular Descriptor With BLOSUM62 And An All-Atom Description For Use In Sequence Alignment Of Proteins, Lindsay M. Aichinger Jan 2015

Williams Honors College, Honors Research Projects

This Honors Project focused on a few aspects of this topic. The second is comparing the molecular signature kernels to three of the BLOSUM matrices (30, 62, and 90) to test the accuracy of the mathematical model. The kernel matrix was manipulated in order to improve the relationship by focusing on side groups and also by changing how the structure was represented in the matrix by increasing the initial height distance from the central atom (Height 1 and Height 2 included).

There were multiple design constraints for this project. The first was the comparison with the BLOSUM matrices (30, 62, …


ReGen: Optimizing Genetic Selection Algorithms For Heterogeneous Computing, Scott Kenneth Swinkleb Winkleblack Jun 2014

Master's Theses

GenSel is a genetic selection analysis tool used to determine which genetic markers are informative for a given trait. Performing genetic selection analyses is a time-consuming and computationally expensive task. Due to an expected increase in the number of genotyped individuals, analysis times will increase dramatically. Therefore, optimization efforts must be made to keep analysis times reasonable.

This thesis focuses on optimizing one of GenSel’s underlying algorithms for heterogeneous computing. The resulting algorithm exposes task-level parallelism and data-level parallelism present but inaccessible in the original algorithm. The heterogeneous computing solution, ReGen, outperforms the optimized CPU implementation achieving a …


Computational Methods For Comparative Non-Coding RNA Analysis: From Structural Motif Identification To Genome-Wide Functional Classification, Cuncong Zhong Jan 2013

Electronic Theses and Dissertations

Recent advances in biological research point out that many ribonucleic acids (RNAs) are transcribed from the genome to perform a variety of cellular functions, rather than merely acting as information carriers for protein synthesis. These RNAs are usually referred to as the non-coding RNAs (ncRNAs). The versatile regulation mechanisms and functionalities of the ncRNAs contribute to the amazing complexity of the biological system. The ncRNAs perform their biological functions by folding into specific structures. In this case, the comparative study of the ncRNA structures is key to the inference of their molecular and cellular functions. We are especially interested in …


System Designs To Perform Bioinformatics Sequence Alignment, Çağlar Yilmaz, Mustafa Gök Jan 2013

Turkish Journal of Electrical Engineering and Computer Sciences

The emerging field of bioinformatics uses computing as a tool to understand biology. Biological data of organisms (nucleotide and amino acid sequences) are stored in databases that contain billions of records. In order to process the vast amount of data in a reasonable time, high-performance analysis systems are developed. The main operation shared by the analysis tools is the search for matching patterns between sequences of data (sequence alignment). In this paper, we present 2 systems that can perform pairwise and multiple sequence alignment operations. Through the optimized design methods, proposed systems achieve up to 3.6 times more performance compared …


An Automated Signal Alignment Algorithm Based On Dynamic Time Warping For Capillary Electrophoresis Data, Fethullah Karabiber Jan 2013

Turkish Journal of Electrical Engineering and Computer Sciences

Correcting retention time variation and measuring the similarity of time series are among the most common challenges in analyzing capillary electrophoresis (CE) data. In this study, an automated signal alignment method is proposed by modifying the dynamic time warping (DTW) approach to align the time-series data. Preprocessing tools and further optimizations were developed to increase the performance of the algorithm. As a demonstrative case study, the developed algorithm is applied to the analysis of CE data from a selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) evaluation of the RNA secondary structure. The time-shift problem …
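For orientation, the textbook form of dynamic time warping can be sketched as below. This is the classic unmodified recurrence, not the modified method the abstract proposes; the function name `dtw_distance` and the absolute-difference local cost are assumptions chosen for illustration.

```python
def dtw_distance(x, y):
    """Classic DTW cost between two numeric series.

    cost[i][j] holds the cheapest cumulative cost of aligning the first
    i points of x with the first j points of y.
    """
    inf = float("inf")
    n, m = len(x), len(y)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # x advances, y repeats a point
                cost[i][j - 1],      # y advances, x repeats a point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Because a point in one series may be matched to several points in the other, a signal and a time-shifted copy of it can align at zero cost, which is the property that makes DTW a natural starting point for retention-time correction.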


The Gel Documentation System: A Cornerstone To The Implementation Of The Introduction To Biotechnology And Introduction To Bioinformatics Cross-Disciplinary Course Series, Marcy Kelly, Gregory Lampard, Constance Knapp Jun 2010

Cornerstone 3 Reports : Interdisciplinary Informatics

No abstract provided.


New Computational Approaches For Multiple RNA Alignment And RNA Search, Daniel Deblasio Jan 2009

Electronic Theses and Dissertations

In this thesis we explore the theory and history behind RNA alignment. Normal sequence alignments as studied by computer scientists can be completed in O(n^2) time in the naive case. The process involves taking two input sequences and finding the list of edits that can transform one sequence into the other. This process is applied to biology in many forms, such as the creation of multiple alignments and the search of genomic sequences. When you take into account the RNA sequence structure, the problem becomes even harder. Multiple RNA structure alignment is particularly challenging because covarying mutations make sequence …
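The quadratic-time pairwise case mentioned above can be sketched with the standard edit-distance dynamic program. This is a generic illustration under assumed unit costs for insertion, deletion, and substitution; it returns only the number of edits, not the edit list, and the name `edit_distance` is hypothetical rather than taken from the thesis.

```python
def edit_distance(a, b):
    """Minimum number of single-character edits turning a into b.

    Fills one row of the dynamic-programming table at a time, so the
    running time is O(len(a) * len(b)) and memory is O(len(b)).
    """
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]
```

For two sequences of length n the table has about n^2 cells, each filled in constant time, which is where the quadratic bound comes from.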


Improving Remote Homology Detection Using A Sequence Property Approach, Gina Marie Cooper Jan 2009

Browse all Theses and Dissertations

Understanding the structure and function of proteins is a key part of understanding biological systems. Although proteins are complex biological macromolecules, they are made up of only 20 basic building blocks known as amino acids. The makeup of a protein can be described as a sequence of amino acids. One of the most important tools in modern bioinformatics is the ability to search for biological sequences (such as protein sequences) that are similar to a given query sequence. There are many tools for doing this (Altschul et al., 1990, Hobohm and Sander, 1995, Thomson et al., 1994, Karplus and Barrett, …


Algorithmic Techniques Employed In The Isolation Of Codon Usage Biases In Prokaryotic Genomes, Douglas W. Raiford III Jan 2008

Browse all Theses and Dissertations

While genomic sequencing projects are an abundant source of information for biological studies ranging from the molecular to the ecological in scale, much of the information present may yet be hidden from casual analysis. One such information domain, trends in codon usage, can provide a wealth of information about an organism's genes and their expression. Degeneracy in the genetic code allows more than one triplet codon to code for the same amino acid, and usage of these codons is often biased such that one or more of these synonymous codons is preferred. Isolation of translational efficiency bias can have important …
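The idea of bias among synonymous codons can be made concrete with a small sketch. It computes, for each codon in a coding sequence, its share among the codons observed for the same amino acid under the standard genetic code. This is a generic relative-frequency measure offered as an assumption for illustration, not the dissertation's technique, and the name `synonymous_usage` is hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks stops.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def synonymous_usage(cds):
    """Fraction of each observed codon among codons for the same amino acid."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)
    totals = defaultdict(int)
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    return {codon: n / totals[CODON_TABLE[codon]] for codon, n in counts.items()}
```

A codon whose share sits well above an even split among its synonyms is a candidate preferred codon; separating translational efficiency bias from other sources of skew requires more than this raw count.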


Computational Methods For The Objective Review Of Forensic DNA Testing Results, Jason R. Gilder Jan 2007

Browse all Theses and Dissertations

Since the advent of criminal investigations, investigators have sought a "gold standard" for the evaluation of forensic evidence. Currently, deoxyribonucleic acid (DNA) technology is the most reliable method of identification. Short Tandem Repeat (STR) DNA genotyping has the potential for impressive match statistics, but the methodology is not infallible. The condition of an evidentiary sample and potential issues with the handling and testing of a sample can lead to significant issues with the interpretation of DNA testing results. Forensic DNA interpretation standards are determined by laboratory validation studies that often involve small sample sizes. This dissertation presents novel methodologies to address …