Open Access. Powered by Scholars. Published by Universities.®
Articles 1 - 18 of 18
Full-Text Articles in Engineering
Novel Natural Language Processing Models For Medical Terms And Symptoms Detection In Twitter, Farahnaz Golrooy Motlagh
Browse all Theses and Dissertations
This dissertation focuses on disambiguation of language use on Twitter about drug use, consumption types of drugs, drug legalization, ontology-enhanced approaches, and data-driven prediction analysis by developing novel NLP models. Three technical aims comprise this work: (a) leveraging pattern recognition techniques to improve the quality and quantity of crawled Twitter posts related to drug abuse; (b) using an expert-curated, domain-specific DsOn ontology model that improves knowledge extraction in the form of drug-to-symptom and drug-to-side effect relations; and (c) modeling the prediction of public perception of the drug’s legalization and the sentiment analysis of drug consumption on Twitter. We collected …
Using Deep Learning To Analyze Materials In Medical Images, Carson Molder
Computer Science and Computer Engineering Undergraduate Honors Theses
Modern deep learning architectures have become increasingly popular in medicine, especially for analyzing medical images. In some medical applications, deep learning image analysis models have been more accurate at predicting medical conditions than experts. Deep learning has also been effective for material analysis on photographs. We aim to leverage deep learning to perform material analysis on medical images. Because material datasets for medicine are scarce, we first introduce a texture dataset generation algorithm that automatically samples desired textures from annotated or unannotated medical images. Second, we use a novel Siamese neural network called D-CNN to predict patch similarity and build …
Formal Concept Analysis Applications In Bioinformatics, Sarah Roscoe
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
Bioinformatics is an important field that seeks to solve biological problems with the help of computation. One specific field in bioinformatics is that of genomics, the study of genes and their functions. Genomics can provide valuable analysis of how genes interact with their environment. One way to measure this interaction is through gene expression data, which determines whether (and how much) a certain gene activates in a situation. Analyzing this data can be critical for predicting diseases or other biological reactions. One method used for analysis is Formal Concept Analysis (FCA), a computing technique based …
Application Of Cosine Similarity In Bioinformatics, Srikanth Maturu
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
Finding similar sequences to an input query sequence (DNA or proteins) from a sequence data set is an important problem in bioinformatics. It gives researchers an intuition of what could be related or how the search space can be reduced for further tasks. An exact brute-force nearest-neighbor algorithm used for this task has complexity O(m * n) where n is the database size and m is the query size. Such an algorithm faces time-complexity issues as the database and query sizes increase. Furthermore, the use of alignment-based similarity measures such as minimum edit distance adds additional complexity to the …
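As a rough illustration of the brute-force search this abstract describes, the sketch below scores a query against every database sequence with cosine similarity. The k-mer count representation, the choice of k = 3, and the function names are assumptions made for this example; the abstract does not say how the thesis vectorizes sequences.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count overlapping k-mers in a sequence (assumed representation)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[kmer] for kmer, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a sequence shorter than k has no k-mers
    return dot / (norm_a * norm_b)


def nearest_neighbor(query, database, k=3):
    """Brute-force scan: compare the query with every database entry."""
    q = kmer_counts(query, k)
    return max(database, key=lambda s: cosine_similarity(q, kmer_counts(s, k)))
```

Every query still visits every database entry, so the scan grows with the database size, which is the scaling problem the abstract points to.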
Integrative Pathway Analysis Pipeline For miRNA And mRNA Data, Diana Mabel Diaz Herrera
Wayne State University Theses
The identification of pathways that are involved in a particular phenotype helps us understand the underlying biological processes. Traditional pathway analysis techniques aim to infer the impact on individual pathways using only mRNA levels. However, recent studies showed that gene expression alone is unable to capture the whole picture of biological phenomena. At the same time, microRNAs (miRNAs) are newly discovered gene regulators that have been shown to play an important role in the diagnosis and prognosis of different types of diseases. Current pathway analysis techniques do not take miRNAs into consideration. In this project, we investigate the effect of integrating miRNA …
Using Latent Semantic Analysis For Automated Keyword Extraction From Large Document Corpora, Tuğba Önal Süzek
Turkish Journal of Electrical Engineering and Computer Sciences
In this study, we describe a keyword extraction technique that uses latent semantic analysis (LSA) to identify semantically important single topic words or keywords. We compare our method against two other automated keyword extractors, Tf-idf (term frequency-inverse document frequency) and Metamap, using human-annotated keywords as a reference. Our results suggest that the LSA-based keyword extraction method performs comparably to the other techniques. Therefore, in an incremental update setting, the LSA-based method may be preferable to existing keyword extraction methods for extracting keywords from text descriptions in big data.
Protein Fold Classification With Grow-And-Learn Network, Özlem Polat, Zümray Dokur
Turkish Journal of Electrical Engineering and Computer Sciences
Protein fold classification is an important subject in computational biology and a compelling task from the point of view of machine learning. To deal with such a challenging problem, in this study, we propose a solution method for the classification of protein folds using a Grow-and-Learn (GAL) neural network together with the one-versus-others (OvO) method. To classify the most common 27 protein folds, 125-dimensional data, constituted by the physicochemical properties of amino acids, are used. The study is conducted on a database including 694 proteins: 311 of these proteins are used for training and 383 of them for testing. Overall, the classification system …
Novel Dynamic Partial Reconfiguration Implementations Of The Support Vector Machine Classifier On FPGA, Hanaa Hussain, Khaled Benkrid, Hüseyin Şeker
Turkish Journal of Electrical Engineering and Computer Sciences
The support vector machine (SVM) is one of the highly powerful classifiers that have been shown to be capable of dealing with high-dimensional data. However, its complexity increases requirements of computational power. Recent technologies including the postgenome data of high-dimensional nature add further complexity to the construction of SVM classifiers. In order to overcome this problem, hardware implementations of the SVM classifier have been proposed to benefit from parallelism to accelerate the SVM. On the other hand, those implementations offer limited flexibility in terms of changing parameters and require the reconfiguration of the whole device. The latter interrupts the operation …
Evaluation Of The Signature Molecular Descriptor With BLOSUM62 And An All-Atom Description For Use In Sequence Alignment Of Proteins, Lindsay M. Aichinger
Williams Honors College, Honors Research Projects
This Honors Project focused on a few aspects of this topic. The second is comparing the molecular signature kernels to three of the BLOSUM matrices (30, 62, and 90) to test the accuracy of the mathematical model. The kernel matrix was manipulated in order to improve the relationship by focusing on side groups and also by changing how the structure was represented in the matrix by increasing the initial height distance from the central atom (Height 1 and Height 2 included).
There were multiple design constraints for this project. The first was the comparison with the BLOSUM matrices (30, 62, …
ReGen: Optimizing Genetic Selection Algorithms For Heterogeneous Computing, Scott Kenneth Swinkleb Winkleblack
Master's Theses
GenSel is a genetic selection analysis tool used to determine which genetic markers are informational for a given trait. Performing genetic selection related analyses is a time consuming and computationally expensive task. Due to an expected increase in the number of genotyped individuals, analysis times will increase dramatically. Therefore, optimization efforts must be made to keep analysis times reasonable.
This thesis focuses on optimizing one of GenSel’s underlying algorithms for heterogeneous computing. The resulting algorithm exposes task-level parallelism and data-level parallelism present but inaccessible in the original algorithm. The heterogeneous computing solution, ReGen, outperforms the optimized CPU implementation achieving a …
Computational Methods For Comparative Non-Coding RNA Analysis: From Structural Motif Identification To Genome-Wide Functional Classification, Cuncong Zhong
Electronic Theses and Dissertations
Recent advances in biological research point out that many ribonucleic acids (RNAs) are transcribed from the genome to perform a variety of cellular functions, rather than merely acting as information carriers for protein synthesis. These RNAs are usually referred to as the non-coding RNAs (ncRNAs). The versatile regulation mechanisms and functionalities of the ncRNAs contribute to the amazing complexity of the biological system. The ncRNAs perform their biological functions by folding into specific structures. In this case, the comparative study of the ncRNA structures is key to the inference of their molecular and cellular functions. We are especially interested in …
System Designs To Perform Bioinformatics Sequence Alignment, Çağlar Yilmaz, Mustafa Gök
Turkish Journal of Electrical Engineering and Computer Sciences
The emerging field of bioinformatics uses computing as a tool to understand biology. Biological data of organisms (nucleotide and amino acid sequences) are stored in databases that contain billions of records. In order to process the vast amount of data in a reasonable time, high-performance analysis systems are developed. The main operation shared by the analysis tools is the search for matching patterns between sequences of data (sequence alignment). In this paper, we present 2 systems that can perform pairwise and multiple sequence alignment operations. Through the optimized design methods, proposed systems achieve up to 3.6 times more performance compared …
An Automated Signal Alignment Algorithm Based On Dynamic Time Warping For Capillary Electrophoresis Data, Fethullah Karabiber
Turkish Journal of Electrical Engineering and Computer Sciences
Correcting the retention time variation and measuring the similarity of time series are among the most common challenges in analyzing capillary electrophoresis (CE) data. In this study, an automated signal alignment method is proposed by modifying the dynamic time warping (DTW) approach to align the time-series data. Preprocessing tools and further optimizations were developed to increase the performance of the algorithm. As a demonstrative case study, the developed algorithm is applied to the analysis of CE data from a selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) evaluation of the RNA secondary structure. The time-shift problem …
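For readers unfamiliar with dynamic time warping, the following is a minimal sketch of the textbook DTW recurrence, not the modified method proposed in the paper. The absolute-difference local cost and the function name are assumptions made for this example.

```python
def dtw_distance(a, b):
    """Textbook DTW cost between two numeric sequences.

    cost[i][j] holds the cheapest cumulative cost of aligning the first
    i samples of a with the first j samples of b.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matches an earlier sample of b
                cost[i][j - 1],      # b[j-1] also matches an earlier sample of a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]
```

Because one sample may be matched to several samples of the other signal, a peak that arrives slightly late in one trace can still be aligned with the same peak in the other at no extra cost.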
The Gel Documentation System: A Cornerstone To The Implementation Of The Introduction To Biotechnology And Introduction To Bioinformatics Cross-Disciplinary Course Series, Marcy Kelly, Gregory Lampard, Constance Knapp
Cornerstone 3 Reports : Interdisciplinary Informatics
No abstract provided.
New Computational Approaches For Multiple RNA Alignment And RNA Search, Daniel Deblasio
Electronic Theses and Dissertations
In this thesis we explore the theory and history behind RNA alignment. Normal sequence alignments as studied by computer scientists can be completed in O(n^2) time in the naive case. The process involves taking two input sequences and finding the list of edits that can transform one sequence into the other. This process is applied to biology in many forms, such as the creation of multiple alignments and the search of genomic sequences. When you take into account the RNA sequence structure, the problem becomes even harder. Multiple RNA structure alignment is particularly challenging because covarying mutations make sequence …
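The quadratic-time alignment mentioned in this abstract can be illustrated with the classic edit-distance dynamic program. This is a generic sketch with unit costs for insertion, deletion, and substitution (an assumption; the thesis may use other scoring), and it returns only the number of edits rather than the edit list itself.

```python
def edit_distance(s, t):
    """Minimum number of single-character edits turning s into t.

    Fills one row of the dynamic-programming table at a time, so the
    running time is proportional to len(s) * len(t).
    """
    prev = list(range(len(t) + 1))  # cost of building each prefix of t from ""
    for i, cs in enumerate(s, start=1):
        curr = [i]  # cost of deleting the first i characters of s
        for j, ct in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1,               # delete cs
                curr[j - 1] + 1,           # insert ct
                prev[j - 1] + (cs != ct),  # match or substitute
            ))
        prev = curr
    return prev[-1]
```

For example, `edit_distance("GCAUGC", "GAUGC")` is 1, since deleting the first C is enough.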
Improving Remote Homology Detection Using A Sequence Property Approach, Gina Marie Cooper
Browse all Theses and Dissertations
Understanding the structure and function of proteins is a key part of understanding biological systems. Although proteins are complex biological macromolecules, they are made up of only 20 basic building blocks known as amino acids. The makeup of a protein can be described as a sequence of amino acids. One of the most important tools in modern bioinformatics is the ability to search for biological sequences (such as protein sequences) that are similar to a given query sequence. There are many tools for doing this (Altschul et al., 1990, Hobohm and Sander, 1995, Thomson et al., 1994, Karplus and Barrett, …
Algorithmic Techniques Employed In The Isolation Of Codon Usage Biases In Prokaryotic Genomes, Douglas W. Raiford III
Browse all Theses and Dissertations
While genomic sequencing projects are an abundant source of information for biological studies ranging from the molecular to the ecological in scale, much of the information present may yet be hidden from casual analysis. One such information domain, trends in codon usage, can provide a wealth of information about an organism's genes and their expression. Degeneracy in the genetic code allows more than one triplet codon to code for the same amino acid, and usage of these codons is often biased such that one or more of these synonymous codons is preferred. Isolation of translational efficiency bias can have important …
Computational Methods For The Objective Review Of Forensic DNA Testing Results, Jason R. Gilder
Browse all Theses and Dissertations
Since the advent of criminal investigations, investigators have sought a "gold standard" for the evaluation of forensic evidence. Currently, deoxyribonucleic acid (DNA) technology is the most reliable method of identification. Short Tandem Repeat (STR) DNA genotyping has the potential for impressive match statistics, but the methodology is not infallible. The condition of an evidentiary sample and potential issues with the handling and testing of a sample can lead to significant issues with the interpretation of DNA testing results. Forensic DNA interpretation standards are determined by laboratory validation studies that often involve small sample sizes. This dissertation presents novel methodologies to address …