Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Physical Sciences and Mathematics

Western Michigan University

Theses/Dissertations

Big Data

Full-Text Articles in Entire DC Network

Toward Self-Reconfigurable Parametric Systems: Reinforcement Learning Approach, Ting-Yu Mu Dec 2019

Dissertations

To advance the fields of Information Technology (IT) and Computer Science, machine learning-based approaches are used in various ways to solve problems in the Nondeterministic Polynomial time (NP)-hard complexity class, or to approximate their solutions when no efficient exact method is known. Determining the set of reconfigurable parameters that gives a parametric system near-optimal performance is typically classified as an NP-hard problem, with no efficient mathematical model for obtaining the best solution. This body of work aims to advance the knowledge of machine learning …


A Holistic Computational Approach To Boosting The Performance Of Protein Search Engines, Majdi Ahmad Mosa Maabreh Apr 2018

Dissertations

Despite the availability of several protein search engines, the increasing volume of MS/MS data and the growing size of databases call for more efficient data analysis and reduction methods. Improving the accuracy and performance of protein identification is a main goal of the proteomics research community. In this research, a holistic solution for improving search performance is developed.

Most current search engines apply the SEQUEST style of searching protein databases to define MS/MS spectra. SEQUEST involves three main phases: (i) indexing the protein databases, (ii) matching and ranking the MS/MS spectra, and (iii) filtering the matches and reporting the final proteins. …


A Hybrid Parallel Approach To High-Performance Compression Of Big Genomic Files And In Compresso Data Processing, Sandino N. Vargas Pérez Dec 2017

Dissertations

Due to the rapid development of high-throughput, low-cost Next-Generation Sequencing, genomic file transmission and storage are now among the many Big Data challenges in computer science. Highly specialized compression techniques have been devised to tackle this issue, but sequential data compression has become increasingly inefficient, and existing parallel algorithms suffer from poor scalability. Even the best available solutions can take hours to compress gigabytes of data, making the use of these techniques for large-scale genomics prohibitively expensive in terms of time and space complexity.

This dissertation responds to the aforementioned problem by presenting a novel hybrid parallel approach …