Open Access. Powered by Scholars. Published by Universities.®

Theory and Algorithms Commons

Life Sciences

Articles 1 - 30 of 74

Full-Text Articles in Theory and Algorithms

ReScape: Transforming Coral-Reefscape Images For Quantitative Analysis, Zachary Ferris, Eraldo Ribeiro, Tomofumi Nagata, Robert Van Woesik Apr 2024

Ocean Engineering and Marine Sciences Faculty Publications

Ever since the first image of a coral reef was captured in 1885, people worldwide have been accumulating images of coral reefscapes that document the historic conditions of reefs. However, these innumerable reefscape images suffer from perspective distortion, which reduces the apparent size of distant taxa, rendering the images unusable for quantitative analysis of reef conditions. Here we solve this century-long distortion problem by developing a novel computer-vision algorithm, ReScape, which removes the perspective distortion from reefscape images by transforming them into top-down views, making them usable for quantitative analysis of reef conditions. In doing so, we demonstrate the …


A Machine Learning Model Of Perturb-Seq Data For Use In Space Flight Gene Expression Profile Analysis, Liam F. Johnson, James Casaletto, Lauren Sanders, Sylvain Costes Mar 2024

Graduate Industrial Research Symposium

The genetic perturbations caused by spaceflight on biological systems tend to have a system-wide effect that is often difficult to deconvolute into individual signals with specific points of origin. Single cell multi-omic data can provide a profile of the perturbational effects, but does not necessarily indicate the initial point of interference within the network. The objective of this project is to take advantage of large scale and genome-wide perturbational datasets by using them to train a tuned machine learning model that is capable of predicting the effects of unseen perturbations in new data. Perturb-Seq datasets are large libraries of …


Online Class-Incremental Learning For Real-World Food Image Classification, Siddeshwar Raghavan, Jiangpeng He, Fengqing Zhu Mar 2024

Graduate Industrial Research Symposium

Food image classification is essential for monitoring health and tracking dietary intake in image-based dietary assessment methods. However, conventional systems often rely on static datasets with fixed classes and uniform distribution. In contrast, real-world food consumption patterns, shaped by cultural, economic, and personal influences, involve dynamic and evolving data. The classification system must therefore cope with continuously evolving data. Online Class Incremental Learning (OCIL) addresses the challenge of learning continuously from a single-pass data stream while adapting to new knowledge and reducing catastrophic forgetting. Experience Replay (ER) based OCIL methods store a small portion of previous data and …


Reducing Food Scarcity: The Benefits Of Urban Farming, S.A. Claudell, Emilio Mejia Dec 2023

Journal of Nonprofit Innovation

Urban farming can enhance the lives of communities and help reduce food scarcity. This paper presents a conceptual prototype of an efficient urban farming community that can be scaled for a single apartment building or an entire community across all global geoeconomic regions, including densely populated cities and rural, developing towns and communities. When deployed in coordination with smart crop choices, local farm support, and efficient transportation, the result isn’t just sustainability but also increased fresh produce accessibility, optimized nutritional value, elimination of ‘forever chemicals’, reduced transportation costs, and global environmental benefits.

Imagine Doris, who is …


Motif-Cluster: A Spatial Clustering Package For Repetitive Motif Binding Patterns, Mengyuan Zhou Nov 2023

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Previous efforts to use genome-wide analysis of transcription factor binding sites (TFBSs) have overlooked the importance of ranking potentially significant regulatory regions, especially those with repetitive binding within a local region. Identifying these homogeneous binding sites is critical because they have the potential to amplify the binding affinity and regulation activity of transcription factors, impacting gene expression and cellular functions. To address this issue, we developed an open-source tool, Motif-Cluster, that prioritizes and visualizes transcription factor regulatory regions by incorporating the idea of local motif clusters. Motif-Cluster can rank the significant transcription factor regulatory regions without the need for experimental …


Decoy-Target Database Strategy And False Discovery Rate Analysis For Glycan Identification, Xiaoou Li Jul 2023

Electronic Thesis and Dissertation Repository

In recent years, the technology of glycopeptide sequencing through MS/MS mass spectrometry data has achieved remarkable progress. Various software tools have been developed and widely used for protein identification. Estimation of the false discovery rate (FDR) has become an essential method for evaluating the performance of glycopeptide scoring algorithms. The target-decoy strategy, which involves constructing decoy databases, is currently the most widely used method for FDR calculation. In this study, we applied various decoy construction algorithms to generate decoy glycan databases and proposed a novel approach to calculate the FDR by using the EM algorithm and a mixture model.
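
For readers unfamiliar with the target-decoy strategy mentioned in this abstract, the following is a minimal sketch of the conventional estimate only: the number of decoy matches above a score threshold divided by the number of target matches above it. It does not reproduce the EM and mixture-model approach proposed in the thesis, and the function name, input layout, and example scores are hypothetical.

```python
def target_decoy_fdr(matches, threshold):
    """Estimate the FDR among target matches scoring >= threshold.

    matches: list of (score, is_decoy) pairs (hypothetical layout).
    Uses the simple ratio (#decoys / #targets) above the threshold,
    capped at 1.0, and returns 0.0 when no target passes.
    """
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    return min(1.0, decoys / targets)


# Illustrative, made-up scores: True marks a match against the decoy database.
example_matches = [
    (9.1, False), (8.7, False), (8.2, True),
    (7.9, False), (6.5, True), (6.0, False),
]
fdr_at_7 = target_decoy_fdr(example_matches, 7.0)
```

Raising the threshold typically trades fewer accepted identifications for a lower estimated FDR; other variants of the ratio exist, and the choice among them is outside the scope of this sketch.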


Improving Adjacency List Storage Methods For Polypeptide Similarity Analysis, Arianna Swensen Dec 2022

Honors Theses

Protein design is a complex biomolecular and computational problem. Working on increasingly large protein folding problems requires improving the analysis methods currently available. This work first discusses various methods of protein design, including de novo protein design, which is the primary focus of this thesis. Then, a new approach utilizing a B+ tree to effectively store and query a graph of keys and vertices is proposed in order to store the number of times two polypeptides are considered to be similar. This approach is found to reduce time complexity relative to current mapping methods and thus provides …


The Significance Of Sonic Branding To Strategically Stimulate Consumer Behavior: Content Analysis Of Four Interviews From Jeanna Isham’s “Sound In Marketing” Podcast, Ina Beilina May 2022

Student Theses and Dissertations

Purpose:
Sonic branding is not just about composing jingles like McDonald’s “I’m Lovin’ It.” Sonic branding is an industry that strategically designs a cohesive auditory component of a brand’s corporate identity. This paper examines the psychological impact of music and sound on consumer behavior, reviewing studies from the past 40 years, and investigates the significance of stimulating auditory perception by infusing sound into the consumer experience in the 2020s.

Design/methodology/approach:
Qualitative content analysis of audio media was used to test two hypotheses. Four archival oral interview recordings from Jeanna Isham’s podcast “Sound in Marketing” featuring the sonic branding experts …


Simulating Polistes Dominulus Nest-Building Heuristics With Deterministic And Markovian Properties, Benjamin Pottinger May 2022

Undergraduate Honors Theses

European Paper Wasps (Polistes dominula) are social insects that build round, symmetrical nests. Current models indicate that these wasps develop colonies by following simple heuristics based on nest stimuli. Computer simulations can model wasp behavior to imitate natural nest building. This research investigated various building heuristics through a novel Markov-based simulation. The simulation used a hexagonal grid to build cells based on the building rule supplied to the agent. Simulated nests were compared with natural nest data and assessed through visual inspection. Larger nests were found to be less compact for the rules simulated.


Subjective Information And Survival In A Simulated Biological System, Tyler S. Barker, Massimiliano Pierobon, Peter J. Thomas Apr 2022

School of Computing: Faculty Publications

Information transmission and storage have gained traction as unifying concepts to characterize biological systems and their chances of survival and evolution at multiple scales. Despite the potential for an information-based mathematical framework to offer new insights into life processes and ways to interact with and control them, the main legacy is Shannon’s, in which a purely syntactic characterization of information scores systems on the basis of their maximum information efficiency. Such metrics seem not entirely suitable for biological systems, where transmission and storage of different pieces of information (carrying different semantics) can result in different chances of survival. …


Three-Dimensional Graph Matching To Identify Secondary Structure Correspondence Of Medium-Resolution Cryo-EM Density Maps, Bahareh Behkamal, Mahmoud Naghibzadeh, Mohammad Reza Saberi, Zeinab Amiri Tehranizadeh, Andrea Pagnani, Kamal Al Nasr Nov 2021

Computer Science Faculty Research

Cryo-electron microscopy (cryo-EM) is a structural technique that has played a significant role in protein structure determination in recent years. Compared to the traditional methods of X-ray crystallography and NMR spectroscopy, cryo-EM is capable of producing images of much larger protein complexes. However, in some cases cryo-EM reconstructions are limited to medium resolution (~4–10 Å). At this resolution range, a cryo-EM density map can hardly be used to directly determine the structure of proteins at atomic-level resolution, or even at the level of their amino acid residue backbones. At such a resolution, only the position and orientation of secondary structure elements (SSEs) such …


Comparison Of Multiple Imputation Algorithms And Verification Using Whole-Genome Sequencing In The CMUH Genetic Biobank, Ting-Yuan Liu, Chih-Fan Lin, Hsing-Tsung Wu, Ya-Lun Wu, Yu-Chia Chen, Chi-Chou Liao, Yu-Pao Chou, Dysan Chao, Hsing-Fang Lu, Ya-Sian Chang, Jan-Gowth Chang, Kai-Cheng Hsu, Fuu‑Jen Tsai Nov 2021

BioMedicine

A genome-wide association study (GWAS) can be conducted to systematically analyze the contributions of genetic factors to a wide variety of complex diseases. Nevertheless, existing GWASs have provided highly ethnicity-specific data. Accordingly, to provide data specific to Taiwan, we established a large-scale genetic database in a single medical institution at the China Medical University Hospital. With current technological limitations, microarray analysis can detect only a limited number of single-nucleotide polymorphisms (SNPs) with a minor allele frequency of >1%. Imputation, however, represents a useful alternative means of expanding data. In this study, we compared four imputation algorithms in terms of …


TruncTrimmer: A First Step Towards Automating Standard Bioinformatic Analysis, Z. Gunner Lawless, Dana Dittoe, Dale R. Thompson, Steven C. Ricke May 2021

Computer Science and Computer Engineering Undergraduate Honors Theses

Bioinformatic analysis is a time-consuming process for labs performing research on various microbiomes. Researchers use tools like Qiime2 to help standardize the bioinformatic analysis methods, but even large, extensible platforms like Qiime2 have drawbacks due to the attention required by researchers. In this project, we propose to automate additional standard lab bioinformatic procedures by eliminating the existing manual process of determining the trim and truncate locations for paired end 2 sequences. We introduce a new Qiime2 plugin called TruncTrimmer to automate the process that usually requires the researcher to make a decision on where to trim and truncate manually after …


Extending Import Detection Algorithms For Concept Import From Two To Three Biomedical Terminologies, Vipina K. Keloth, James Geller, Yan Chen, Julia Xu Dec 2020

Publications and Research

Background: While enrichment of terminologies can be achieved in different ways, filling gaps in the IS-A hierarchy backbone of a terminology appears especially promising. To avoid difficult manual inspection, we started a research program in 2014, investigating terminology densities, where the comparison of terminologies leads to the algorithmic discovery of potentially missing concepts in a target terminology. While candidate concepts have to be approved for import by an expert, the human effort is greatly reduced by algorithmic generation of candidates. In previous studies, a single source terminology was used with one target terminology.

Methods: In this paper, we are extending …


New Methods For Deep Learning Based Real-Valued Inter-Residue Distance Prediction, Jacob Barger Nov 2020

Theses

Background: Much of the recent success in protein structure prediction has been a result of accurate protein contact prediction--a binary classification problem. Dozens of methods, built from various types of machine learning and deep learning algorithms, have been published over the last two decades for predicting contacts. Recently, many groups, including Google DeepMind, have demonstrated that reformulating the problem as a multi-class classification problem is a more promising direction to pursue. As an alternative approach, we recently proposed real-valued distance predictions, formulating the problem as a regression problem. The nuances of protein 3D structures make this formulation appropriate, allowing predictions …


Cylindrical Similarity Measurement For Helices In Medium-Resolution Cryo-Electron Microscopy Density Maps, Salim Sazzed, Peter Scheible, Maytha Alshammari, Willy Wriggers, Jing He Apr 2020

College of Sciences Posters

Cryo-electron microscopy (cryo-EM) density maps at medium resolution (5–10 Å) reveal secondary structural features such as α-helices and β-sheets, but they lack the side-chain details that would enable a direct structure determination. Among the more than 800 entries in the Electron Microscopy Data Bank (EMDB) of medium-resolution density maps that are associated with atomic models, a wide variety of similarities can be observed between maps and models. To validate such atomic models and to classify structural features, a local similarity criterion, the F1 score, is proposed and evaluated in this study. The F1 score is theoretically normalized to a …


Function And Dissipation In Finite State Automata - From Computing To Intelligence And Back, Natesh Ganesh Oct 2019

Doctoral Dissertations

Society has benefited from the technological revolution and the tremendous growth in computing powered by Moore's law. However, we are fast approaching the ultimate physical limits in terms of both device sizes and the associated energy dissipation. It is important to characterize these limits in a physically grounded and implementation-agnostic manner, in order to capture the fundamental energy dissipation costs associated with performing computing operations with classical information in nano-scale quantum systems. It is also necessary to identify and understand the effect of quantum indistinguishability, noise, and device variability on these dissipation limits. Identifying these parameters is crucial to designing …


Protein Inter-Residue Distance Prediction Using Residual And Capsule Networks, Andrew Dillon Oct 2019

Theses

The protein folding problem, also known as protein structure prediction, is the task of building three-dimensional protein models given their one-dimensional amino acid sequence. New methods that have been successfully used in the most recent CASP challenge have demonstrated that predicting a protein's inter-residue distances is key to solving this problem. Various deep learning algorithms including fully convolutional neural networks and residual networks have been developed to solve the distance prediction problem. In this work, we develop a hybrid method based on residual networks and capsule networks. We demonstrate that our method can predict distances more accurately than the algorithms …


Texture-Based Deep Neural Network For Histopathology Cancer Whole Slide Image (WSI) Classification, Nelson Zange Tsaku Aug 2019

Master of Science in Computer Science Theses

Automatic histopathological Whole Slide Image (WSI) analysis for cancer classification has been highlighted along with the advancements in microscopic imaging techniques. However, manual examination and diagnosis with WSIs is time-consuming and tiresome. Recently, deep convolutional neural networks have succeeded in histopathological image analysis. In this paper, we propose a novel cancer texture-based deep neural network (CAT-Net) that learns scalable texture features from histopathological WSIs. The innovation of CAT-Net is twofold: (1) capturing invariant spatial patterns by dilated convolutional layers and (2) reducing model complexity while improving performance. Moreover, CAT-Net can provide discriminative texture patterns formed on cancerous regions of histopathological …


Statistical And Machine Learning Methods Evaluated For Incorporating Soil And Weather Into Corn Nitrogen Recommendations, Curtis J. Ransom, Newell R. Kitchen, James J. Camberato, Paul R. Carter, Richard B. Ferguson, Fabián G. Fernández, David W. Franzen, Carrie A. M. Laboski, D. Brenton Myers, Emerson D. Nafziger, John E. Sawyer, John F. Shanahan Aug 2019

John E. Sawyer

Nitrogen (N) fertilizer recommendation tools could be improved for estimating corn (Zea mays L.) N needs by incorporating site-specific soil and weather information. However, an evaluation of analytical methods is needed to determine the success of incorporating this information. The objectives of this research were to evaluate statistical and machine learning (ML) algorithms for utilizing soil and weather information for improving corn N recommendation tools. Eight algorithms [stepwise, ridge regression, least absolute shrinkage and selection operator (Lasso), elastic net regression, principal component regression (PCR), partial least squares regression (PLSR), decision tree, and random forest] were evaluated using a dataset …


GOGO: An Improved Algorithm To Measure The Semantic Similarity Between Gene Ontology Terms, Chenguang Zhao May 2019

Master's Theses

Measuring the semantic similarity between Gene Ontology (GO) terms is an essential step in functional bioinformatics research. We implemented a software tool named GOGO for calculating the semantic similarity between GO terms. GOGO has the advantages of both information-content-based and hybrid methods, such as Resnik’s and Wang’s methods. Moreover, GOGO is relatively fast and does not need to calculate information content (IC) from a large gene annotation corpus but still has the advantage of using IC. This is achieved by considering the number of child nodes in the GO directed acyclic graphs when calculating the semantic contribution of an ancestor node …


Computational Analysis Of Large-Scale Trends And Dynamics In Eukaryotic Protein Family Evolution, Joseph Boehm Ahrens Mar 2019

FIU Electronic Theses and Dissertations

The myriad protein-coding genes found in present-day eukaryotes arose from a combination of speciation and gene duplication events, spanning more than one billion years of evolution. Notably, as these proteins evolved, the individual residues at each site in their amino acid sequences were replaced at markedly different rates. The relationship between protein structure, protein function, and site-specific rates of amino acid replacement is a topic of ongoing research. Additionally, there is much interest in the different evolutionary constraints imposed on sequences related by speciation (orthologs) versus sequences related by gene duplication (paralogs). A principal aim of this dissertation is to …


A Theoretical Model Of Underground Dipole Antennas For Communications In Internet Of Underground Things, Abdul Salam, Mehmet C. Vuran, Xin Dong, Christos Argyropoulos, Suat Irmak Feb 2019

Faculty Publications

The realization of the Internet of Underground Things (IOUT) relies on the establishment of reliable communication links, where the antenna becomes a major design component due to the significant impacts of soil. In this paper, a theoretical model is developed to capture the impacts of change of soil moisture on the return loss, resonant frequency, and bandwidth of a buried dipole antenna. Experiments are conducted in silty clay loam, sandy, and silt loam soil, to characterize the effects of soil, in an indoor testbed and field testbeds. It is shown that at subsurface burial depths (0.1–0.4 m), change in soil moisture impacts …


Estimating Waterbird Abundance On Catfish Aquaculture Ponds Using An Unmanned Aerial System, Paul C. Burr, Sathishkumar Samiappan, Lee A. Hathcock, Robert J. Moorhead, Brian S. Dorr Jan 2019

Human–Wildlife Interactions

In this study, we examined the use of an unmanned aerial system (UAS) to monitor fish-eating birds on catfish (Ictalurus spp.) aquaculture facilities in Mississippi, USA. We tested 2 automated computer algorithms to identify bird species using mosaicked imagery taken from a UAS platform. One algorithm identified birds based on color alone (color segmentation), and the other algorithm used shape recognition (template matching), and the results of each algorithm were compared directly to manual counts of the same imagery. We captured digital imagery of great egrets (Ardea alba), great blue herons (A. herodias), …


Bayesian Analytical Approaches For Metabolomics : A Novel Method For Molecular Structure-Informed Metabolite Interaction Modeling, A Novel Diagnostic Model For Differentiating Myocardial Infarction Type, And Approaches For Compound Identification Given Mass Spectrometry Data., Patrick J. Trainor Aug 2018

Electronic Theses and Dissertations

Metabolomics, the study of small molecules in biological systems, has enjoyed great success in enabling researchers to examine disease-associated metabolic dysregulation and has been utilized for the discovery of biomarkers of disease and phenotypic states. In spite of recent technological advances in the analytical platforms utilized in metabolomics and the proliferation of tools for the analysis of metabolomics data, significant challenges in metabolomics data analyses remain. In this dissertation, we present three of these challenges and Bayesian methodological solutions for each. In the first part we develop a new methodology to serve as a basis for making higher order inferences in metabolomics, …


Computer Vision Evidence Supporting Craniometric Alignment Of Rat Brain Atlases To Streamline Expert-Guided, First-Order Migration Of Hypothalamic Spatial Datasets Related To Behavioral Control, Arshad M. Khan, Jose G. Perez, Claire E. Wells, Olac Fuentes Apr 2018

Arshad M. Khan, Ph.D.

The rat has arguably the most widely studied brain among all animals, with numerous reference atlases for rat brain having been published since 1946. For example, many neuroscientists have used the atlases of Paxinos and Watson (PW, first published in 1982) or Swanson (S, first published in 1992) as guides to probe or map specific rat brain structures and their connections. Despite nearly three decades of contemporaneous publication, no independent attempt has been made to establish a basic framework that allows data mapped in PW to be placed in register with S, or vice versa. …


Efficient Reduced Bias Genetic Algorithm For Generic Community Detection Objectives, Aditya Karnam Gururaj Rao Apr 2018

Theses

The problem of community structure identification has been an extensively investigated area for biology, physics, social sciences, and computer science in recent years for studying the properties of networks representing complex relationships. Most traditional methods, such as K-means and hierarchical clustering, are based on the assumption that communities have spherical configurations. Lately, Genetic Algorithms (GA) are being utilized for efficient community detection without imposing sphericity. GAs are machine learning methods which mimic natural selection and scale with the complexity of the network. However, traditional GA approaches employ a representation method that dramatically increases the solution space to be searched by …


Similarity Based Classification Of ADHD Using Singular Value Decomposition, Taban Eslami, Fahad Saeed Apr 2018

Parallel Computing and Data Science Lab Technical Reports

Attention deficit hyperactivity disorder (ADHD) is one of the most common brain disorders among children. This disorder is considered a major threat to public health and causes attention, focus, and organizational difficulties for children and even adults. Since the cause of ADHD is not yet known, data mining algorithms are being used to help discover patterns which discriminate healthy from ADHD subjects. Numerous efforts are underway with the goal of developing classification tools for ADHD diagnosis based on functional and structural magnetic resonance imaging data of the brain. In this paper, we used Eros, which is a technique for …


Fast And Space-Efficient Location Of Heavy Or Dense Segments In Run-Length Encoded Sequences, Ronald I. Greenberg Jan 2018

Ronald Greenberg

This paper considers several variations of an optimization problem with potential applications in such areas as biomolecular sequence analysis and image processing. Given a sequence of items, each with a weight and a length, the goal is to find a subsequence of consecutive items of optimal value, where value is either total weight or total weight divided by total length. There may also be a specified lower and/or upper bound on the acceptable length of subsequences. This paper shows that all the variations of the problem are solvable in linear time and space even with non-uniform item lengths and divisible …
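
As an illustration of the simplest variant described in this abstract (every item has unit length, value is total weight, and no length bounds apply), the sketch below uses the textbook single-pass scan often called Kadane's algorithm. It is not the paper's method for run-length encoded sequences, density objectives, or bounded lengths, and the function name is hypothetical.

```python
def heaviest_segment(weights):
    """Return (best_sum, start, end) for the non-empty run of consecutive
    items with maximum total weight; `end` is exclusive.

    Runs in linear time with constant extra space.
    """
    if not weights:
        raise ValueError("weights must be non-empty")
    best_sum = cur_sum = weights[0]
    best_start, best_end = 0, 1
    cur_start = 0
    for i in range(1, len(weights)):
        if cur_sum < 0:
            # A negative running total can only hurt: restart at item i.
            cur_sum = weights[i]
            cur_start = i
        else:
            cur_sum += weights[i]
        if cur_sum > best_sum:
            best_sum = cur_sum
            best_start, best_end = cur_start, i + 1
    return best_sum, best_start, best_end


result = heaviest_segment([2, -3, 4, -1, 2, -5, 3])
```

For the example list, the scan settles on the segment covering indices 2 through 4 (weights 4, -1, 2) with total weight 5.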


Ultra-Fast And Memory-Efficient Lookups For Cloud, Networked Systems, And Massive Data Management, Ye Yu Jan 2018

Theses and Dissertations--Computer Science

Systems that process big data (e.g., high-traffic networks and large-scale storage) prefer data structures and algorithms with small memory and fast processing speed. Efficient and fast algorithms play an essential role in system design, despite the improvement of hardware. This dissertation is organized around a novel algorithm called Othello Hashing. Othello Hashing supports ultra-fast and memory-efficient key-value lookup, and it fits the requirements of the core algorithms of many large-scale systems and big data applications. Using Othello hashing, combined with domain expertise in cloud, computer networks, big data, and bioinformatics, I developed the following applications that resolve several major …