Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 18 of 18

Full-Text Articles in Physical Sciences and Mathematics

Hite: A Fast And Accurate Dynamic Boundary Adjustment Approach For Full-Length Transposable Element Detection And Annotation, Kang Hu, Peng Ning, Minghua Xu, You Zou, Jianye Chang, Xin Gao, Yaohang Li, Jue Ruan, Bin Hu, Jianxin Wang Jan 2024

Computer Science Faculty Publications

Recent advancements in genome assembly have greatly improved the prospects for comprehensive annotation of Transposable Elements (TEs). However, existing methods for TE annotation using genome assemblies suffer from limited accuracy and robustness, requiring extensive manual editing. In addition, the currently available gold-standard TE databases are not comprehensive, even for extensively studied species, highlighting the critical need for an automated TE detection method to supplement existing repositories. In this study, we introduce HiTE, a fast and accurate dynamic boundary adjustment approach designed to detect full-length TEs. The experimental results demonstrate that HiTE outperforms RepeatModeler2, the state-of-the-art tool, across various species. Furthermore, …


A Machine Learning Framework For Identifying Molecular Biomarkers From Transcriptomic Cancer Data, Md Abdullah Al Mamun Mar 2022

FIU Electronic Theses and Dissertations

Cancer is a complex molecular process due to abnormal changes in the genome, such as mutation and copy number variation, and epigenetic aberrations such as dysregulations of long non-coding RNA (lncRNA). These abnormal changes are reflected in the transcriptome by turning oncogenes on and tumor suppressor genes off, which are considered cancer biomarkers.

However, transcriptomic data is high dimensional, and finding the best subset of genes (features) related to causing cancer is computationally challenging and expensive. Thus, developing a feature selection framework to discover molecular biomarkers for cancer is critical.

Traditional approaches for biomarker discovery calculate the fold change for each …


A Novel Method For Detecting Morphologically Similar Crops And Weeds Based On The Combination Of Contour Masks And Filtered Local Binary Pattern Operators, Vi Nguyen Thanh Le, Selam Ahderom, Beniamin Apopei, Kamal Alameh Jan 2020

Research outputs 2014 to 2021

Background: Weeds are a major cause of low agricultural productivity. Some weeds have morphological features similar to crops, making them difficult to discriminate. Results: We propose a novel method using a combination of filtered features extracted by combined Local Binary Pattern operators and features extracted by plant-leaf contour masks to improve the discrimination rate between broadleaf plants. Opening and closing morphological operators were applied to filter noise in plant images. The images at 4 stages of growth were collected using a testbed system. Mask-based local binary pattern features were combined with filtered features and a coefficient k. The classification of …


Coral Reef Change Detection In Remote Pacific Islands Using Support Vector Machine Classifiers, Justin J. Gapper, Hesham El-Askary, Erik Linstead, Thomas Piechota Jun 2019

Mathematics, Physics, and Computer Science Faculty Articles and Research

Despite the abundance of research on coral reef change detection, few studies have been conducted to assess the spatial generalization principles of a live coral cover classifier trained using remote sensing data from multiple locations. The aim of this study is to develop a machine learning classifier for coral dominated benthic cover-type class (CDBCTC) based on ground truth observations and Landsat images, evaluate the performance of this classifier when tested against new data, then deploy the classifier to perform CDBCTC change analysis of multiple locations. The proposed framework includes image calibration, support vector machine (SVM) training and tuning, statistical assessment …


Effective Plant Discrimination Based On The Combination Of Local Binary Pattern Operators And Multiclass Support Vector Machine Methods, Vi N T Le, Beniamin Apopei, Kamal Alameh Jan 2019

Research outputs 2014 to 2021

Accurate crop and weed discrimination plays a critical role in addressing the challenges of weed management in agriculture. The use of herbicides is currently the most common approach to weed control. However, herbicide resistant plants have long been recognised as a major concern due to the excessive use of herbicides. Effective weed detection techniques can reduce the cost of weed management and improve crop quality and yield. A computationally efficient and robust plant classification algorithm is developed and applied to the classification of three crops: Brassica napus (canola), Zea mays (maize/corn), and radish. The developed algorithm is based on the …


Evaluation Of Spatial Generalization Characteristics Of A Robust Classifier As Applied To Coral Reef Habitats In Remote Islands Of The Pacific Ocean, Justin J. Gapper, Hesham El-Askary, Erik J. Linstead, Thomas Piechota Nov 2018

Mathematics, Physics, and Computer Science Faculty Articles and Research

This study was an evaluation of the spectral signature generalization properties of coral across four remote Pacific Ocean reefs. The sites under consideration have not been the subject of previous studies for coral classification using remote sensing data. Previous research regarding using remote sensing to identify reefs has been limited to in-situ assessment, with some researchers also performing temporal analysis of a selected area of interest. This study expanded the previous in-situ analyses by evaluating the ability of a basic predictor, Linear Discriminant Analysis (LDA), trained on Depth Invariant Indices calculated from the spectral signature of coral in one location …


Assessment Of The Ponderosa Woodlands In Nebraska's Wildcat Hills: Implications For Juniperus Encroachment And Management, Allie Victoria Schiltmeyer Jul 2018

School of Natural Resources: Dissertations, Theses, and Student Research

Ponderosa pine (Pinus ponderosa) is a dominant tree species across western North America. Its eastern distribution includes three populations in western Nebraska. This study assesses the distribution, structure and age of ponderosa pine woodlands in one of those regions, the Wildcat Hills. The Wildcat Hills have escaped severe wildfires seen in recent decades in other ponderosa pine regions. Nevertheless, the Wildcat Hills woodlands face multiple threats including climate change, wildfire, drought, pine beetles, and invasive species. Key to these threats is the stand structure of pine woodlands, which have increased in density across much of ponderosa pine’s range. …


A Novel Approach For Classifying Gene Expression Data Using Topic Modeling, Soon Jye Kho, Himi Yalamanchili, Michael L. Raymer, Amit Sheth Jan 2017

Kno.e.sis Publications

Understanding the role of differential gene expression in cancer etiology and cellular process is a complex problem that continues to pose a challenge due to the sheer number of genes and inter-related biological processes involved. In this paper, we employ an unsupervised topic model, Latent Dirichlet Allocation (LDA), to mitigate overfitting of high-dimensionality gene expression data and to facilitate understanding of the associated pathways. LDA has been recently applied for clustering and exploring genomic data but not for classification and prediction. Here, we proposed to use LDA in clustering as well as in classification of cancer and healthy tissues using lung cancer …


Classification And Visualization Of Neural Patterns Using Subspace Analysis Statistical Methods, Jun Xia, Marius Osan, Emilia Titan, Riana Nicolae, Remus Osan Jan 2012

Neuroscience Institute Faculty Publications

The size and complexity of neural data is increasing at a dramatic pace due to rapid advances in experimental technologies. As a result, the data analysis techniques are shifting their focus from single-units to neural populations. The goal is to investigate complex temporal and spatial patterns, as well as to present the results in an intuitive way, allowing for detection and monitoring of relevant neural patterns.


Cellulose- And Xylan-Degrading Thermophilic Anaerobic Bacteria From Biocompost, M. V. Sizova, J. A. Izquierdo, N. S. Panikov, L. R. Lynd Feb 2011

Dartmouth Scholarship

Nine thermophilic cellulolytic clostridial isolates and four other noncellulolytic bacterial isolates were isolated from self-heated biocompost via preliminary enrichment culture on microcrystalline cellulose. All cellulolytic isolates grew vigorously on cellulose, with the formation of either ethanol and acetate or acetate and formate as principal fermentation products as well as lactate and glycerol as minor products. In addition, two out of nine cellulolytic strains were able to utilize xylan and pretreated wood with roughly the same efficiency as for cellulose. The major products of xylan fermentation were acetate and formate, with minor contributions of lactate and ethanol. Phylogenetic analyses of 16S …


Barcoding Of Arrow Worms (Phylum Chaetognatha) From Three Oceans: Genetic Diversity And Evolution Within An Enigmatic Phylum, Robert M. Jennings, Ann Bucklin, Annelies Pierrot-Bults Apr 2010

Biology Faculty Publication Series

Arrow worms (Phylum Chaetognatha) are abundant planktonic organisms and important predators in many food webs; yet, the classification and evolutionary relationships among chaetognath species remain poorly understood. A seemingly simple body plan is underlain by subtle variation in morphological details, obscuring the affinities of species within the phylum. Many species achieve near global distributions, spanning the same latitudinal bands in all ocean basins, while others present disjunct ranges, in some cases with the same species apparently found at both poles. To better understand how these complex evolutionary and geographic variables are reflected in the species makeup of chaetognaths, we analyze …


Development Of A Regional Habitat Classification Scheme For The Amirante Islands, Seychelles, Sarah Hamylton, Tom Spencer, Annelise Hagan Jan 2010

Faculty of Science - Papers (Archive)

A collaborative expedition between Khaled bin Sultan Living Oceans Foundation, Cambridge Coastal Research Unit and Seychelles Centre for Marine Research and Technology – Marine Parks Authority (SCMRT-MPA) was conducted to the southern Seychelles, western Indian Ocean, in January 2005. This resulted in a series of habitat maps of the reefs and reef islands of the Amirantes Archipelago, derived from remotely-sensed Compact Airborne Spectrographic Imager (CASI) data. The procedures used in map development, image processing techniques and field survey methods are outlined. Habitat classification, and regional-scale comparisons of relative habitat composition are described. The study demonstrates the use of remote sensing …


Classification Of Seagrass Habitat Structure As A Response To Wave Exposure At Etoile Cay, Seychelles, Sarah Hamylton, Tom Spencer Jan 2007

Faculty of Science - Papers (Archive)

Physical processes are thought to be a critical control on shallow water communities in the tropics. Past studies of seagrass community patterns have tended to be qualitative and failed to empirically link observed structures with the processes that govern them. Remote sensing technology, in the form of imagery acquired using a Compact Airborne Spectrographic Imager (CASI), has been used to construct a habitat map of seagrass communities at Etoile Cay, Amirantes, Seychelles. A simple definition of seagrass habitat structure, incorporating measures of complexity and heterogeneity, has been investigated along a wave exposure gradient via moving window analysis over the …


Optimal Feature Selection For Nearest Centroid Classifiers, With Applications To Gene Expression Microarrays, Alan R. Dabney, John D. Storey Nov 2005

UW Biostatistics Working Paper Series

Nearest centroid classifiers have recently been successfully employed in high-dimensional applications. A necessary step when building a classifier for high-dimensional data is feature selection. Feature selection is typically carried out by computing univariate statistics for each feature individually, without consideration for how a subset of features performs as a whole. For subsets of a given size, we characterize the optimal choice of features, corresponding to those yielding the smallest misclassification rate. Furthermore, we propose an algorithm for estimating this optimal subset in practice. Finally, we investigate the applicability of shrinkage ideas to nearest centroid classifiers. We use gene-expression microarrays for …
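
As a rough illustration of the conventional pipeline this abstract contrasts itself with, the sketch below scores each feature with a univariate statistic, keeps the top k, and classifies by nearest centroid. It is not the authors' optimal-subset algorithm; the function names, the standardized mean-difference score, and the toy data are all assumptions made for the example.

```python
import math
from statistics import mean, pvariance

def univariate_scores(X, y):
    """Score each feature separately (two classes assumed):
    |difference in class means| / pooled standard deviation."""
    first, second = sorted(set(y))
    a = [row for row, label in zip(X, y) if label == first]
    b = [row for row, label in zip(X, y) if label == second]
    scores = []
    for j in range(len(X[0])):
        col_a = [row[j] for row in a]
        col_b = [row[j] for row in b]
        # Small constant avoids division by zero for constant features.
        pooled = math.sqrt((pvariance(col_a) + pvariance(col_b)) / 2) + 1e-9
        scores.append(abs(mean(col_a) - mean(col_b)) / pooled)
    return scores

def select_top_k(scores, k):
    """Indices of the k highest-scoring features."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def fit_centroids(X, y, features):
    """Per-class mean vector restricted to the selected features."""
    centroids = {}
    for label in set(y):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(row[j] for row in rows) for j in features]
    return centroids

def predict(x, centroids, features):
    """Assign x to the class whose centroid is closest (squared Euclidean)."""
    def sqdist(centroid):
        return sum((x[j] - c) ** 2 for j, c in zip(features, centroid))
    return min(centroids, key=lambda label: sqdist(centroids[label]))

# Hypothetical toy data: three "genes", only the first separates the classes.
X = [[1.0, 5.0, 0.2], [1.2, 4.8, 0.9], [0.9, 5.1, 0.5],
     [3.0, 5.0, 0.4], [3.2, 4.9, 0.8], [2.9, 5.2, 0.1]]
y = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]

scores = univariate_scores(X, y)
features = select_top_k(scores, 1)
centroids = fit_centroids(X, y, features)
prediction = predict([2.8, 5.0, 0.5], centroids, features)
```

Because each feature is scored in isolation, this baseline ignores how the selected features perform together, which is the limitation the abstract points to.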


Loss-Based Estimation With Cross-Validation: Applications To Microarray Data Analysis And Motif Finding, Sandrine Dudoit, Mark J. Van Der Laan, Sunduz Keles, Annette M. Molinaro, Sandra E. Sinisi, Siew Leng Teng Dec 2003

U.C. Berkeley Division of Biostatistics Working Paper Series

Current statistical inference problems in genomic data analysis involve parameter estimation for high-dimensional multivariate distributions, with typically unknown and intricate correlation patterns among variables. Addressing these inference questions satisfactorily requires: (i) an intensive and thorough search of the parameter space to generate good candidate estimators, (ii) an approach for selecting an optimal estimator among these candidates, and (iii) a method for reliably assessing the performance of the resulting estimator. We propose a unified loss-based methodology for estimator construction, selection, and performance assessment with cross-validation. In this approach, the parameter of interest is defined as the risk minimizer for a suitable …


Selecting Differentially Expressed Genes From Microarray Experiments, Margaret S. Pepe, Gary M. Longton, Garnet L. Anderson, Michel Schummer Jan 2003

UW Biostatistics Working Paper Series

High throughput technologies, such as gene expression arrays and protein mass spectrometry, allow one to simultaneously evaluate thousands of potential biomarkers that distinguish different tissue types. Of particular interest here is cancer versus normal organ tissues. We consider statistical methods to rank genes (or proteins) in regards to differential expression between tissues. Various statistical measures are considered and we argue that two measures related to the Receiver Operating Characteristic Curve are particularly suitable for this purpose. We also propose that sampling variability in the gene rankings be quantified and suggest using the “selection probability function”, the probability distribution of rankings …
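
A minimal sketch of ranking genes by an ROC-related measure follows. The truncated abstract does not name its two measures, so the empirical area under the ROC curve (AUC) is used here purely as an assumption; the gene names and expression values are hypothetical.

```python
def empirical_auc(cancer, normal):
    """Empirical AUC: the fraction of (cancer, normal) pairs in which the
    cancer sample has the higher expression, counting ties as one half."""
    wins = 0.0
    for c in cancer:
        for n in normal:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cancer) * len(normal))

def rank_genes(expr_cancer, expr_normal):
    """Return (gene, AUC) pairs sorted from most to least discriminating."""
    aucs = {gene: empirical_auc(expr_cancer[gene], expr_normal[gene])
            for gene in expr_cancer}
    return sorted(aucs.items(), key=lambda item: item[1], reverse=True)

# Hypothetical expression values for three genes in each tissue type.
expr_cancer = {"gene_a": [5, 6, 7], "gene_b": [2, 3, 4], "gene_c": [4, 5, 2]}
expr_normal = {"gene_a": [1, 2, 3], "gene_b": [2, 3, 4], "gene_c": [1, 3, 3]}

ranking = rank_genes(expr_cancer, expr_normal)
```

An AUC near 1 means the gene is almost always more highly expressed in cancer tissue, while a value near 0.5 means it does not separate the two tissue types. This sketch only ranks over-expression; it does not quantify the sampling variability in the rankings that the abstract also discusses.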


Making Use Of The Most Expressive Jumping Emerging Patterns For Classification, Jinyan Li, Guozhu Dong, Kotagiri Ramamohanarao May 2001

Kno.e.sis Publications

Classification aims to discover a model from training data that can be used to predict the class of test instances. In this paper, we propose the use of jumping emerging patterns (JEPs) as the basis for a new classifier called the JEP-Classifier. Each JEP can capture some crucial difference between a pair of datasets. Then, aggregating all JEPs of large supports can produce a more potent classification power. Procedurally, the JEP-Classifier learns the pair-wise features (sets of JEPs) contained in the training data, and uses the collective impacts contributed by the most expressive pair-wise features to determine the class labels …


A Proposed Method For Classifying And Evaluating Soils On The Basis Of Productivity And Use Suitabilities, Arthur Anderson, A. P. Nelson, F. A. Hayes, I. D. Wood May 1938

Nebraska Agricultural Experiment Station: Historical Research Bulletins

It is the object of this paper to present a method for classifying and evaluating the soils as mapped in regular soil surveys on the basis of land types, which are here defined as areas having reasonably similar productivity and use suitabilities. The standards used to differentiate land types will vary according to the desired objectives, but any material difference in yield, or in practices necessary to maintain a desirable level of productivity will justify recognition of land types. The proposed procedure involves a more detailed study of the influence which soils, slope, erosion, and drainage have on specific crops …