Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
- Publication
- Faculty of Science - Papers (Archive) (2)
- Kno.e.sis Publications (2)
- Mathematics, Physics, and Computer Science Faculty Articles and Research (2)
- Research outputs 2014 to 2021 (2)
- UW Biostatistics Working Paper Series (2)
- Biology Faculty Publication Series (1)
- Computer Science Faculty Publications (1)
- Dartmouth Scholarship (1)
- FIU Electronic Theses and Dissertations (1)
- Nebraska Agricultural Experiment Station: Historical Research Bulletins (1)
- Neuroscience Institute Faculty Publications (1)
- School of Natural Resources: Dissertations, Theses, and Student Research (1)
- U.C. Berkeley Division of Biostatistics Working Paper Series (1)
Articles 1 - 18 of 18
Full-Text Articles in Physical Sciences and Mathematics
HiTE: A Fast And Accurate Dynamic Boundary Adjustment Approach For Full-Length Transposable Element Detection And Annotation, Kang Hu, Peng Ning, Minghua Xu, You Zou, Jianye Chang, Xin Gao, Yaohang Li, Jue Ruan, Bin Hu, Jianxin Wang
Computer Science Faculty Publications
Recent advancements in genome assembly have greatly improved the prospects for comprehensive annotation of Transposable Elements (TEs). However, existing methods for TE annotation using genome assemblies suffer from limited accuracy and robustness, requiring extensive manual editing. In addition, the currently available gold-standard TE databases are not comprehensive, even for extensively studied species, highlighting the critical need for an automated TE detection method to supplement existing repositories. In this study, we introduce HiTE, a fast and accurate dynamic boundary adjustment approach designed to detect full-length TEs. The experimental results demonstrate that HiTE outperforms RepeatModeler2, the state-of-the-art tool, across various species. Furthermore, …
A Machine Learning Framework For Identifying Molecular Biomarkers From Transcriptomic Cancer Data, Md Abdullah Al Mamun
FIU Electronic Theses and Dissertations
Cancer is a complex molecular process due to abnormal changes in the genome, such as mutation and copy number variation, and epigenetic aberrations such as dysregulations of long non-coding RNA (lncRNA). These abnormal changes are reflected in the transcriptome by turning oncogenes on and tumor suppressor genes off, which are considered cancer biomarkers.
However, transcriptomic data is high dimensional, and finding the best subset of genes (features) related to causing cancer is computationally challenging and expensive. Thus, developing a feature selection framework to discover molecular biomarkers for cancer is critical.
Traditional approaches for biomarker discovery calculate the fold change for each …
A Novel Method For Detecting Morphologically Similar Crops And Weeds Based On The Combination Of Contour Masks And Filtered Local Binary Pattern Operators, Vi Nguyen Thanh Le, Selam Ahderom, Beniamin Apopei, Kamal Alameh
Research outputs 2014 to 2021
Background: Weeds are a major cause of low agricultural productivity. Some weeds have morphological features similar to crops, making them difficult to discriminate. Results: We propose a novel method using a combination of filtered features extracted by combined Local Binary Pattern operators and features extracted by plant-leaf contour masks to improve the discrimination rate between broadleaf plants. Opening and closing morphological operators were applied to filter noise in plant images. The images at 4 stages of growth were collected using a testbed system. Mask-based local binary pattern features were combined with filtered features and a coefficient k. The classification of …
Coral Reef Change Detection In Remote Pacific Islands Using Support Vector Machine Classifiers, Justin J. Gapper, Hesham El-Askary, Erik Linstead, Thomas Piechota
Mathematics, Physics, and Computer Science Faculty Articles and Research
Despite the abundance of research on coral reef change detection, few studies have been conducted to assess the spatial generalization principles of a live coral cover classifier trained using remote sensing data from multiple locations. The aim of this study is to develop a machine learning classifier for coral dominated benthic cover-type class (CDBCTC) based on ground truth observations and Landsat images, evaluate the performance of this classifier when tested against new data, then deploy the classifier to perform CDBCTC change analysis of multiple locations. The proposed framework includes image calibration, support vector machine (SVM) training and tuning, statistical assessment …
Effective Plant Discrimination Based On The Combination Of Local Binary Pattern Operators And Multiclass Support Vector Machine Methods, Vi N T Le, Beniamin Apopei, Kamal Alameh
Research outputs 2014 to 2021
Accurate crop and weed discrimination plays a critical role in addressing the challenges of weed management in agriculture. The use of herbicides is currently the most common approach to weed control. However, herbicide resistant plants have long been recognised as a major concern due to the excessive use of herbicides. Effective weed detection techniques can reduce the cost of weed management and improve crop quality and yield. A computationally efficient and robust plant classification algorithm is developed and applied to the classification of three crops: Brassica napus (canola), Zea mays (maize/corn), and radish. The developed algorithm is based on the …
Evaluation Of Spatial Generalization Characteristics Of A Robust Classifier As Applied To Coral Reef Habitats In Remote Islands Of The Pacific Ocean, Justin J. Gapper, Hesham El-Askary, Erik J. Linstead, Thomas Piechota
Mathematics, Physics, and Computer Science Faculty Articles and Research
This study was an evaluation of the spectral signature generalization properties of coral across four remote Pacific Ocean reefs. The sites under consideration have not been the subject of previous studies for coral classification using remote sensing data. Previous research regarding using remote sensing to identify reefs has been limited to in-situ assessment, with some researchers also performing temporal analysis of a selected area of interest. This study expanded the previous in-situ analyses by evaluating the ability of a basic predictor, Linear Discriminant Analysis (LDA), trained on Depth Invariant Indices calculated from the spectral signature of coral in one location …
Assessment Of The Ponderosa Woodlands In Nebraska's Wildcat Hills: Implications For Juniperus Encroachment And Management, Allie Victoria Schiltmeyer
School of Natural Resources: Dissertations, Theses, and Student Research
Ponderosa pine (Pinus ponderosa) is a dominant tree species across western North America. Its eastern distribution includes three populations in western Nebraska. This study assesses the distribution, structure and age of ponderosa pine woodlands in one of those regions, the Wildcat Hills. The Wildcat Hills have escaped severe wildfires seen in recent decades in other ponderosa pine regions. Nevertheless, the Wildcat Hills woodlands face multiple threats including climate change, wildfire, drought, pine beetles, and invasive species. Key to these threats is the stand structure of pine woodlands, which have increased in density across much of ponderosa pine’s range. …
A Novel Approach For Classifying Gene Expression Data Using Topic Modeling, Soon Jye Kho, Himi Yalamanchili, Michael L. Raymer, Amit Sheth
Kno.e.sis Publications
Understanding the role of differential gene expression in cancer etiology and cellular process is a complex problem that continues to pose a challenge due to the sheer number of genes and inter-related biological processes involved. In this paper, we employ an unsupervised topic model, Latent Dirichlet Allocation (LDA), to mitigate overfitting of high-dimensionality gene expression data and to facilitate understanding of the associated pathways. LDA has been recently applied for clustering and exploring genomic data but not for classification and prediction. Here, we proposed to use LDA in clustering as well as in classification of cancer and healthy tissues using lung cancer …
Classification And Visualization Of Neural Patterns Using Subspace Analysis Statistical Methods, Jun Xia, Marius Osan, Emilia Titan, Riana Nicolae, Remus Osan
Neuroscience Institute Faculty Publications
The size and complexity of neural data are increasing at a dramatic pace due to rapid advances in experimental technologies. As a result, the data analysis techniques are shifting their focus from single-units to neural populations. The goal is to investigate complex temporal and spatial patterns, as well as to present the results in an intuitive way, allowing for detection and monitoring of relevant neural patterns.
Cellulose- And Xylan-Degrading Thermophilic Anaerobic Bacteria From Biocompost, M. V. Sizova, J. A. Izquierdo, N. S. Panikov, L. R. Lynd
Dartmouth Scholarship
Nine thermophilic cellulolytic clostridial isolates and four other noncellulolytic bacterial isolates were isolated from self-heated biocompost via preliminary enrichment culture on microcrystalline cellulose. All cellulolytic isolates grew vigorously on cellulose, with the formation of either ethanol and acetate or acetate and formate as principal fermentation products as well as lactate and glycerol as minor products. In addition, two out of nine cellulolytic strains were able to utilize xylan and pretreated wood with roughly the same efficiency as for cellulose. The major products of xylan fermentation were acetate and formate, with minor contributions of lactate and ethanol. Phylogenetic analyses of 16S …
Barcoding Of Arrow Worms (Phylum Chaetognatha) From Three Oceans: Genetic Diversity And Evolution Within An Enigmatic Phylum, Robert M. Jennings, Ann Bucklin, Annelies Pierrot-Bults
Biology Faculty Publication Series
Arrow worms (Phylum Chaetognatha) are abundant planktonic organisms and important predators in many food webs; yet, the classification and evolutionary relationships among chaetognath species remain poorly understood. A seemingly simple body plan is underlain by subtle variation in morphological details, obscuring the affinities of species within the phylum. Many species achieve near global distributions, spanning the same latitudinal bands in all ocean basins, while others present disjunct ranges, in some cases with the same species apparently found at both poles. To better understand how these complex evolutionary and geographic variables are reflected in the species makeup of chaetognaths, we analyze …
Development Of A Regional Habitat Classification Scheme For The Amirante Islands, Seychelles, Sarah Hamylton, Tom Spencer, Annelise Hagan
Faculty of Science - Papers (Archive)
A collaborative expedition between Khaled bin Sultan Living Oceans Foundation, Cambridge Coastal Research Unit and Seychelles Centre for Marine Research and Technology – Marine Parks Authority (SCMRT-MPA) was conducted to the southern Seychelles, western Indian Ocean, in January 2005. This resulted in a series of habitat maps of the reefs and reef islands of the Amirantes Archipelago, derived from remotely-sensed Compact Airborne Spectrographic Imager (CASI) data. The procedures used in map development, image processing techniques and field survey methods are outlined. Habitat classification, and regional-scale comparisons of relative habitat composition are described. The study demonstrates the use of remote sensing …
Classification Of Seagrass Habitat Structure As A Response To Wave Exposure At Etoile Cay, Seychelles, Sarah Hamylton, Tom Spencer
Faculty of Science - Papers (Archive)
Physical processes are thought to be a critical control on shallow water communities in the tropics. Past studies of seagrass community patterns have tended to be qualitative and failed to empirically link observed structures with the processes that govern them. Remote sensing technology, in the form of imagery acquired using a Compact Airborne Spectrographic Imager (CASI), has been used to construct a habitat map of seagrass communities at Etoile Cay, Amirantes, Seychelles. A simple definition of seagrass habitat structure, incorporating measures of complexity and heterogeneity, has been investigated along a wave exposure gradient via moving window analysis over the …
Optimal Feature Selection For Nearest Centroid Classifiers, With Applications To Gene Expression Microarrays, Alan R. Dabney, John D. Storey
UW Biostatistics Working Paper Series
Nearest centroid classifiers have recently been successfully employed in high-dimensional applications. A necessary step when building a classifier for high-dimensional data is feature selection. Feature selection is typically carried out by computing univariate statistics for each feature individually, without consideration for how a subset of features performs as a whole. For subsets of a given size, we characterize the optimal choice of features, corresponding to those yielding the smallest misclassification rate. Furthermore, we propose an algorithm for estimating this optimal subset in practice. Finally, we investigate the applicability of shrinkage ideas to nearest centroid classifiers. We use gene-expression microarrays for …
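For readers unfamiliar with the technique named in this abstract, the following is a minimal sketch of a plain nearest centroid classifier restricted to a chosen feature subset. It is not the authors' selection algorithm: the toy expression values, the labels, and the function names `fit_centroids` and `predict` are all hypothetical, and the feature subset is fixed by hand rather than estimated.

```python
from statistics import mean

def fit_centroids(rows, labels, features):
    """Return {label: centroid}, where each centroid is the per-class mean
    computed over the chosen feature indices only."""
    centroids = {}
    for label in set(labels):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroids[label] = [mean(r[j] for r in members) for j in features]
    return centroids

def predict(row, centroids, features):
    """Assign the label whose centroid is nearest in squared Euclidean distance."""
    def dist(centroid):
        return sum((row[j] - cj) ** 2 for j, cj in zip(features, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))

# Hypothetical toy data: 4 samples, 3 features (e.g. expression levels).
rows = [[1.0, 5.0, 0.2], [1.2, 4.8, 0.9], [3.0, 5.1, 0.4], [3.2, 4.9, 0.7]]
labels = ["normal", "normal", "tumor", "tumor"]

# Assumption for illustration: only feature 0 is selected.
features = [0]
centroids = fit_centroids(rows, labels, features)
prediction = predict([2.9, 5.0, 0.5], centroids, features)
```

In this toy setup feature 0 separates the classes while features 1 and 2 do not, which is why restricting the distance to a subset of features can matter in high-dimensional settings.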
Loss-Based Estimation With Cross-Validation: Applications To Microarray Data Analysis And Motif Finding, Sandrine Dudoit, Mark J. Van Der Laan, Sunduz Keles, Annette M. Molinaro, Sandra E. Sinisi, Siew Leng Teng
U.C. Berkeley Division of Biostatistics Working Paper Series
Current statistical inference problems in genomic data analysis involve parameter estimation for high-dimensional multivariate distributions, with typically unknown and intricate correlation patterns among variables. Addressing these inference questions satisfactorily requires: (i) an intensive and thorough search of the parameter space to generate good candidate estimators, (ii) an approach for selecting an optimal estimator among these candidates, and (iii) a method for reliably assessing the performance of the resulting estimator. We propose a unified loss-based methodology for estimator construction, selection, and performance assessment with cross-validation. In this approach, the parameter of interest is defined as the risk minimizer for a suitable …
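The selection step described in this abstract, choosing among candidate estimators by cross-validated risk, can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and not taken from the paper: the squared-error loss, the two candidate estimators (sample mean and sample median), the toy data, and the helper name `cv_risk`.

```python
from statistics import mean, median

def cv_risk(y, fit, v):
    """V-fold cross-validated risk under squared-error loss.
    Observation i is held out in fold i % v; `fit` maps a training
    sample to a single constant prediction."""
    total = 0.0
    for fold in range(v):
        train = [yi for i, yi in enumerate(y) if i % v != fold]
        held_out = [yi for i, yi in enumerate(y) if i % v == fold]
        estimate = fit(train)
        total += sum((yi - estimate) ** 2 for yi in held_out)
    return total / len(y)

# Hypothetical data with one outlying value.
y = [1.0, 2.0, 3.0, 10.0]

# Candidate estimators (assumed for illustration).
candidates = {"mean": mean, "median": median}

# With v equal to the sample size this is leave-one-out cross-validation.
risks = {name: cv_risk(y, fit, v=4) for name, fit in candidates.items()}
best = min(risks, key=risks.get)
```

The candidate with the smallest cross-validated risk is selected; in practice the loss function and the candidate set would be chosen to match the parameter of interest.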
Selecting Differentially Expressed Genes From Microarray Experiments, Margaret S. Pepe, Gary M. Longton, Garnet L. Anderson, Michel Schummer
UW Biostatistics Working Paper Series
High throughput technologies, such as gene expression arrays and protein mass spectrometry, allow one to simultaneously evaluate thousands of potential biomarkers that distinguish different tissue types. Of particular interest here is cancer versus normal organ tissues. We consider statistical methods to rank genes (or proteins) with regard to differential expression between tissues. Various statistical measures are considered and we argue that two measures related to the Receiver Operating Characteristic Curve are particularly suitable for this purpose. We also propose that sampling variability in the gene rankings be quantified and suggest using the “selection probability function”, the probability distribution of rankings …
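As a rough illustration of ranking genes by an ROC-related measure, the sketch below scores each gene by the area under the ROC curve (AUC), computed as the proportion of case/control pairs in which the case value is higher. The abstract does not say which two ROC measures the authors use, so AUC is an assumption here, as are the gene names, the expression values, and the function names.

```python
def auc(cases, controls):
    """Empirical AUC: probability that a randomly chosen case value exceeds
    a randomly chosen control value, counting ties as one half."""
    wins = sum((x > y) + 0.5 * (x == y) for x in cases for y in controls)
    return wins / (len(cases) * len(controls))

def rank_genes(expr_cases, expr_controls):
    """Return (gene names sorted by AUC, highest first; dict of AUC per gene)."""
    scores = {g: auc(expr_cases[g], expr_controls[g]) for g in expr_cases}
    return sorted(scores, key=scores.get, reverse=True), scores

# Hypothetical expression values for three genes in cancer and normal tissue.
cases = {"geneA": [5.1, 6.0, 5.7], "geneB": [2.0, 2.1, 1.9], "geneC": [3.0, 4.0, 2.5]}
controls = {"geneA": [3.0, 3.2, 2.9], "geneB": [2.0, 2.2, 1.8], "geneC": [2.8, 3.1, 2.6]}

ranking, scores = rank_genes(cases, controls)
```

An AUC near 1 indicates a gene that is consistently higher in cases, while a value near 0.5 indicates no separation between the two tissue types.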
Making Use Of The Most Expressive Jumping Emerging Patterns For Classification, Jinyan Li, Guozhu Dong, Kotagiri Ramamohanarao
Kno.e.sis Publications
Classification aims to discover a model from training data that can be used to predict the class of test instances. In this paper, we propose the use of jumping emerging patterns (JEPs) as the basis for a new classifier called the JEP-Classifier. Each JEP can capture some crucial difference between a pair of datasets. Then, aggregating all JEPs of large supports can produce a more potent classification power. Procedurally, the JEP-Classifier learns the pair-wise features (sets of JEPs) contained in the training data, and uses the collective impacts contributed by the most expressive pair-wise features to determine the class labels …
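A jumping emerging pattern can be read as an itemset that occurs in one dataset and never in the other. The brute-force sketch below finds the minimal such itemsets up to a small size and scores a test instance by summing the supports of the patterns it contains. It is only an illustration under stated assumptions: the toy transactions, the size cap `max_size`, and the function names are hypothetical, and the exhaustive enumeration stands in for whatever mining procedure the authors actually use.

```python
from itertools import combinations

def support(itemset, dataset):
    """Fraction of transactions in `dataset` that contain `itemset`."""
    return sum(itemset <= t for t in dataset) / len(dataset)

def minimal_jeps(target, other, max_size=2):
    """Itemsets with nonzero support in `target` and zero support in `other`,
    keeping only those with no smaller such itemset as a proper subset."""
    items = sorted(set().union(*target))
    found = []
    for size in range(1, max_size + 1):      # smaller itemsets are checked first
        for combo in combinations(items, size):
            s = frozenset(combo)
            if any(m < s for m in found):    # contains a smaller pattern: not minimal
                continue
            if support(s, target) > 0 and support(s, other) == 0:
                found.append(s)
    return found

def score(instance, jeps, target):
    """Sum of supports of the patterns contained in the instance."""
    return sum(support(p, target) for p in jeps if p <= instance)

# Hypothetical transactions over single-letter items.
pos = [frozenset("ab"), frozenset("abc"), frozenset("ac")]
neg = [frozenset("bc"), frozenset("c"), frozenset("bd")]

jeps_pos = minimal_jeps(pos, neg)
jeps_neg = minimal_jeps(neg, pos)

instance = frozenset("ab")
label = "pos" if score(instance, jeps_pos, pos) > score(instance, jeps_neg, neg) else "neg"
```

Exhaustive enumeration is exponential in the number of items, so this only works for toy inputs; it is meant to show the definition and the aggregation idea, not an efficient miner.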
A Proposed Method For Classifying And Evaluating Soils On The Basis Of Productivity And Use Suitabilities, Arthur Anderson, A. P. Nelson, F. A. Hayes, I. D. Wood
Nebraska Agricultural Experiment Station: Historical Research Bulletins
It is the object of this paper to present a method for classifying and evaluating the soils as mapped in regular soil surveys on the basis of land types, which are here defined as areas having reasonably similar productivity and use suitabilities. The standards used to differentiate land types will vary according to the desired objectives, but any material difference in yield, or in practices necessary to maintain a desirable level of productivity will justify recognition of land types. The proposed procedure involves a more detailed study of the influence which soils, slope, erosion, and drainage have on specific crops …