Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Theses/Dissertations

Classification

Articles 151 - 180 of 188

Full-Text Articles in Physical Sciences and Mathematics

Context Aware Privacy Preserving Clustering And Classification, Nirmal Thapa Jan 2013

Theses and Dissertations--Computer Science

Data are valuable assets to any organization or individual. Data are sources of useful information, which plays a big part in decision making. All sectors have the potential to benefit from having information. Commerce, health, and research are some of the fields that have benefited from data. On the other hand, the availability of data makes it easy for anyone to exploit them, and in many cases the data are private and confidential. It is necessary to preserve the confidentiality of the data. We study two categories of privacy: Data Value Hiding and Data Pattern Hiding. Privacy is a huge concern …


Instrument And Method Development For Single-Cell Classification Using Fluorescence Imaging Multivariate Optical Computing, Joseph Swanstrom Jan 2013

Theses and Dissertations

Multivariate optical computing (MOC) is an all-optical approach to predictive spectroscopy that utilizes multivariate calibration and spectral pattern recognition techniques while operating in a simple filter photometer instrument, removing the need for expensive instrumentation and post-processing of spectral data. This is accomplished with specially designed interference filters called multivariate optical elements (MOEs). MOC can provide analytical solutions for applications requiring low-cost, rugged, and simple-to-operate instrumentation for use in remote and hazardous environments such as open ocean waters. These instrument specifications are central to developing a method for classifying phytoplankton in their natural environment. Phytoplankton are photosynthetic single …


Fast And Efficient Classification, Tracking, And Simulation In Wireless Sensor Networks, Hao Jiang Aug 2012

All Dissertations

Wireless sensor networks are composed of large numbers of resource-lean sensors that collect low-level inputs from the physical world. The applications present challenges for programmers. On the one hand, lightweight algorithms are required given the limited capacity of the constituent devices. On the other, the algorithms must be scalable to accommodate large networks. In this thesis, we focus on the design and implementation of fast and lean (yet scalable) algorithms for classification, simulation, and target tracking in the context of wireless sensor networks. We briefly consider each of these challenges in turn.
The first challenge is to achieve high precision …


Composite Feature-Based Face Detection Using Skin Color Modeling And SVM Classification, Swathi Rajashekar May 2012

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

This report proposes a face detection algorithm based on skin color modeling and support vector machine (SVM) classification. The classification is based on various face features used to detect specific faces in an input color image. A YCbCr color space is used to filter the skin color pixels from the input color image. Template matching is then applied to the result, using various window sizes of a template created from the ORL face database. The resulting candidates are classified by SVM classifiers using the histogram of oriented gradients, eigen features, edge ratio, and edge statistics features.
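
The skin-color filtering step can be illustrated with a minimal sketch. It assumes the full-range BT.601 (JPEG) conversion from RGB to YCbCr and a fixed rectangular skin region in the Cb/Cr plane. The threshold values and the names `CB_RANGE`, `CR_RANGE`, `is_skin`, and `skin_mask` are assumptions chosen for illustration; the report's own skin color model is not given in the abstract.

```python
# Sketch of YCbCr skin-pixel filtering. The conversion is the standard
# full-range BT.601 (JPEG) formula; the Cb/Cr ranges below are assumed
# illustrative values, not thresholds taken from the report.

CB_RANGE = (77, 127)    # assumed skin range for the Cb channel
CR_RANGE = (133, 173)   # assumed skin range for the Cr channel


def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (BT.601 / JPEG)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_skin(r, g, b):
    """Return True if the pixel's chroma falls inside the assumed skin box."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return (CB_RANGE[0] <= cb <= CB_RANGE[1]
            and CR_RANGE[0] <= cr <= CR_RANGE[1])


def skin_mask(image):
    """image: list of rows of (r, g, b) tuples -> list of rows of booleans."""
    return [[is_skin(*pixel) for pixel in row] for row in image]
```

Luma (Y) is ignored in the test, which is the usual motivation for working in YCbCr: skin tones cluster more tightly in chroma than in RGB. Candidate regions from such a mask would then go on to the template matching and SVM stages described above.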


Recognizing Patterns In Transmitted Signals For Identification Purposes, Baha' A. Alsaify May 2012

Graduate Theses and Dissertations

Identifying and authenticating entities in cyberspace such as users, computers, cell phones, smart cards, and radio frequency identification (RFID) tags is usually accomplished by having the entity demonstrate knowledge of a secret key. When the entity is portable and physically accessible, like an RFID tag, it can be difficult to secure given the memory, processing, and economic constraints. This work proposes to use unique patterns in the transmitted signals caused by manufacturing differences to identify and authenticate a wireless device such as an RFID tag. Both manufacturer identification and tag identification are performed on a population of …


Fast Neural Network Algorithm For Solving Classification Tasks, Noor Albarakati Apr 2012

Theses and Dissertations

Classification is one of several applications in the neural network (NN) world. The multilayer perceptron (MLP) is a common neural network architecture used for classification tasks. It is famous for its error back propagation (EBP) algorithm, which opened a new way of solving classification problems given a set of empirical data. In the thesis, we performed experiments using three different NN structures in order to find the best MLP neural network structure for performing nonlinear classification of multiclass data sets. The learning algorithm developed here is the batch EBP algorithm, which uses all the data as a …


Contributions To K-Means Clustering And Regression Via Classification Algorithms, Raied Salman Apr 2012

Theses and Dissertations

The dissertation deals with clustering algorithms and transforming regression problems into classification problems. The main contributions of the dissertation are twofold: first, to improve (speed up) the clustering algorithms and second, to develop a strict learning environment for solving regression problems as classification tasks by using support vector machines (SVMs). An extension to the most popular unsupervised clustering method, the k-means algorithm, is proposed, dubbed the k-means2 (k-means squared) algorithm, applicable to ultra large datasets. The main idea is based on using a small portion of the dataset in the first stage of the clustering. Thus, the centers of such a smaller …
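
The first-stage idea, clustering a small portion of the data before touching the full dataset, can be sketched as follows. The abstract is truncated before it describes the second stage, so the second stage shown here (seeding a full-data k-means run with the first-stage centers) is an assumption, as are the function names `lloyd` and `two_stage_kmeans` and the `fraction` parameter. This is not the dissertation's k-means2 algorithm itself.

```python
import random


def _nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centers[j])),
    )


def lloyd(points, centers, iters=20):
    """Plain k-means (Lloyd) iterations starting from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            groups[_nearest(p, centers)].append(p)
        # Recompute each center as its group's mean; keep it if the group is empty.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def two_stage_kmeans(points, k, fraction=0.1, seed=0):
    """Stage 1: k-means on a random sample. Stage 2 (assumed): refine on all data."""
    rng = random.Random(seed)
    n_sample = max(k, int(len(points) * fraction))
    sample = rng.sample(points, n_sample)
    stage1_centers = lloyd(sample, rng.sample(sample, k))
    return lloyd(points, stage1_centers)
```

The expensive full-data pass starts from centers that are already close to a reasonable solution, so it would typically need fewer iterations than a run from a random initialization.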


Computer Methods For Pre-MicroRNA Secondary Structure Prediction, Dianwei Han Jan 2012

Theses and Dissertations--Computer Science

This thesis presents a new algorithm to predict the pre-microRNA secondary structure. An accurate prediction of the pre-microRNA secondary structure is important in miRNA informatics. Based on a recently proposed model, nucleotide cyclic motifs (NCM), to predict RNA secondary structure, we propose and implement a Modified NCM (MNCM) model with a physics-based scoring strategy to tackle the problem of pre-microRNA folding. Our microRNAfold is implemented using a global optimal algorithm based on the bottom-up local optimal solutions.

It has been shown that studying the functions of multiple genes and predicting the secondary structure of multiple related microRNA is more important …


Using Semantic Templates To Study Vulnerabilities Recorded In Large Software Repositories, Yan Wu Oct 2011

Student Work

Software vulnerabilities allow an attacker to reduce a system's Confidentiality, Availability, and Integrity by exposing information, executing malicious code, and undermining system functionalities that contribute to the overall system purpose and need. With new vulnerabilities discovered every day in a variety of applications and user environments, a systematic study of their characteristics is a subject of immediate need for the following reasons:

  • The high rate at which information about past and new vulnerabilities is accumulated makes it difficult to absorb and comprehend.
  • Rather than learning from past mistakes, similar types of vulnerabilities are observed repeatedly.
  • As the scale and complexity of …


Processing And Classification Of Physiological Signals Using Wavelet Transform And Machine Learning Algorithms, Abed Al-Raoof Bsoul Apr 2011

Theses and Dissertations

Over the last century, physiological signals have been broadly analyzed and processed not only to assess the function of the human physiology, but also to better diagnose illnesses or injuries and provide treatment options for patients. In particular, electrocardiogram (ECG), blood pressure (BP), and impedance are among the most important biomedical signals processed and analyzed. The majority of studies that utilize these signals attempt to diagnose important irregularities such as arrhythmia or blood loss by processing one of these signals. However, the relationship between them has not yet been fully studied using computational methods. Therefore, a system that extracts and combines …


A Sparse Representation Technique For Classification Problems, Reinaldo Sanchez Arias Jan 2011

Open Access Theses & Dissertations

In pattern recognition and machine learning, a classification problem refers to finding an algorithm for assigning given input data to one of several categories. Many natural signals are sparse or compressible in the sense that they have short representations when expressed in a suitable basis. Motivated by the recent successful development of algorithms for sparse signal recovery, we apply the selective nature of sparse representation to perform classification. In order to find such a sparse linear representation, we implement an l1-minimization algorithm. This methodology overcomes the lack of robustness with respect to outliers. In contrast to other classification …
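
For orientation, one standard way to write an l1-minimization problem of this kind is given below. The symbols are introduced here for illustration and are not taken from the thesis: $A$ is assumed to be a matrix whose columns are training samples, $y$ is the signal to classify, and $x$ is the coefficient vector being sought.

```latex
\hat{x} \;=\; \arg\min_{x} \; \|x\|_{1}
\quad \text{subject to} \quad A x = y
```

If $\hat{x}$ is sparse, its nonzero entries single out a few training samples. A classifier could then assign $y$ to the class whose samples carry most of that weight, although the abstract does not state which decision rule the thesis uses.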


Algorithms For Training Large-Scale Linear Programming Support Vector Regression And Classification, Pablo Rivas Perea Jan 2011

Open Access Theses & Dissertations

The main contribution of this dissertation is the development of a method to train a Support Vector Regression (SVR) model for the large-scale case where the number of training samples exceeds what the available computational resources can handle. The proposed scheme consists of posing the SVR problem entirely as a Linear Programming (LP) problem and developing a sequential optimization method based on variables decomposition, constraints decomposition, and the use of primal-dual interior point methods. Experimental results demonstrate that the proposed approach has comparable performance with other SV-based classifiers. Particularly, experiments demonstrate that as the problem size increases, the sparser the solution …


A Classification Of Lower Paleozoic Carbonate-Bearing Rocks For Geotechnical Applications, Bethany L. Overfield Jan 2011

University of Kentucky Master's Theses

An empirically-based classification of lower Paleozoic carbonate-bearing rocks was created for field-based geotechnical applications. Geotechnical parameters were subsequently correlated to that classification. Seven hundred seventy-seven samples were used as the basis for the classification. Thirteen categories based on visual and tactile properties and a hydrochloric acid test were created. Samples were from central, north-central, and south-central Kentucky and represented the majority of Ordovician exposures in the state, and some Mississippian exposures. Few Silurian and Devonian units were included in the sample set. Geotechnical parameters, including density as well as elastic constants (shear and compression wave velocities, Poisson’s ratio, Young’s modulus, …


Class Discovery And Prediction Of Tumor With Microarray Data, Bo Liu Jan 2011

All Graduate Theses, Dissertations, and Other Capstone Projects

Current microarray technology is able to take a single tissue sample to construct an Affymetrix oligonucleotide array containing (estimated) expression levels of thousands of different genes for that tissue. The objective is to develop a more systematic approach to cancer classification based on Affymetrix oligonucleotide microarrays. For this purpose, I studied published colon cancer microarray data. Colon cancer, with 655,000 deaths worldwide per year, has become the fourth most common form of cancer in the United States and the third leading cause of cancer-related death in the Western world. This research focuses on two areas: class discovery, …


Evolutionary Strategies For Data Mining, Rose Lowe Dec 2010

All Dissertations

Learning classifier systems (LCS) have been successful in generating rules for solving classification problems in data mining. The rules are of the form IF condition THEN action. The condition encodes the features of the input space and the action encodes the class label. What is lacking in those systems is the ability to express each feature using a function that is appropriate for that feature. The genetic algorithm is capable of doing this but cannot because only one type of membership function is provided. Thus, the genetic algorithm learns only the shape and placement of the membership function, and in …


An Empirical Approach To Evaluating Sufficient Similarity: Utilization Of Euclidean Distance As A Similarity Measure, Scott Marshall May 2010

Theses and Dissertations

Individuals are exposed to chemical mixtures while carrying out everyday tasks, with unknown risk associated with exposure. Given the number of resulting mixtures it is not economically feasible to identify or characterize all possible mixtures. When complete dose-response data are not available on a (candidate) mixture of concern, EPA guidelines define a similar mixture based on chemical composition, component proportions and expert biological judgment (EPA, 1986, 2000). Current work in this literature is by Feder et al. (2009), evaluating sufficient similarity in exposure to disinfection by-products of water purification using multivariate statistical techniques and traditional hypothesis testing. The work of …


Cluster And Classification Analysis Of Fossil Invertebrates Within The Bird Spring Formation, Arrow Canyon, Nevada: Implications For Relative Rise And Fall Of Sea-Level, Scott L. Morris Apr 2010

Theses and Dissertations

Carbonate strata preserve indicators of local marine environments through time. Such indicators often include microfossils that can survive only under relatively specific conditions, including light, nutrients, salinity, and especially water temperature. As such, microfossils are environmental proxies. When these microfossils are preserved in the rock record, they constitute key components of depositional facies. Spence et al. (2004, 2007) have proposed several approaches for determining the facies of a given stratigraphic succession based upon these proxies. Cluster analysis can be used to determine microfossil groups that represent specific environmental conditions. Identifying which microfossil groups exist through time can indicate …


Statistical Learning And Behrens-Fisher Distribution Methods For Heteroscedastic Data In Microarray Analysis, Nabin K. Manandhr-Shrestha Mar 2010

USF Tampa Graduate Theses and Dissertations

The aim of the present study is to identify the differentially expressed genes between two different conditions and apply them in predicting the class of new samples using the microarray data. Microarray data analysis poses many challenges to statisticians because of its high dimensionality and small sample size, dubbed the "small n large p problem". Microarray data has been extensively studied by many statisticians and geneticists. Generally, it is said to follow a normal distribution with equal variances in two conditions, but this is not true in general. Since the number of replications is very small, …


Analysis And Modeling Of Hurricane Impacts On A Coastal Louisiana Lake Bottom, Angelina Freeman Jan 2010

LSU Doctoral Dissertations

Tropical cyclone impacts on wetland, terrestrial, and shelf systems have been previously studied and reasonably delineated, but little is known about the response of coastal lakes to storm events. For the first time, tropical cyclone impacts on a shallow coastal lake in the Louisiana coastal plain have been studied using direct lines of evidence and numerical modeling. Using side-scan sonar, CHIRP subbottom and echo sounder bathymetric profiles, the lake bottom and shallow subsurface of Sister Lake were imaged before and after Hurricanes Katrina and Rita to provide a geologic framework for assessing the effects of these storms. Box cores were collected …


The Impact Of Overfitting And Overgeneralization On The Classification Accuracy In Data Mining, Huy Nguyen Anh Pham Jan 2010

LSU Doctoral Dissertations

Current classification approaches usually do not try to achieve a balance between fitting and generalization when they infer models from training data. Such approaches ignore the possibility of different penalty costs for the false-positive, false-negative, and unclassifiable types. Thus, their performances may not be optimal or may even be coincidental. This dissertation analyzes the above issues in depth. It also proposes two new approaches called the Homogeneity-Based Algorithm (HBA) and the Convexity-Based Algorithm (CBA) to address these issues. These new approaches aim at optimally balancing the data fitting and generalization behaviors of models when some traditional classification approaches are used. …


Integrating Information Theory Measures And A Novel Rule-Set-Reduction Technique To Improve Fuzzy Decision Tree Induction Algorithms, Nael Mohammed Abu-Halaweh Dec 2009

Computer Science Dissertations

Machine learning approaches have been successfully applied to many classification and prediction problems. One of the most popular machine learning approaches is decision trees. A main advantage of decision trees is the clarity of the decision model they produce. The ID3 algorithm proposed by Quinlan forms the basis for many decision tree applications. Trees produced by ID3 are sensitive to small perturbations in training data. To overcome this problem and to handle data uncertainties and spurious precision in data, fuzzy ID3 integrates fuzzy set theory and ideas from fuzzy logic with ID3. Several fuzzy decision tree algorithms and …
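
As background for the ID3 baseline mentioned above, the following is a minimal sketch of the information-gain criterion that classic (crisp) ID3 uses to pick a splitting attribute. It does not implement the fuzzy variants discussed in the dissertation. The function names and the representation of rows as dictionaries are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction from splitting on attr; rows is a list of dicts."""
    n = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(len(p) / n * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attrs):
    """Attribute with the highest information gain (ID3's greedy choice)."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))
```

Because each split is chosen greedily from sample counts, changing a few training rows can change which attribute wins at the root, which is one way to see the sensitivity to perturbations noted in the abstract.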


The Classification Of Simple Lie Algebras In Maple, D. Russell Sadler Jan 2009

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Lie algebras are invaluable tools in mathematics and physics as they enable us to study certain geometric objects such as Lie groups and differentiable manifolds. The computer algebra system Maple has several tools in its Lie Algebras package to work with Lie algebras and Lie groups. The purpose of this paper is to supplement the existing software with tools that are essential for the classification of simple Lie algebras over C.

In particular, we use a method to find a Cartan subalgebra of a Lie algebra in polynomial time. From the Cartan subalgebra we can compute the corresponding root system. …


Comparison Of Machine Learning Algorithms For Modeling Species Distributions: Application To Stream Invertebrates From Western USA Reference Sites, Margi Dubal May 2008

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

Machine learning algorithms are increasingly being used by ecologists to model and predict the distributions of individual species and entire assemblages of sites. Accurate prediction of species distributions is an important factor in any modeling. We compared the prediction accuracy of four machine learning algorithms (random forests, classification trees, support vector machines, and gradient boosting machines) with that of a traditional method, linear discriminant models (LDM), on a large set of stream invertebrate data collected at 728 reference sites in the western United States. Classifications were constructed for individual species and for assemblages of sites clustered a priori by similarity on biological characteristics. …


Data Mining Methods For Malware Detection, Muazzam Siddiqui Jan 2008

Electronic Theses and Dissertations

This research investigates the use of data mining methods for malware (malicious program) detection and proposes a framework as an alternative to the traditional signature detection methods. The traditional approaches, which use signatures to detect malicious programs, fail for new and unknown malware, where signatures are not available. We present a data mining framework to detect malicious programs. We collected, analyzed, and processed several thousand malicious and clean programs to find the best features and build models that can classify a given program as malware or clean. Our research is closely related to information retrieval …


Improving Neural Network Classification Training, Michael Edwin Rimer Sep 2007

Theses and Dissertations

The following work presents a new set of general methods for improving neural network accuracy on classification tasks, grouped under the label of classification-based methods. The central theme of these approaches is to provide problem representations and error functions that more directly improve classification accuracy than conventional learning and error functions. The CB1 algorithm attempts to maximize classification accuracy by selectively backpropagating error only on misclassified training patterns. CB2 incorporates a sliding error threshold into the CB1 algorithm, interpolating between the behavior of CB1 and standard error backpropagation as training progresses in order to avoid prematurely saturated network weights. CB3 …
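
The selective-update idea behind CB1 can be illustrated on a much smaller model than the one the dissertation targets. The sketch below trains a single logistic unit (not a multilayer network) and applies the error signal only to patterns that are currently misclassified. The function names, learning rate, and epoch count are assumptions; this is not the author's CB1 implementation.

```python
import math


def predict_prob(w, b, x):
    """Output of a single logistic unit for input vector x."""
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_selective(data, epochs=500, lr=0.5):
    """data: list of (features, target) pairs with target in {0, 1}.

    Weights change only on patterns the unit currently gets wrong.
    """
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, t in data:
            p = predict_prob(w, b, x)
            if (p >= 0.5) == (t == 1):
                continue  # correctly classified: no error is propagated
            err = t - p
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b
```

Skipping correctly classified patterns means the weights stop growing once every pattern lands on the right side of the decision boundary, rather than continuing to push outputs toward 0 or 1.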


A Classification Of Real Indecomposable Solvable Lie Algebras Of Small Dimension With Codimension One Nilradicals, Alan R. Parry May 2007

All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

This thesis was concerned with classifying the real indecomposable solvable Lie algebras with codimension one nilradicals of dimensions two through seven. This thesis was organized into three chapters.

In the first, we described the necessary concepts and definitions about Lie algebras as well as a few helpful theorems that are necessary to understand the project. We also reviewed many concepts from linear algebra that are essential to the research.

The second chapter was occupied with a description of how we went about classifying the Lie algebras. In particular, it outlined the basic premise of the classification: that we can use …


Data Mining And Analysis On Multiple Time Series Object Data, Chunyu Jiang Jan 2007

Browse all Theses and Dissertations

A huge amount of data is available in our society, and the need for turning such data into useful information and knowledge is urgent. Data mining is an important field addressing that need, and significant progress has been achieved in the last decade. In several important application areas, data arises in the format of Multiple Time Series Object (MTSO) data, where each data object is an array of time series over a large set of features and each has an associated class or state. Very little research has been conducted on this kind of data. Examples include computational toxicology, where each …


Toward A Heuristic Model For Evaluating The Complexity Of Computer Security Visualization Interface, Hsiu-Chung Wang Dec 2006

Computer Science Theses

Computer security visualization has gained much attention in the research community in the past few years. However, the advancement in security visualization research has been hampered by the lack of standardization in visualization design, centralized datasets, and evaluation methods. We propose a new heuristic model for evaluating the complexity of computer security visualizations. This complexity evaluation method is designed to evaluate the efficiency of performing visual search in security visualizations in terms of measuring critical memory capacity load needed to perform such tasks. Our method is based on research in cognitive psychology along with characteristics found in a majority of …


A Neural Network Model For Classification Of Coastal Wetlands Vegetation Structure With Moderate Resolution Imaging Spectro-Radiometer (MODIS) Data, Evaristo Joseph Liwa Jan 2006

LSU Doctoral Dissertations

Mapping coastal marshes is an important component in the management of coastal environments. Classification of marshes using remote sensing data has traditionally been performed by employing either parametric supervised classification algorithms or unsupervised classification algorithms. The implementation of these conventional classification methods is based on assumptions about the underlying probability density functions (PDFs). Neural networks provide a practical approach to this classification because they are essentially non-parametric data transformations that are not restricted by any underlying assumptions. The major objective of this study was to evaluate the ability of neural networks using Moderate Resolution Imaging Spectro-radiometer (MODIS) data to …


Special Classification Models For Lichens In The Pacific Northwest, Janeen Ardito May 2005

All Graduate Plan B and other Reports, Spring 1920 to Spring 2023

A common problem in ecological studies is that of determining where to look for rare species. This paper shows how statistical models, such as classification trees, may be used to assist in the design of probability-based surveys for rare species using information on more abundant species that are associated with the rare species. This model-assisted approach to survey design involves first building models for the more abundant species. The models are then used to determine stratifications for the rare species that are associated with the more abundant species. The goal of this approach is to increase the number of …