Open Access. Powered by Scholars. Published by Universities.®

Science and Technology Studies Commons

Articles 1 - 8 of 8

Full-Text Articles in Science and Technology Studies

Knowledge Discovery In Biological Datasets Using A Hybrid Bayes Classifier/Evolutionary Algorithm, Michael L. Raymer, Leslie A. Kuhn, William F. Punch Nov 2001

Kno.e.sis Publications

A key element of bioinformatics research is the extraction of meaningful information from large experimental data sets. Various approaches, including statistical and graph theoretical methods, data mining, and computational pattern recognition, have been applied to this task with varying degrees of success. We have previously shown that a genetic algorithm coupled with a k-nearest-neighbors classifier performs well in extracting information about protein-water binding from X-ray crystallographic protein structure data. Using a novel classifier based on the Bayes discriminant function, we present a hybrid algorithm that employs feature selection and extraction to isolate salient features from large biological data sets. The …


Profile Combinatorics For Fragment Selection In Comparative Protein Structure Modeling, Deacon Sweeney, Travis E. Doom, Michael L. Raymer Nov 2001

Kno.e.sis Publications

Sequencing of the human genome was a great stride towards modeling cellular complexes, massive systems whose key players are proteins and DNA. A major bottleneck limiting the modeling process is structure and function annotation for the new genes. Contemporary protein structure prediction algorithms represent the sequence of every protein of known structure with a profile to which the profile of a protein sequence of unknown structure is compared for recognition. We propose a novel approach to increase the scope and resolution of protein structure profiles. Our technique locates equivalent regions among the members of a structurally similar fold family, and …


Online Bayesian Tree-Structured Transformation Of HMMs With Optimal Model Selection For Speaker Adaptation, Shaojun Wang, Yunxin Zhao Sep 2001

Kno.e.sis Publications

This paper presents a new recursive Bayesian learning approach for transformation parameter estimation in speaker adaptation. Our goal is to incrementally transform or adapt a set of hidden Markov model (HMM) parameters for a new speaker and gain a large performance improvement from a small amount of adaptation data. By constructing a clustering tree of HMM Gaussian mixture components, the linear regression (LR) or affine transformation parameters for those components are searched dynamically. An online Bayesian learning technique is proposed for recursive maximum a posteriori (MAP) estimation of LR and affine transformation parameters. This technique has the advantages of …
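The idea of recursive MAP estimation mentioned in this abstract can be illustrated with a deliberately simplified sketch. It is not the authors' tree-structured method: it assumes a single scalar offset with a Gaussian prior and Gaussian observation noise of known variance, so that the posterior after each observation becomes the prior for the next. The names `map_update` and `adapt_online` and all default values are hypothetical.

```python
def map_update(prior_mean, prior_var, observation, noise_var):
    """One conjugate Gaussian update (toy assumption, not the paper's model).

    For a Gaussian posterior the MAP estimate equals the posterior mean.
    """
    gain = prior_var / (prior_var + noise_var)
    post_mean = prior_mean + gain * (observation - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


def adapt_online(observations, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
    """Process observations one at a time, reusing each posterior as the next prior."""
    mean, var = prior_mean, prior_var
    for obs in observations:
        mean, var = map_update(mean, var, obs, noise_var)
    return mean, var


if __name__ == "__main__":
    # Two observations of an offset, processed incrementally.
    print(adapt_online([2.0, 2.0]))
```

Under these assumptions the incremental result matches the estimate obtained by processing all observations in one batch, which is what makes the recursion attractive when adaptation data arrive a little at a time.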


Summarizing Data Sets For Classification, Christopher W. Kinzig, Krishnaprasad Thirunarayan, Gary B. Lamont, Robert E. Marmelstein Jun 2001

Kno.e.sis Publications

This paper describes our approach to, and experience with, implementing a data mining system using genetic algorithms in C++. In contrast with earlier classification algorithms that tended to “tile” the data sets using some pre-specified “shapes”, the proposed system is based on Marmelstein’s work on determining natural boundaries for class-homogeneous regions. These boundaries are further refined to construct a compact set of simple data mining rules for classification.


Query Processing With An FPGA Coprocessor Board, Jack S. Jean, Guozhu Dong, Hwa Zhang, Xinzhong Guo, Baifeng Zhang Jun 2001

Kno.e.sis Publications

In this paper, a commercial FPGA coprocessor board is used to accelerate the processing of queries on a relational database that contains text and images. FPGA designs for text searching and image matching are described and their performance is summarized. A potential design for a database JOIN operator is then studied, and a query optimization preprocessor is proposed.


Making Use Of The Most Expressive Jumping Emerging Patterns For Classification, Jinyan Li, Guozhu Dong, Kotagiri Ramamohanarao May 2001

Kno.e.sis Publications

Classification aims to discover a model from training data that can be used to predict the class of test instances. In this paper, we propose the use of jumping emerging patterns (JEPs) as the basis for a new classifier called the JEP-Classifier. Each JEP can capture some crucial difference between a pair of datasets. Aggregating all JEPs with large support can then yield stronger classification power. Procedurally, the JEP-Classifier learns the pair-wise features (sets of JEPs) contained in the training data, and uses the collective impacts contributed by the most expressive pair-wise features to determine the class labels …
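As a rough illustration of the idea in this abstract, the sketch below treats a JEP as an itemset that occurs in one class's data and never in the other's, and classifies an instance by summing the supports of the matching patterns per class. This is a brute-force toy, not the authors' algorithm. Two points are assumptions made here: that "most expressive" can be approximated by keeping only minimal patterns, and that the score is a plain sum of supports. All function names and the sample data are hypothetical.

```python
from itertools import combinations


def support(pattern, dataset):
    """Fraction of transactions in `dataset` that contain every item of `pattern`."""
    if not dataset:
        return 0.0
    return sum(pattern <= t for t in dataset) / len(dataset)


def jumping_emerging_patterns(home, other, max_size=2):
    """Brute-force search: itemsets present in `home` but absent from `other`."""
    items = sorted(set().union(*home))
    jeps = []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            pattern = frozenset(combo)
            if support(pattern, home) > 0 and support(pattern, other) == 0:
                jeps.append(pattern)
    return jeps


def minimal(patterns):
    """Keep patterns that have no proper subset in the list (assumed 'most expressive')."""
    return [p for p in patterns if not any(q < p for q in patterns)]


def classify(instance, jeps_by_class, data_by_class):
    """Pick the class whose matching JEPs have the largest total support."""
    scores = {
        label: sum(support(p, data_by_class[label]) for p in jeps if p <= instance)
        for label, jeps in jeps_by_class.items()
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    data = {
        "pos": [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}],
        "neg": [{"b", "c"}, {"c", "d"}, {"b", "c", "d"}],
    }
    jeps = {
        "pos": minimal(jumping_emerging_patterns(data["pos"], data["neg"])),
        "neg": minimal(jumping_emerging_patterns(data["neg"], data["pos"])),
    }
    print(jeps)
    print(classify({"a", "d"}, jeps, data))
```

The exhaustive enumeration is exponential in the number of items, so it only serves to make the definition concrete on small data.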


Survivability Architecture For Workflow Management Systems, Jorge Cardoso, Zongwei Luo, John A. Miller, Amit P. Sheth, Krzysztof J. Kochut Mar 2001

Kno.e.sis Publications

The survivability of critical infrastructure systems has become a growing concern for industry. The survivability research area addresses how infrastructure systems can continue to provide pre-established service levels to users in the face of disorders and react to changes in the surrounding environment. Workflow management systems need to be survivable since they are used to support critical and sensitive business processes. They require a high level of dependability and should not allow process instances to be interrupted or aborted due to failures. Moreover, due to their sensitivity, business processes should reflect any change in the environment. In …


Latent Maximum Entropy Principle For Statistical Language Modeling, Shaojun Wang, Ronald Rosenfeld, Yunxin Zhao Jan 2001

Kno.e.sis Publications

We describe a unified probabilistic framework for statistical language modeling, the latent maximum entropy principle. The salient feature of this approach is that the hidden causal hierarchical dependency structure can be encoded into the statistical model in a principled way by mixtures of exponential families with rich expressive power. We first present the problem formulation, its solution, and certain convergence properties. We then describe how to use this machine learning technique to model various aspects of natural language, such as the syntactic structure of sentences and the semantic information in a document. Finally, we draw conclusions and point out future research directions.