Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Articles 1 - 15 of 15

Full-Text Articles in Computer Sciences

Blocking Reduction Strategies In Hierarchical Text Classification, Ee Peng Lim, Aixin Sun, Wee-Keong Ng, Jaideep Srivastava Oct 2004

Research Collection School Of Computing and Information Systems

One common approach to hierarchical text classification associates classifiers with nodes in the category tree and classifies text documents in a top-down manner. Classification methods using this top-down approach can scale well and cope with changes to the category trees. However, all these methods suffer from blocking: documents wrongly rejected by classifiers at higher levels cannot be passed to the classifiers at lower levels. We propose a classifier-centric performance measure known as blocking factor to determine the extent of the blocking. Three methods are proposed to address the blocking problem, namely, threshold reduction, restricted voting, and …
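A minimal sketch of how blocking arises in top-down classification, and of how lowering a higher-level threshold can relieve it. The `Node` class, the `keyword_score` helper, the keywords, the thresholds, and the sample document are all hypothetical; the excerpt does not describe the paper's classifiers or its exact threshold reduction procedure.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


def keyword_score(keywords: Set[str]) -> Callable[[str], float]:
    """Toy stand-in for a trained classifier: fraction of keywords present in the document."""
    def score(doc: str) -> float:
        words = set(doc.lower().split())
        return len(words & keywords) / len(keywords)
    return score


@dataclass
class Node:
    name: str
    score: Callable[[str], float]  # hypothetical classifier: document -> confidence in [0, 1]
    threshold: float = 0.5
    children: List["Node"] = field(default_factory=list)


def classify_top_down(node: Node, doc: str) -> List[str]:
    """Return the categories that accept doc, descending only through accepting nodes."""
    if node.score(doc) < node.threshold:
        # Rejected here, so the classifiers in this subtree are never consulted.
        # If the document really belongs below this node, it has been blocked.
        return []
    accepted = [node.name]
    for child in node.children:
        accepted.extend(classify_top_down(child, doc))
    return accepted


def build_tree(parent_threshold: float) -> Node:
    """Two-level toy hierarchy: 'sports' with a single child 'tennis'."""
    tennis = Node("tennis", keyword_score({"tennis", "racket"}))
    return Node(
        "sports",
        keyword_score({"game", "team", "score", "season"}),
        threshold=parent_threshold,
        children=[tennis],
    )


DOC = "a new tennis racket for the season"

blocked = classify_top_down(build_tree(0.5), DOC)    # parent score 0.25 < 0.5
unblocked = classify_top_down(build_tree(0.2), DOC)  # parent score 0.25 >= 0.2
```

With the stricter parent threshold the document never reaches the `tennis` classifier, even though that classifier would accept it. Lowering the parent threshold lets it through, at the cost of admitting more documents that the parent should have rejected.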


Enhancements To Crisp Possibilistic Reconstructability Analysis, Anas Al-Rabadi, Martin Zwick Aug 2004

Systems Science Faculty Publications and Presentations

Modified Reconstructability Analysis (MRA), a novel decomposition within the framework of set-theoretic (crisp possibilistic) Reconstructability Analysis, is presented. It is shown that some 3-variable NPN-classified Boolean functions that are not decomposable using Conventional Reconstructability Analysis (CRA) are decomposable using MRA. It is also shown that whenever a decomposition of 3-variable NPN-classified Boolean functions exists in both MRA and CRA, MRA yields decompositions of simpler or equal complexity. A comparison of the corresponding complexities for Ashenhurst-Curtis (AC) decompositions and MRA is also presented. While both AC and MRA decompose some but …


A Support-Ordered Trie For Fast Frequent Itemset Discovery, Ee Peng Lim, Yew-Kwong Woon, Wee-Keong Ng Jul 2004

Research Collection School Of Computing and Information Systems

The importance of data mining is apparent with the advent of powerful data collection and storage tools; raw data is so abundant that manual analysis is no longer possible. Unfortunately, data mining problems are difficult to solve, and this has prompted the introduction of several novel data structures to improve mining efficiency. Here, we critically examine existing preprocessing data structures used in association rule mining for enhancing performance in an attempt to understand their strengths and weaknesses. Our analyses culminate in a practical structure called the SOTrieIT (support-ordered trie itemset) and two synergistic algorithms to accompany it for the fast discovery …


New Techniques For Improving Biological Data Quality Through Information Integration, Katherine Grace Herbert May 2004

Dissertations

As databases become more pervasive through the biological sciences, various data quality concerns are emerging. Biological databases tend to develop data quality issues regarding data legacy, data uniformity and data duplication. Due to the nature of this data, each of these problems is non-trivial and can cause many problems for the database. For biological data to be corrected and standardized, methods and frameworks must be developed to handle both structural and traditional data.

The BIG-AJAX framework has been developed for solving these problems through both data cleaning and data integration. This framework exploits declarative data cleaning and exploratory data mining …


Customer Relationship Management For Banking System, Pingyu Hou Jan 2004

Theses Digitization Project

The purpose of this project is to design, build, and implement a Customer Relationship Management (CRM) system for a bank. CRM BANKING is an online application that caters to strengthening and stabilizing customer relationships in a bank.


High Performance Data Mining Techniques For Intrusion Detection, Muazzam Ahmed Siddiqui Jan 2004

Electronic Theses and Dissertations

The rapid growth of computers transformed the way in which information and data were stored. With this new paradigm of data access comes the threat of this information being exposed to unauthorized and unintended users. Many systems have been developed that scrutinize the data for a deviation from the normal behavior of a user or system, or search for a known signature within the data. These systems are termed Intrusion Detection Systems (IDS). They employ different techniques, varying from statistical methods to machine learning algorithms. Intrusion detection systems use audit data generated by operating systems, application software or …


Reconstructability Analysis With Fourier Transforms, Martin Zwick Jan 2004

Systems Science Faculty Publications and Presentations

Fourier methods used in two‐ and three‐dimensional image reconstruction can also be used in reconstructability analysis (RA). These methods maximize a variance‐type measure instead of information‐theoretic uncertainty, but the two measures are roughly collinear and the Fourier approach yields results close to those of standard RA. The Fourier method, however, does not require iterative calculations for models with loops. Moreover, the error in Fourier RA models can be assessed without actually generating the full probability distributions of the models; calculations scale with the size of the data rather than the state space. State‐based modeling using the Fourier approach is also …


A Software Architecture For Reconstructability Analysis, Kenneth Willett, Martin Zwick Jan 2004

Systems Science Faculty Publications and Presentations

Software packages for reconstructability analysis (RA), as well as for related log linear modeling, generally provide a fixed set of functions. Such packages are suitable for end‐users applying RA in various domains, but do not provide a platform for research into the RA methods themselves. A new software system, Occam3, is being developed to address three goals that often conflict with one another: a general and flexible infrastructure for experimentation with RA methods and algorithms; an easily‐configured system allowing methods to be combined in novel ways, without requiring deep software expertise; and a system which …


An Overview Of Reconstructability Analysis, Martin Zwick Jan 2004

Systems Science Faculty Publications and Presentations

This paper is an overview of reconstructability analysis (RA), a discrete multivariate modeling methodology developed in the systems literature; an earlier version of this tutorial is Zwick (2001). RA was derived from Ashby (1964), and was developed by Broekstra, Cavallo, Cellier, Conant, Jones, Klir, Krippendorff, and others (Klir, 1986, 1996). RA resembles and partially overlaps log‐linear (LL) statistical methods used in the social sciences (Bishop et al., 1978; Knoke and Burke, 1980). RA also resembles and overlaps methods used in logic design and machine learning (LDL) in electrical and computer engineering (e.g. Perkowski et al., 1997). Applications of RA, like …


A Comparison Of Modified Reconstructability Analysis And Ashenhurst‐Curtis Decomposition Of Boolean Functions, Anas Al-Rabadi, Marek Perkowski, Martin Zwick Jan 2004

Systems Science Faculty Publications and Presentations

Modified reconstructability analysis (MRA), a novel decomposition technique within the framework of set‐theoretic (crisp possibilistic) reconstructability analysis, is applied to three‐variable NPN‐classified Boolean functions. MRA is superior to conventional reconstructability analysis, i.e. it decomposes more NPN functions. MRA is compared to Ashenhurst‐Curtis (AC) decomposition using two different complexity measures: log‐functionality, a measure suitable for machine learning, and the count of the total number of two‐input gates, a measure suitable for circuit design. MRA is superior to AC using the first of these measures, and is comparable to, but different from, AC using the second.
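For readers unfamiliar with the term, NPN classification groups Boolean functions that coincide up to negation of inputs, permutation of inputs, and negation of the output. The sketch below enumerates these classes for three variables by brute force. It only illustrates the classification, not MRA or AC decomposition. The truth-table encoding (bit `x` of the integer `f` holds the output for input assignment `x`) and all function names are choices made here, not taken from the paper.

```python
from itertools import permutations

N = 3  # number of input variables


def transform(f: int, perm: tuple, mask: int, negate_out: bool) -> int:
    """Apply an input permutation, an input negation mask, and optional output negation."""
    g = 0
    for x in range(1 << N):
        # Route input bit k of x to position perm[k].
        y = 0
        for k in range(N):
            if (x >> k) & 1:
                y |= 1 << perm[k]
        y ^= mask  # negate the selected inputs
        bit = (f >> y) & 1
        if negate_out:
            bit ^= 1
        g |= bit << x
    return g


def npn_canonical(f: int) -> int:
    """Smallest truth table in the NPN orbit of f, used as the class representative."""
    return min(
        transform(f, perm, mask, negate_out)
        for perm in permutations(range(N))
        for mask in range(1 << N)
        for negate_out in (False, True)
    )


# All 256 three-variable functions, reduced to their class representatives.
npn_classes = {npn_canonical(f) for f in range(1 << (1 << N))}

AND3 = 0b10000000  # true only when all three inputs are 1
OR3 = 0b11111110   # false only when all three inputs are 0
```

Because the transformations form a group, taking the minimum over the orbit gives the same representative for every member of a class. Three-input AND and OR land in the same class, as De Morgan's law suggests, and the 256 functions collapse into 14 classes.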


Modified Reconstructability Analysis For Many-Valued Functions And Relations, Anas Al-Rabadi, Martin Zwick Jan 2004

Systems Science Faculty Publications and Presentations

A novel many-valued decomposition within the framework of lossless Reconstructability Analysis is presented. In previous work, Modified Reconstructability Analysis (MRA) was applied to Boolean functions, where it was shown that most Boolean functions not decomposable using conventional Reconstructability Analysis (CRA) are decomposable using MRA. Also, it was previously shown that whenever decomposition exists in both MRA and CRA, MRA yields simpler or equal complexity decompositions. In this paper, MRA is extended to many-valued logic functions, and logic structures that correspond to such decomposition are developed. It is shown that many-valued MRA can decompose many-valued functions when CRA fails to do …


Directed Extended Dependency Analysis For Data Mining, Thaddeus T. Shannon, Martin Zwick Jan 2004

Systems Science Faculty Publications and Presentations

Extended dependency analysis (EDA) is a heuristic search technique for finding significant relationships between nominal variables in large data sets. The directed version of EDA searches for maximally predictive sets of independent variables with respect to a target dependent variable. The original implementation of EDA was an extension of reconstructability analysis. Our new implementation adds a variety of statistical significance tests at each decision point that allow the user to tailor the algorithm to a particular objective. It also utilizes data structures appropriate for the sparse data sets customary in contemporary data mining problems. Two examples that illustrate different approaches …


State-Based Reconstructability Analysis, Martin Zwick, Michael S. Johnson Jan 2004

Systems Science Faculty Publications and Presentations

Reconstructability analysis (RA) is a method for detecting and analyzing the structure of multivariate categorical data. While Jones and his colleagues extended the original variable‐based formulation of RA to encompass models defined in terms of system states, their focus was the analysis and approximation of real‐valued functions. In this paper, we separate two ideas that Jones had merged together: the “g to k” transformation and state‐based modeling. We relate the idea of state‐based modeling to established variable‐based RA concepts and methods, including structure lattices, search strategies, metrics of model quality, and the statistical evaluation of model fit for analyses based …


Reconstructability Analysis Detection Of Optimal Gene Order In Genetic Algorithms, Martin Zwick, Stephen Shervais Jan 2004

Systems Science Faculty Publications and Presentations

The building block hypothesis implies that genetic algorithm efficiency will be improved if sets of genes that improve fitness through epistatic interaction are near to one another on the chromosome. We demonstrate this effect with a simple problem, and show that information-theoretic reconstructability analysis can be used to decide on optimal gene ordering.
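One way to see why adjacency matters is to assume one-point crossover with a uniformly chosen cut point. This is an assumption, since the abstract does not name the crossover operator used. Under it, two interacting genes are split apart exactly when the cut falls between them, so the chance of disruption grows linearly with their distance on the chromosome. The function name and the zero-based position convention below are illustrative only.

```python
from fractions import Fraction


def disruption_probability(i: int, j: int, length: int) -> Fraction:
    """Probability that one-point crossover separates the genes at positions i and j.

    Assumes a chromosome of `length` genes (positions 0 .. length - 1) and a cut
    point chosen uniformly among the length - 1 gaps between neighbouring genes.
    """
    if length < 2:
        raise ValueError("crossover needs at least two genes")
    if not (0 <= i < length and 0 <= j < length):
        raise ValueError("gene positions must lie inside the chromosome")
    # The genes are separated when the cut lands in one of the |i - j| gaps between them.
    return Fraction(abs(i - j), length - 1)


adjacent = disruption_probability(0, 1, 10)  # neighbouring genes
distant = disruption_probability(0, 9, 10)   # genes at opposite ends
```

Under this assumption, neighbouring genes on a ten-gene chromosome are separated with probability 1/9, while genes at opposite ends are separated by every crossover. That is the intuition behind placing epistatically interacting genes close together.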


Reversible Modified Reconstructability Analysis Of Boolean Circuits And Its Quantum Computation, Anas Al-Rabadi, Martin Zwick Jan 2004

Systems Science Faculty Publications and Presentations

Modified Reconstructability Analysis (MRA) can be realized reversibly by utilizing Boolean reversible (3,3) logic gates that are universal in two arguments. The quantum computation of the reversible MRA circuits is also introduced. The reversible MRA transformations are given a quantum form by using the normal matrix representation of such gates. The MRA-based quantum decomposition may play an important role in the synthesis of logic structures using future technologies that consume less power and occupy less space.