Physical Sciences and Mathematics | Open Access Articles

A Pseudo Nearest-Neighbor Approach For Missing Data Recovery On Gaussian Random Data Sets, Xiaolu Huang, Qiuming Zhu Nov 2002

Computer Science Faculty Publications

Missing data handling is an important preparation step for most data discrimination or mining tasks. Inappropriate treatment of missing data may cause large errors or false results. In this paper, we study the effect of a missing data recovery method, namely the pseudo-nearest neighbor substitution approach, on Gaussian distributed data sets that represent typical cases in data discrimination and data mining applications. The error rate of the proposed recovery method is evaluated by comparing the clustering results of the recovered data sets to the clustering results obtained on the originally complete data sets. The results are also compared with …

Understanding And Measuring Corporate IS Sophistication: An Exploratory Investigation Using Ground Theory, Deepak Khazanchi Oct 2002

Information Systems and Quantitative Analysis Faculty Publications

This paper describes the results of an exploratory study that culminated in the development of a set of theoretical dimensions for “Corporate IS Sophistication”. These indicators were developed utilizing grounded theory to analyze archival corporate data and data from in-depth interviews with functional IT executives in two Norwegian firms and one North American firm.

An Iterative Initial-Points Refinement Algorithm For Categorical Data Clustering, Ying Sun, Qiuming Zhu, Zhengxin Chen May 2002

Computer Science Faculty Publications

The original k-means clustering algorithm is designed to work primarily on numeric data sets. This prohibits the algorithm from being directly applied to categorical data clustering in many data mining applications. The k-modes algorithm [Z. Huang, Clustering large data sets with mixed numeric and categorical value, in: Proceedings of the First Pacific Asia Knowledge Discovery and Data Mining Conference. World Scientific, Singapore, 1997, pp. 21–34] extended the k-means paradigm to cluster categorical data by using a frequency-based method to update the cluster modes versus the k-means fashion of minimizing a numerically valued cost. However, as is …

Ventures Into Capturing Effort In Programming, Barbara Bernal-Thomas, Briana B. Morrison Jan 2002

Computer Science Faculty Proceedings & Presentations

The quest for teaching a method of data collection in programming experiences was marked with successes and failures. We believe that software development curricula must provide students with knowledge and experience related to the practice of data collection, which will measure the effort put into a software project. By recording their past effort in software projects, students can more accurately estimate the amount of effort and time required to complete a future software project. Students can also learn the amount of effort required to develop “correct” software and begin to estimate the amount of time required, per software phase, to …

An Empirical Analysis Of Electronic Data Interchange (EDI) Implementation Benefits In Kentucky Small- And Medium-Sized Enterprises: Some Implications For New IT Implementation, Deepak Khazanchi Jan 2002

Information Systems and Quantitative Analysis Faculty Publications

This paper reports that the benefits accrued from implementing and integrating Electronic Data Interchange (EDI) within small and medium-sized enterprises (SMEs) can be conceptualized into two factors. First, firms derive operational/tactical benefits by predominantly focusing on increasing internal utility of this technology. Second, firms derive strategic benefits from EDI in the form of better external relationships and alliances with trading partners and an enhanced ability to compete in their market. Among other significant findings, there are clear indications from the correlation statistics reported here that experience with EDI, industrial category of a firm and the level of EDI integration …
