Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Full-Text Articles in Physical Sciences and Mathematics
Mining Data From Multiple Software Development Projects, Huanjing Wang, Taghi M. Khoshgoftaar, Kehan Gao, Naeem Seliya
Computer Science Faculty Publications
A large system often goes through multiple software project development cycles, in part due to changes in operation and development environments. For example, rapid turnover of the development team between releases can influence software quality, making it important to mine software project data over multiple system releases when building defect predictors. Data collection of software attributes is often conducted independently of the quality improvement goals, leading to the availability of a large number of attributes for analysis. Given the problems associated with variations in development process, data collection, and quality goals from one release to another emphasizes the importance of …
High-Dimensional Software Engineering Data And Feature Selection, Huanjing Wang, Taghi M. Khoshgoftaar, Kehan Gao
Computer Science Faculty Publications
Software metrics collected during project development play a critical role in software quality assurance. A software practitioner is very keen on learning which software metrics to focus on for software quality prediction. While a concise set of software metrics is often desired, a typical project collects a very large number of metrics. Minimal attention has been devoted to finding the minimum set of software metrics that have the same predictive capability as a larger set of metrics – we strive to answer that question in this paper. We present a comprehensive comparison between seven commonly-used filter-based feature ranking techniques (FRT) …
Robust Lifetime Measurement In Large-Scale P2p Systems With Non-Stationary Arrivals, Xiaoming Wang, Zhongmei Yao, Yueping Zhang, Dmitri Loguinov
Computer Science Faculty Publications
Characterizing user churn has become an important topic in studying P2P networks, both in theoretical analysis and system design. Recent work has shown that direct sampling of user lifetimes may lead to certain bias (arising from missed peers and round-off inconsistencies) and proposed a technique that estimates lifetimes based on sampled residuals. In this paper, however, we show that under non-stationary arrivals, which are often present in real systems, residual-based sampling does not correctly reconstruct user lifetimes and suffers a varying degree of bias, which in some cases makes estimation completely impossible. We overcome this problem using two contributions: a …
An Empirical Investigation Of Filter Attribute Selection Techniques For Software Quality Classification, Kehan Gao, Taghi M. Khoshgoftaar, Huanjing Wang
Computer Science Faculty Publications
Attribute selection is an important activity in data preprocessing for software quality modeling and other data mining problems. Software quality models have been used to improve the fault detection process. Finding faulty components in a software system during the early stages of the software development process can lead to a more reliable final product and can reduce development and maintenance costs. Some studies have shown that the prediction accuracy of the models improves when irrelevant and redundant features are removed from the original data set. In this study, we investigated four filter attribute selection techniques, Automatic Hybrid Search (AHS), …
Residual-Based Estimation Of Peer And Link Lifetimes In P2p Networks, Xiaoming Wang, Zhongmei Yao, Dmitri Loguinov
Computer Science Faculty Publications
Existing methods of measuring lifetimes in P2P systems usually rely on the so-called Create-Based Method (CBM), which divides a given observation window into two halves and samples users "created" in the first half every Δ time units until they die or the observation period ends. Despite its frequent use, this approach has no rigorous accuracy or overhead analysis in the literature. To shed more light on its performance, we first derive a model for CBM and show that small window size or large Δ may lead to highly inaccurate lifetime distributions. We then show that create-based sampling exhibits an inherent …
Node Isolation Model And Age-Based Neighbor Selection In Unstructured P2p Networks, Zhongmei Yao, Derek Leonard, Dmitri Loguinov
Computer Science Faculty Publications
Previous analytical studies of unstructured P2P resilience have assumed exponential user lifetimes and only considered age-independent neighbor replacement. In this paper, we overcome these limitations by introducing a general node-isolation model for heavy-tailed user lifetimes and arbitrary neighbor-selection algorithms. Using this model, we analyze two age-biased neighbor-selection strategies and show that they significantly improve the residual lifetimes of chosen users, which dramatically reduces the probability of user isolation and graph partitioning compared with uniform selection of neighbors. In fact, the second strategy based on random walks on age-proportional graphs demonstrates that, for lifetimes with infinite variance, the system monotonically increases …
Correlation Of Music Charts And Search Engine Rankings, Martin Klein, Olena Hunsicker, Michael Nelson
Computer Science Faculty Publications
We investigate whether expert rankings of real-world entities correlate with search engine (SE) rankings of corresponding web resources. We compare Billboard's "Hot 100 Airplay" music charts with SE rankings of associated web resources. Out of nine comparisons we found two strong, two moderate, two weak and one negative correlation. The remaining two comparisons were inconclusive.
Object Reuse And Exchange, Michael L. Nelson, Carl Lagoze, Herbert Van De Sompel, Pete Johnston, Robert Sanderson, Simeon Warner, Jürgen Sieck (Ed.), Michael A. Herzog (Ed.)
Computer Science Faculty Publications
The Open Archives Object Reuse and Exchange (OAI-ORE) project defines standards for the description and exchange of aggregations of Web resources. The OAI-ORE abstract data model is conformant with the Architecture of the World Wide Web and leverages concepts from the Semantic Web, including RDF descriptions and Linked Data. In this paper we provide a brief review of a motivating example and its serialization in Atom.
Exploring Out-Of-Turn Interactions With Websites, Saverio Perugini, Naren Ramakrishnan, Manuel A. Pérez-Quiñones, Mary E. Pinney, Mary Beth Rosson
Computer Science Faculty Publications
Hierarchies are ubiquitous on the web for structuring online catalogs and indexing multidimensional attributed data sets. They are a natural metaphor for information seeking if their levelwise structure mirrors the user's conception of the underlying domain. In other cases, they can be frustrating, especially if multiple drill‐downs are necessary to arrive at information of interest. To support a broad range of users, site designers often expose multiple faceted classifications or provide within‐page pruning mechanisms. We present a new technique, called out-of-turn interaction, that increases the richness of user interaction at hierarchical sites, without enumerating all possible completion paths in the …
User Interface Design, Moritz Stefaner, Sebastien Ferre, Saverio Perugini, Jonathan Koren, Yi Zhang
Computer Science Faculty Publications
As detailed in Chap. 1, system implementations for dynamic taxonomies and faceted search allow a wide range of query possibilities on the data. Only when these are made accessible by appropriate user interfaces can the resulting applications support a variety of search, browsing and analysis tasks. User interface design in this area is confronted with specific challenges. This chapter presents an overview of both established and novel principles and solutions.