Scale: A Scalable Framework For Efficiently Clustering Transactional Data, 2010 Wright State University - Main Campus
Scale: A Scalable Framework For Efficiently Clustering Transactional Data, Hua Yan, Keke Chen, Ling Liu, Zhang Yi
Kno.e.sis Publications
This paper presents SCALE, a fully automated transactional clustering framework. The SCALE design highlights three unique features. First, we introduce the concept of Weighted Coverage Density as a categorical similarity measure for efficient clustering of transactional datasets. The concept of weighted coverage density is intuitive and it allows the weight of each item in a cluster to be changed dynamically according to the occurrences of items. Second, we develop the weighted coverage density measure based clustering algorithm, a fast, memory-efficient, and scalable clustering algorithm for analyzing transactional data. Third, we introduce two clustering validation metrics and show that these domain …
Improving Parallel I/O Performance Using Interval I/O, 2010 The University of Maine
Improving Parallel I/O Performance Using Interval I/O, Jeremy Logan
Electronic Theses and Dissertations
Today's most advanced scientific applications run on large clusters consisting of hundreds of thousands of processing cores, access state-of-the-art parallel file systems that allow files to be distributed across hundreds of storage targets, and utilize advanced interconnection systems that allow for theoretical I/O bandwidth of hundreds of gigabytes per second. Despite these advanced technologies, these applications often fail to obtain a reasonable proportion of available I/O bandwidth. The reasons for the poor performance of application I/O include the noncontiguous I/O access patterns used for scientific computing, contention due to false sharing, and the somewhat finicky nature of …
Performance Tuning Of Streaming Applications Via Search-Space Decomposition, 2010 Washington University in St Louis
Performance Tuning Of Streaming Applications Via Search-Space Decomposition, Shobana Padmanabhan, Roger D. Chamberlain, Yixin Chen
All Computer Science and Engineering Research
High-performance streaming applications are typically pipelined and deployed on architecturally diverse (hybrid) systems. Developers of such applications are interested in customizing components used, so as to benefit application performance. We present an efficient and automatic technique for design-space exploration of applications in this problem domain. We solve performance tuning as an optimization problem by formulating cost functions using results from queueing theory. This results in a mixed-integer nonlinear optimization problem which is NP-hard. We reduce the search complexity by decomposing the search space. We have developed a domain-specific decomposition technique using topological information of the application embodied in the queueing network …
Multi-Channel Reliability And Spectrum Usage In Real Homes: Empirical Studies For Home-Area Sensor Networks, 2010 Washington University in St Louis
Multi-Channel Reliability And Spectrum Usage In Real Homes: Empirical Studies For Home-Area Sensor Networks, Mo Sha, Gregory Hackmann, Chenyang Lu
All Computer Science and Engineering Research
Home area networks (HANs) consisting of wireless sensors have emerged as the enabling technology for important applications such as smart energy and assisted living. A key challenge faced by HANs is maintaining reliable operation in real-world residential environments. This paper presents two in-depth empirical studies on the wireless channels in real homes. The spectrum study analyzes the spectrum usage in the 2.4 GHz band where wireless sensor networks based on the IEEE 802.15.4 standard must coexist with existing wireless devices. We characterize the ambient wireless environment in six apartments through passive spectrum analysis across the entire 2.4 GHz band over …
Priority Assignment For Real-Time Flows In WirelessHART Sensor-Actuator Networks, 2010 Washington University in St Louis
Priority Assignment For Real-Time Flows In WirelessHART Sensor-Actuator Networks, Abusayeed Saifullah, You Chenyang, Yixin Chen
All Computer Science and Engineering Research
Recent years have witnessed the adoption of wireless sensor-actuator networks as a communication infrastructure for process control applications. An important enabling technology for industrial process control is WirelessHART, an open wireless sensor-actuator network standard specifically developed for process industries. A key challenge faced by WirelessHART networks is to meet the stringent real-time communication requirements imposed by feedback control systems in process industries. Fixed priority scheduling, a popular scheduling policy in real-time networks, has recently been shown to be an effective real-time transmission scheduling policy in WirelessHART networks. Priority assignment has a major impact on the schedulability of real-time flows in these …
An Inexpensive Robot Platform For Teleoperation And Experimentation, 2010 Washington University in St Louis
An Inexpensive Robot Platform For Teleoperation And Experimentation, Daniel A. Lazewatsky, William D. Smart
All Computer Science and Engineering Research
Most commercially-available robots are either aimed at the research community, or are designed with a single purpose in mind. The extensive hobbyist community has tended to focus on the hardware and the low-level software aspects. We claim that there is a need for a low-cost, general-purpose robot, accessible to the hobbyist community, with sufficient computation and sensing to run "research-grade" software. In this paper, we describe the design and implementation of such a robot. We explicitly outline our design goals, and show how a capable robot can be assembled from off-the-shelf parts, for a modest cost, by a single person …
E-Carrel: An Environment For Collaborative Textual Scholarship, 2010 Loyola University Chicago
E-Carrel: An Environment For Collaborative Textual Scholarship, George K. Thiruvathukal, Steven E. Jones, Peter Shillingsburg
Computer Science: Faculty Publications and Other Works
The E-Carrel project aims to address the preservation of, access to, and re-uses of humanities electronic text files. It enables dynamic, growing resource projects as repositories for new knowledge. It provides for on-line distributed data and tools that are open to new scholarly enhancement through a user-friendly tagging tool, sophisticated use of stand-off markup and annotation (leveraging RDF capabilities), and a browsing system anyone can use. It creates a secure system of text preparation and dissemination that encourages collaboration and participation by anyone interested in the texts. To ensure the endurance of authenticated texts, multiple copies are distributed on …
An Adaptable Group Communication System, 2010 Louisiana State University and Agricultural and Mechanical College
An Adaptable Group Communication System, Vikram Reddy Kayathi
LSU Master's Theses
Existing group communication systems such as ISIS, Spread, and JGroups provide group communication in a synchronous environment. They are built on top of TCP/IP or UDP and guarantee virtual synchrony and consistency. However, wide area distributed systems are inherently asynchronous. Existing group communication systems are not suitable for wide area deployment. They do not provide persistent communication; i.e., if a node gets temporarily disconnected, all messages directed to that node during that period are lost. Hence such systems are not suitable for deployment in disadvantaged networks. While, according to Brewer’s CAP theorem, it is impossible for a distributed computer system to …
A Curriculum Unit On Programming And Robotics, 2010 Tufts University
A Curriculum Unit On Programming And Robotics, Marina U. Bers, Louise Flannery, Elizabeth Kazakoff, R. Jordan Crouser
Computer Science: Faculty Publications
The Tangible Kindergarten project studies how, when given age-appropriate tools, young children can actively engage in computer programming and robotics in a way that is consistent with developmentally appropriate practice. This research project explores the creation of novel human-computer interaction techniques to support learning with technology in early elementary school, with a focus on kindergarten. Since many modern graphical user interfaces are not designed with the developmental needs of such young learners in mind, they are generally ill-suited for use in early elementary school classrooms, especially for computer programming activities. To overcome this problem, this research project has created …
Analysis Of XBRL Literature: A Decade Of Progress And Puzzle, 2010 Bryant University
Analysis Of XBRL Literature: A Decade Of Progress And Puzzle, Saeed Roohani, Zhao Xianming, Ernest Capozzoli, Barbara Lamberton
Faculty and Research Publications
XBRL (eXtensible Business Reporting Language) reached its 10th year in 2008. The concept was articulated in 1998 by Charles Hoffman, then known as XFRML (eXtensible Financial Reporting Mark Up Language), to facilitate the business reporting process and improve financial reporting. The objective of this paper is to examine a decade (1998-2008) of XBRL articles published in various publications including trade, practitioner and academic journals to identify trends and patterns, milestones, and organizations that actively contributed to this development. Another goal is to assess public perceptions of XBRL, its capabilities and its future. We examined published articles where XBRL appeared either …
An Attempt To Find Neighbors, 2010 Kennesaw State University
An Attempt To Find Neighbors, Yong Shi, Ryan Rosenblum
Faculty and Research Publications
In this paper, we present our continuing research on similarity search problems. Previously we proposed PanKNN [18], a novel technique that explores the meaning of K nearest neighbors from a new perspective, redefines the distances between data points and a given query point Q, and efficiently and effectively selects data points which are closest to Q. It can be applied in various data mining fields. In this paper, we present our approach to solving the similarity search problem in the presence of obstacles. We apply the concept of obstacle points and process the similarity search problems in a different way. …
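For orientation, the plain K-nearest-neighbor baseline that such work starts from can be sketched as below. This is not PanKNN and does not handle obstacles; it is a minimal brute-force sketch that assumes Euclidean distance, and the function name `k_nearest` is hypothetical.

```python
import heapq
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def k_nearest(points: Sequence[Point], query: Point, k: int) -> List[Point]:
    """Return the k points closest to `query` under Euclidean distance.

    Brute-force baseline only: every point is compared with the query.
    Obstacles and redefined distances are deliberately not modeled here.
    """
    if k <= 0:
        return []
    # heapq.nsmallest returns results ordered from nearest to farthest.
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 2.0)]
    print(k_nearest(data, (0.0, 0.0), 2))
```

An obstacle-aware method would replace the straight-line distance in `key` with a measure that accounts for the obstacle points.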
Biomedical Relationship Extraction From Literature Based On Bio-Semantic Token Subsequences, 2010 Kennesaw State University
Biomedical Relationship Extraction From Literature Based On Bio-Semantic Token Subsequences, Ying Xie, Jayasimha R. Katukuri, Vijay V. Raghavan
Faculty and Research Publications
Relationship Extraction (RE) from biomedical literature is an important and challenging problem in both text mining and bioinformatics. Although various approaches have been proposed to extract protein-protein interaction types, their accuracy rates leave considerable room for improvement. In this paper, two supervised learning algorithms based on the newly defined "bio-semantic token subsequence" are proposed for multi-class biomedical relationship classification. The first approach calculates a "bio-semantic token subsequence kernel", whereas the second one explicitly extracts weighted features from bio-semantic token subsequences. The two proposed approaches outperform several alternatives reported in the literature on multi-class protein-protein interaction classification.
A Quadratic Lower Bound For Rocchio’s Similarity-Based Relevance Feedback Algorithm With A Fixed Query Updating Factor, 2010 The University of Texas Rio Grande Valley
A Quadratic Lower Bound For Rocchio’s Similarity-Based Relevance Feedback Algorithm With A Fixed Query Updating Factor, Zhixiang Chen, Bin Fu, John P. Abraham
Computer Science Faculty Publications and Presentations
Rocchio’s similarity-based relevance feedback algorithm, one of the most important query reformation methods in information retrieval, is essentially an adaptive supervised learning algorithm from examples. In practice, Rocchio’s algorithm often uses a fixed query updating factor. When this is the case, we strengthen the linear Ω(n) lower bound obtained by Chen and Zhu (Inf. Retr. 5:61–86, 2002) and prove that Rocchio’s algorithm makes Ω(k(n−k)) mistakes in searching for a collection of documents represented by a monotone disjunction of k relevant features over the n-dimensional binary vector space {0,1}^n, …
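To make the setting concrete, here is a simplified sketch of mistake-driven relevance feedback with a fixed updating factor over binary document vectors. It is an illustration under stated assumptions, not the exact formulation analyzed in the paper: the zero initial query, the "score > 0" relevance rule, the single factor `alpha` used for both promotions and demotions, and the names `rocchio_fixed_factor` and `is_relevant` are all hypothetical choices.

```python
from typing import Iterable, List, Sequence, Tuple


def is_relevant(doc: Sequence[int], relevant_features: Iterable[int]) -> bool:
    """Target concept: a monotone disjunction over the relevant features."""
    return any(doc[i] == 1 for i in relevant_features)


def rocchio_fixed_factor(
    docs: Sequence[Sequence[int]],
    relevant_features: Iterable[int],
    alpha: float = 1.0,
    max_passes: int = 100,
) -> Tuple[List[float], int]:
    """Run mistake-driven feedback and return (query vector, mistake count).

    Assumptions (not taken from the paper): the query starts at zero, a
    document is predicted relevant when its inner product with the query
    is positive, and each mistake adds or subtracts alpha * doc.
    """
    features = list(relevant_features)
    query = [0.0] * len(docs[0])
    mistakes = 0
    for _ in range(max_passes):
        mistakes_this_pass = 0
        for doc in docs:
            score = sum(q * d for q, d in zip(query, doc))
            predicted = score > 0
            actual = is_relevant(doc, features)
            if predicted != actual:
                # Promote on a missed relevant document, demote otherwise.
                sign = 1.0 if actual else -1.0
                query = [q + sign * alpha * d for q, d in zip(query, doc)]
                mistakes_this_pass += 1
        mistakes += mistakes_this_pass
        if mistakes_this_pass == 0:
            break  # a full pass without mistakes: the query is consistent
    return query, mistakes


if __name__ == "__main__":
    docs = [(1, 1, 1), (0, 1, 1), (0, 0, 1), (0, 1, 0), (1, 0, 0)]
    print(rocchio_fixed_factor(docs, relevant_features=[0]))
```

The mistake count returned here is the quantity that lower bounds of this kind are stated about; the order in which documents are presented affects how many mistakes are made.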
Task Analysis, Modeling, And Automatic Identification Of Elemental Tasks In Robot-Assisted Laparoscopic Surgery, 2010 Wayne State University
Task Analysis, Modeling, And Automatic Identification Of Elemental Tasks In Robot-Assisted Laparoscopic Surgery, Lavie Pinchas Golenberg
Wayne State University Dissertations
Robotic microsurgery provides many advantages for surgical operations, including tremor filtration, an increase in dexterity, and smaller incisions. There is a growing need for task analyses of robotic laparoscopic operations to better understand the tasks involved in robotic microsurgery cases. A few research groups have conducted task observations to help systems automatically identify surgeon skill based on task execution. Their gesture analyses, however, lacked depth and their class libraries were composed of ambiguous groupings of gestures that did not share contextual similarities.
A Hierarchical Task Analysis was performed on a four-throw suturing task using a robotic microsurgical platform. Three …
Localized Feature Selection For Unsupervised Learning, 2010 Wayne State University
Localized Feature Selection For Unsupervised Learning, Yuanhong Li
Wayne State University Dissertations
Clustering is the unsupervised classification of data objects into different groups (clusters) such that objects in one group are similar to one another and dissimilar from objects in other groups. Feature selection for unsupervised learning is a technique that chooses the best feature subset for clustering. In general, unsupervised feature selection algorithms conduct feature selection in a global sense by producing a common feature subset for all the clusters. This, however, can be invalid in clustering practice, where the local intrinsic property of data matters more, which implies that localized feature selection is more desirable.
In this dissertation, we focus on cluster-wise feature selection …
Scientific Workflow Integration For Services Computing, 2010 Wayne State University
Scientific Workflow Integration For Services Computing, Cui Lin
Wayne State University Dissertations
In recent years, significant scientific advances have increasingly been achieved through complex scientific processes. With the exponential growth in computing technologies and scientific data, a scientific workflow may comprise a large number of heterogeneous scientific services and applications, provided by different organizations. These services, applications, and their associated data are usually distributed across heterogeneous computing environments. The integration and management of such scientific workflows are pushing the limits of current workflow technology. This dissertation presents an integrated solution to composing, scheduling, executing and developing scientific workflows and scientific workflow management systems.
To provide a foundation for workflow composition, scheduling, execution and …
Dynamic Multivariate Simplex Splines For Volume Representation And Modeling, 2010 Wayne State University
Dynamic Multivariate Simplex Splines For Volume Representation And Modeling, Yunhao Tan
Wayne State University Dissertations
Volume representation and modeling of heterogeneous objects acquired from the real world are very challenging research tasks and play fundamental roles in many potential applications, e.g., volume reconstruction, volume simulation and volume registration. In order to accurately and efficiently represent and model real-world objects, this dissertation proposes an integrated computational framework based on dynamic multivariate simplex splines (DMSS) that can greatly improve the accuracy and efficacy of modeling and simulation of heterogeneous objects. The framework can not only reconstruct with high accuracy geometric, material, and other quantities associated with heterogeneous real-world models, but also simulate the complicated dynamics precisely by …
Filter Scheduling Function Model In Internet Server: Resource Configuration, Performance Evaluation And Optimal Scheduling, 2010 Wayne State University
Filter Scheduling Function Model In Internet Server: Resource Configuration, Performance Evaluation And Optimal Scheduling, Minghua Xu
Wayne State University Dissertations
August 2010
Advisor: Dr. Cheng-Zhong Xu
Major: Computer Engineering
Degree: Doctor of Philosophy
Internet traffic often exhibits a structure with rich high-order statistical properties like self-similarity and long-range dependency (LRD). This greatly complicates the problem of server performance modeling and optimization. On the other hand, the popularity of the Internet has created numerous client-server or peer-to-peer applications, with most of them, such as online payment, purchasing, trading, searching, publishing and media streaming, being timing sensitive and/or financially critical. The scheduling policy in Internet servers …
A Scientific Workflow System For Genomic Data Analysis, 2010 Wayne State University
A Scientific Workflow System For Genomic Data Analysis, Jamal Ali Musleh Alhiyafi
Wayne State University Dissertations
Scientific workflows have become increasingly popular as a new computing paradigm for scientists to design and execute complex and distributed scientific processes to enable and accelerate many scientific discoveries. Although several scientific workflow management systems (SWFMSs) have been developed, there is a great need for an integrated scientific workflow system that enables the design and execution of higher-level scientific workflows, which integrate heterogeneous scientific workflows enacted by existing SWFMSs. On one hand, science is becoming increasingly collaborative today, requiring an integrated solution that combines the features and capabilities of different SWFMSs, which are typically developed and optimized towards one single …
Data Clustering And Visualization Through Matrix Factorization, 2010 Wayne State University
Data Clustering And Visualization Through Matrix Factorization, Yanhua Chen
Wayne State University Dissertations
Clustering is traditionally an unsupervised task which is to find natural groupings or clusters in multidimensional data based on perceived similarities among the patterns. The purpose of clustering is to extract useful information from unlabeled data.
In order to present the useful knowledge extracted by clustering in a meaningful way, data visualization has become a popular and growing area of research. Visualization can provide a qualitative overview of large and complex data sets, which helps us gain the insight needed to truly understand the phenomena of interest in the data.
The contribution of this dissertation is two-fold: Semi-Supervised Non-negative Matrix Factorization …