Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 46

Full-Text Articles in Physical Sciences and Mathematics

Biological Sequence Simulation For Testing Complex Evolutionary Hypotheses: Indel-Seq-Gen Version 2.0, Cory L. Strope Dec 2009

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Reconstructing the evolutionary history of biological sequences will provide a better understanding of mechanisms of sequence divergence and functional evolution. Long-term sequence evolution includes not only substitutions of residues but also more dynamic changes such as insertion, deletion, and long-range rearrangements. Such dynamic changes make reconstructing sequence evolution history difficult and affect the accuracy of molecular evolutionary methods, such as multiple sequence alignments (MSAs) and phylogenetic methods. In order to test the accuracy of these methods, benchmark datasets are required. However, currently available benchmark datasets have limitations in their sizes and evolutionary histories of the included sequences are unknown. These …


Classification, Clustering And Data-Mining Of Biological Data, Thomas Triplet Nov 2009

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

The proliferation of biological databases and the easy access enabled by the Internet is having a beneficial impact on biological sciences and transforming the way research is conducted. There are currently over 1100 molecular biology databases dispersed throughout the Internet. However, very few of them integrate data from multiple sources. To assist in the functional and evolutionary analysis of the abundant number of novel proteins, we introduce the PROFESS (PROtein Function, Evolution, Structure and Sequence) database that integrates data from various biological sources. PROFESS is freely available at http://cse.unl.edu/~profess/. Our database is designed to be versatile and expandable and will not …


SmartStore: A New Metadata Organization Paradigm With Semantic-Awareness For Next-Generation File Systems, Yu Hua, Hong Jiang, Yifeng Zhu, Dan Feng, Lei Tian Nov 2009

CSE Conference and Workshop Papers

Existing storage systems using hierarchical directory tree do not meet scalability and functionality requirements for exponentially growing datasets and increasingly complex queries in Exabyte-level systems with billions of files. This paper proposes semantic-aware organization, called SmartStore, which exploits metadata semantics of files to judiciously aggregate correlated files into semantic-aware groups by using information retrieval tools. Decentralized design improves system scalability and reduces query latency for complex queries (range and top-k queries), which is conducive to constructing semantic-aware caching, and conventional filename-based query. SmartStore limits search scope of complex query to a single or a minimal number of semantically related groups …


Dynamic Load Balancing For I/O-Intensive Applications On Clusters, Xiao Qin, Hong Jiang, Adam Manzanares, Xiaojun Ruan, Shu Yin Nov 2009

School of Computing: Faculty Publications

Load balancing for clusters has been investigated extensively, mainly focusing on the effective usage of global CPU and memory resources. However, previous CPU- or memory-centric load balancing schemes suffer significant performance drop under I/O-intensive workloads due to the imbalance of I/O load. To solve this problem, we propose two simple yet effective I/O-aware load-balancing schemes for two types of clusters: (1) homogeneous clusters where nodes are identical and (2) heterogeneous clusters, which are comprised of a variety of nodes with different performance characteristics in computing power, memory capacity, and disk speed. In addition to assigning I/O-intensive sequential and parallel jobs …


Digital Logic Based Encoding Strategies For Steganography On Voice-Over-IP, Hui Tian, Ke Zhou, Hong Jiang, Dan Feng Oct 2009

CSE Conference and Workshop Papers

This paper presents three encoding strategies based on digital logic for steganography on Voice over IP (VoIP), which aim to enhance the embedding transparency. Differing from previous approaches, our strategies reduce the embedding distortion by improving the similarity between the cover and the covert message using digital logical transformations, instead of reducing the amount of the substitution bits. Therefore, by contrast, our strategies will improve the embedding transparency without sacrificing the embedding capacity. Of these three strategies, the first one adopts logical operations, the second one employs circular shifting operations, and the third one combines the operations of the first …
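
The idea of raising the similarity between cover and message before substitution can be illustrated with circular shifting. The sketch below is not the authors' encoding strategy: it assumes equal-length bit lists, uses Hamming distance as the similarity measure, and assumes the receiver learns the chosen shift through some side channel. The function names `hamming`, `best_rotation`, and `undo_rotation` are hypothetical.

```python
def hamming(a, b):
    """Count the positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def best_rotation(cover_bits, message_bits):
    """Pick the left rotation of the message closest to the cover bits.

    Returns (shift, rotated_message). Fewer differing bits means fewer
    cover bits change when the rotated message is substituted in.
    """
    if not message_bits or len(cover_bits) != len(message_bits):
        raise ValueError("cover and message must be non-empty and equal length")
    n = len(message_bits)

    def rotate(s):
        return message_bits[s:] + message_bits[:s]

    shift = min(range(n), key=lambda s: hamming(cover_bits, rotate(s)))
    return shift, rotate(shift)


def undo_rotation(rotated_bits, shift):
    """Recover the original message from its left-rotated form."""
    n = len(rotated_bits)
    k = (n - shift) % n
    return rotated_bits[k:] + rotated_bits[:k]


cover = [1, 0, 1, 1]
message = [1, 1, 1, 0]
shift, rotated = best_rotation(cover, message)
recovered = undo_rotation(rotated, shift)
```

Embedding capacity is unchanged in this sketch because every message bit is still embedded; only the number of cover bits that flip is reduced.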


Refactoring Pipe-Like Mashups For End-User Programmers, Kathryn T. Stolee, Sebastian Elbaum Sep 2009

CSE Technical Reports

Mashups are becoming increasingly popular as end users are able to easily access, manipulate, and compose data from many web sources. We have observed, however, that mashups tend to suffer from deficiencies that propagate as mashups are reused. To address these deficiencies, we would like to bring some of the benefits of software engineering techniques to the end users creating these programs. In this work, we focus on identifying code smells indicative of the deficiencies we observed in web mashups programmed in the popular Yahoo! Pipes environment. Through an empirical study, we explore the impact of those smells on end-user …


Temporal Data Classification Using Linear Classifiers, Peter Revesz, Thomas Triplet Sep 2009

CSE Conference and Workshop Papers

Data classification is usually based on measurements recorded at the same time. This paper considers temporal data classification where the input is a temporal database that describes measurements over a period of time in history while the predicted class is expected to occur in the future. We describe a new temporal classification method that improves the accuracy of standard classification methods. The benefits of the method are tested on weather forecasting using the meteorological database from the Texas Commission on Environmental Quality.


Vowel Recognition From Articulatory Position Time-Series Data, Jun Wang, Ashok Samal, Jordan R. Green, Tom D. Carrell Sep 2009

CSE Conference and Workshop Papers

A new approach to recognizing vowels from articulatory position time-series data was proposed and tested in this paper. This approach directly mapped articulatory position time-series data to vowels without extracting articulatory features such as mouth opening. The input time-series data were time-normalized and sampled to fixed-width vectors of articulatory positions. Three commonly used classifiers, Neural Network, Support Vector Machine, and Decision Tree, were used and their performances were compared on the vectors. A single-speaker dataset of eight major English vowels acquired using an Electromagnetic Articulograph (EMA) AG500 was used. Recognition rate using cross validation ranged from 76.07% to 91.32% for …
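
Time-normalizing a recording and sampling it to a fixed-width vector can be sketched as follows. The abstract does not say how the sampling was done, so linear interpolation is an assumption here, and the function name `resample_fixed_width` is hypothetical.

```python
def resample_fixed_width(series, width):
    """Linearly interpolate a 1-D series to exactly `width` samples.

    The first and last input points are preserved, so recordings of
    different durations map onto the same normalized time axis.
    """
    if width < 2 or len(series) < 2:
        raise ValueError("need at least two input points and width >= 2")
    last = len(series) - 1
    out = []
    for k in range(width):
        pos = k * last / (width - 1)  # position on the original index axis
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(series[lo] * (1 - frac) + series[hi] * frac)
    return out


short = resample_fixed_width([0.0, 10.0], 5)
long = resample_fixed_width([0.0, 1.0, 2.0, 3.0, 4.0], 3)
```

Vectors produced this way all have the same length, which is what a classifier with a fixed input dimension requires.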


Exploiting Set-Level Non-Uniformity Of Capacity Demand To Enhance CMP Cooperative Caching, Dongyuan Zhan, Hong Jiang, Sharad C. Seth Sep 2009

CSE Technical Reports

As the Memory Wall remains a bottleneck for Chip Multiprocessors (CMP), the effective management of CMP last level caches becomes of paramount importance in minimizing expensive off-chip memory accesses. For the CMPs with private last level caches, Cooperative Caching (CC) has been proposed to enable capacity sharing among private caches by spilling an evicted block from one cache to another. But this eviction-driven CC does not necessarily promote cache performance since it implicitly favors the applications full of block evictions regardless of their real capacity demand. The recent Dynamic Spill-Receive (DSR) paradigm improves cooperative caching by prioritizing applications with higher …


Design Of An All-Optical WDM Lightpath Concentrator, Shivashis Saha, Jitender S. Deogun Aug 2009

CSE Technical Reports

A design of a nonblocking, all-optical lightpath concentrator using WOC and WDM crossbar switches is presented. The proposed concentrator is highly scalable, cost-efficient, and can switch signals in both space and wavelength domains without requiring a separate wavelength conversion stage.


Selection Of Switching Sites In All-Optical Network Topology Design, Shivashis Saha, Eric D. Manley, Jitender S. Deogun Aug 2009

CSE Technical Reports

In this paper, we consider the problem of topology design for optical networks. We investigate the problem of selecting switching sites to minimize total cost of the optical network. The cost of an optical network can be expressed as a sum of three main factors: the site cost, the link cost, and the switch cost. To the best of our knowledge, this problem has not been studied in its general form as investigated in this paper. We present a mixed integer quadratic programming (MIQP) formulation of the problem to find the optimal value of the total network cost. We also …


P-Code: A New RAID-6 Code With Optimal Properties, Chao Jin, Hong Jiang, Dan Feng, Lei Tian Jun 2009

CSE Conference and Workshop Papers

RAID-6 significantly outperforms the other RAID levels in disk-failure tolerance due to its ability to tolerate arbitrary two concurrent disk failures in a disk array. The underlying parity array codes have a significant impact on RAID-6’s performance. In this paper, we propose a new XOR-based RAID-6 code, called the Partition Code (P-Code). P-Code is a very simple and flexible vertical code, making it easy to understand and implement. It works on a group of (prime – 1) or (prime) disks, and its coding scheme is based on an equal partition of a specified two-integer-tuple set. P-Code has the following properties: …


Deployed Software Analysis, Madeline M. Diep May 2009

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Profiling can offer a valuable characterization of software behavior. The richer the characterization is, the more effective the client analyses are in supporting quality assurance activities. For today's complex software, however, obtaining a rich characterization with the input provided by in-house test suites is becoming more difficult and expensive. Extending the profiling activity to deployed environments can mitigate this shortcoming by exposing more program behavior reflecting real software usage. To make profiling of deployed software plausible, however, we need to take into consideration that there are fundamental differences between the development and the deployed environments. Deployed environments allow for less …


An Efficient Algorithm For Real-Time Divisible Load Scheduling, Anwar Mamat, Ying Lu, Jitender S. Deogun, Steve Goddard May 2009

CSE Technical Reports

Providing QoS and performance guarantees to arbitrarily divisible loads has become a significant problem for many cluster-based research computing facilities. While progress is being made in scheduling arbitrarily divisible loads, existing approaches are not very efficient and cannot scale to large clusters. In this paper we propose an efficient algorithm for real-time divisible load scheduling, which has a time complexity linear to the number of tasks and the number of nodes in the cluster.


Network Coding For WDM All-Optical Multicast, Eric D. Manley, Jitender S. Deogun, Lisong Xu, Dennis R. Alexander Apr 2009

CSE Technical Reports

Network coding has become a useful means for achieving efficient multicast, and the optical community has started to examine its application to optical networks. However, a number of challenges, including limited processing capability and coarse bandwidth granularity, need to be overcome before network coding can be effectively used in optical networks. In this paper, we address some of these problems. We consider the problem of finding efficient routes to use with coding, and we study the effectiveness of using network coding for optical-layer dedicated protection of multicast traffic. We also propose architectures for all-optical circuits capable of performing the processing …


Using GIS To Locate Areas For Growing Quality Coffee In Honduras, Ellen Mickle Apr 2009

Department of Environmental Studies: Undergraduate Student Theses

Small-scale coffee producers worldwide remain vulnerable to price fluctuations after the 1999-2003 coffee crisis. One way to increase small-scale farmer economic resilience is to produce a more expensive product, such as quality coffee. There is growing demand in coffee-producing and coffee-importing countries for user-friendly tools that facilitate the marketing of quality coffee. The purpose of this study is to develop a prototypical quality coffee marketing tool in the form of a GIS model that identifies regions for producing quality coffee in a country not usually associated with quality coffee, Honduras. Maps of areas for growing quality coffee were produced …


SPA: On-Line Availability Upgrades For Parity-Based RAIDs Through Supplementary Parity Augmentations, Lei Tian, Hong Jiang, Dan Feng, Qiang Cao, Changsheng Xie, Qin Xin Feb 2009

CSE Technical Reports

In this paper, we propose a simple but powerful on-line availability upgrade mechanism, Supplementary Parity Augmentations (SPA), to address the availability issue for parity-based RAID systems. The basic idea of SPA is to store and update the supplementary parity units on one or a few newly augmented spare disks for on-line RAID systems in the operational mode, thus achieving the goals of improving the reconstruction performance while tolerating multiple disk failures and latent sector errors simultaneously. By applying the exclusive OR operations appropriately among supplementary parity, full parity and data units, SPA can reconstruct the data on the failed disks …
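
The exclusive OR step that parity-based reconstruction relies on can be shown with a single parity unit. This is a generic sketch, not SPA itself: it covers only one full-parity block over equally sized data blocks and does not model supplementary parity. The names `xor_blocks` and `reconstruct` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same size")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Rebuild the single missing data block from survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)                       # full parity over all data
rebuilt = reconstruct([data[0], data[2]], parity)  # data[1] is "lost"
```

Because XOR is its own inverse, XOR-ing the parity with every surviving block cancels their contributions and leaves the missing block.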


Carving And Replaying Differential Unit Test Cases From System Test Cases, Sebastian Elbaum, Hui Nee Chin, Matthew B. Dwyer, Matthew Jorde Feb 2009

School of Computing: Faculty Publications

Unit test cases are focused and efficient. System tests are effective at exercising complex usage patterns. Differential unit tests (DUTs) are a hybrid of unit and system tests that exploits their strengths. They are generated by carving the system components, while executing a system test case, that influence the behavior of the target unit and then reassembling those components so that the unit can be exercised as it was by the system test. In this paper, we show that DUTs retain some of the advantages of unit tests, can be automatically generated, and have the potential for revealing faults related …


A Multiagent Framework For Human Coalition Formation, Nobel Khandaker, Leen-Kiat Soh Jan 2009

CSE Technical Reports

Human users form coalitions to solve complex tasks and earn rewards. Examples of such coalition formation can be found in the military, education, and business domains. Multiagent coalition formation techniques cannot be readily used to form human coalitions due to the unique aspects of the human coalition formation problem, e.g., uncertainty in human user behavior and changes in human user behaviors due to human learning. Thus, a multiagent system designed to form human coalitions has to solve a learning problem, that is further made difficult by the limited learning opportunities and usability issues (i.e., actions or decisions being perceived as …


Multiagent Simulation Of Collaboration And Scaffolding Of A CSCL Environment, Nobel Khandaker, Leen-Kiat Soh Jan 2009

CSE Technical Reports

Multiagent techniques improve student learning in Computer-Supported Collaborative Learning (CSCL) environments through multiagent coalition formation and intelligent support to the instructors and students. Researchers designing the multiagent tools and techniques for CSCL environments are often faced with high cost, time, and effort required to investigate the effectiveness of their tools and techniques in large-scale and longitudinal studies in a real-world environment containing human users. Here, we propose SimCoL, a multiagent environment that simulates collaborative learning among students and agents providing support to the teacher and the students. Our goal with SimCoL is to provide a comprehensive testbed for multiagent researchers …


Agent Sensing In Limited Resource Environments, Adam Eck, Leen-Kiat Soh Jan 2009

CSE Technical Reports

One of the key challenges for multiagent systems (MAS) is optimizing performance in limited resource environments. Previous research in this area has focused on the problems of 1) resource allocation and arbitration, and 2) bounded rationality, which describe the relationship between resource constraints and both agent reasoning and actuation. However, less work exists addressing the effect of consuming resources during agent sensing, particularly two important tradeoffs. First, sensing can reduce resource availability, resulting in a tradeoff between overall system performance and an agent’s sensing behavior (the Performance Tradeoff). Second, consuming resources during sensing can alter the outcome of the measurement …


DEBAR: A Scalable High-Performance De-Duplication Storage System For Backup And Archiving, Tianming Yang, Hong Jiang, Dan Feng, Zhongying Niu Jan 2009

CSE Technical Reports

We present DEBAR, a scalable and high-performance de-duplication storage system for backup and archiving, to overcome the throughput and scalability limitations of the state-of-the-art data de-duplication schemes, including the Data Domain De-duplication File System (DDFS). DEBAR uses a two-phase de-duplication scheme (TPDS) that exploits memory cache and disk index properties to judiciously turn the notoriously random and small disk I/Os of fingerprint lookups and updates into large sequential disk I/Os, hence achieving a very high de-duplication throughput. The salient feature of this approach is that both the system backup and archiving capacity and the de-duplication performance can be dynamically and …
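
Fingerprint lookup, the operation whose disk I/O pattern the abstract targets, can be sketched with an in-memory index. This is not DEBAR's two-phase scheme: fixed-size chunking, SHA-1 fingerprints, and a Python dictionary as the index are all assumptions made for illustration, and the class name `DedupStore` is hypothetical.

```python
import hashlib


class DedupStore:
    """Toy de-duplicating store keyed by chunk fingerprint."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # fingerprint (hex string) -> chunk bytes

    def put(self, data):
        """Store data and return its recipe, an ordered list of fingerprints."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            fingerprint = hashlib.sha1(chunk).hexdigest()
            # Lookup and update: only previously unseen chunks are stored.
            self.chunks.setdefault(fingerprint, chunk)
            recipe.append(fingerprint)
        return recipe

    def get(self, recipe):
        """Reassemble the original data from a recipe."""
        return b"".join(self.chunks[fp] for fp in recipe)


store = DedupStore(chunk_size=4)
recipe = store.put(b"abcdabcdxyz")
restored = store.get(recipe)
```

In a disk-resident index each `setdefault` call would be a small random I/O, which is the cost that batching lookups into sequential passes is meant to avoid.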


Directed Test Suite Augmentation, Zhihong Xu, Gregg Rothermel Jan 2009

CSE Conference and Workshop Papers

As software evolves, engineers use regression testing to evaluate its fitness for release. Such testing typically begins with existing test cases, and many techniques have been proposed for reusing these cost-effectively. After reusing test cases, however, it is also important to consider code or behavior that has not been exercised by existing test cases and generate new test cases to validate these. This process is known as test suite augmentation. In this paper we present a directed test suite augmentation technique that utilizes results from reuse of existing test cases together with an incremental concolic testing algorithm to augment …


Technical Reports (2004 - 2009) Jan 2009

CSE Technical Reports

Authors of Technical Reports (2005-2009):
Choueiry, Berthe
Cohen, Myra
Deogun, Jitender
Dwyer, Matthew
Elbaum, Sebastian
Goddard, Steve
Henninger, Scott
Jiang, Hong
Lu, Ying
Ramamurthy, Byrav
Rothermel, Gregg
Scott, Stephen
Seth, Sharad
Soh, Leen-Kiat
Srisa-an, Witty
Swanson, David
Variyam, Vinodchandran
Wang, Jun
Xu, Lisong


Exploring Parameterized Relational Consistency, Shant K. Karakashian, Robert J. Woodward, Berthe Y. Choueiry Jan 2009

CSE Technical Reports

Consistency properties and algorithms for achieving them are at the heart of the success of Constraint Programming. For non-binary Constraint Satisfaction Problems (CSPs), the relational-consistency property R(i,j)C of [Dechter and van Beek 1997] may add new non-binary constraints to the constraint network, thus modifying its topology. The domain-filtering properties of [Bessiere et al. 2008] filter the domains of the variables and leave the constraints unchanged but are restricted to combinations of two constraints. We restate the property of m-wise consistency [Gyssens 1986; Jegou 1993] as relational (*,m)-consistency, R(*,m)C. R(*,m)C ensures that any tuple in a relation is consistent in every …


A Hybrid Test Architecture To Reduce Test Application Time In Full Scan Sequential Circuits, Priyankar Ghosh, Srobona Mitra, Indranil Sengupta, Bhargab B. Bhattacharya, Sharad C. Seth Jan 2009

CSE Conference and Workshop Papers

Full scan based design technique is widely used to alleviate the complexity of test generation for sequential circuits. However, this approach leads to substantial increase in test application time, because of serial loading of vectors. Although BIST based approaches offer faster testing, they usually suffer from low fault coverage. In this paper, we propose a hybrid test architecture, which achieves significant reduction in test application time. The test suite consists of: (i) some external deterministic test vectors to be scanned in, and (ii) internally generated responses of the CUT to be re-applied as tests iteratively, in functional (non-scan) mode. The …


Spatio-Temporal Event Model For Cyber-Physical Systems, Ying Tan, Mehmet C. Vuran, Steve Goddard Jan 2009

CSE Conference and Workshop Papers

The emerging Cyber-Physical Systems (CPSs) are envisioned to integrate computation, communication and control with the physical world. Therefore, CPS requires close interactions between the cyber and physical worlds both in time and space. These interactions are usually governed by events, which occur in the physical world and should autonomously be reflected in the cyber-world, and actions, which are taken by the CPS as a result of detection of events and certain decision mechanisms. Both event detection and action decision operations should be performed accurately and timely to guarantee temporal and spatial correctness. This calls for a flexible architecture and task …


Redistricting Using Heuristic-Based Polygonal Clustering, Deepti Joshi, Leen-Kiat Soh, Ashok K. Samal Jan 2009

CSE Conference and Workshop Papers

Redistricting is the process of dividing a geographic area into districts or zones. This process has been considered in the past as a problem that is computationally too complex for an automated system to be developed that can produce unbiased plans. In this paper we present a novel method for redistricting a geographic area using a heuristic-based approach for polygonal spatial clustering. While clustering geospatial polygons, several complex issues need to be addressed, such as removing order dependency, clustering all polygons assuming no outliers, and strategically utilizing domain knowledge to guide the clustering process. In order to address these …


Joint Computing And Network Resource Scheduling In A Lambda Grid Network, Vaidhehi Lakshmiraman, Byrav Ramamurthy Jan 2009

CSE Conference and Workshop Papers

Data-intensive Grid applications require huge data transfers between grid computing nodes. These computing nodes, where computing jobs are executed, are usually geographically separated. A grid network that employs optical wavelength division multiplexing (WDM) technology and optical switches to interconnect computing resources with dynamically provisioned multi-gigabit rate bandwidth lightpath is called a Lambda Grid network. A computing task may be executed on any one of several computing nodes which possesses the necessary resources. In order to reflect the reality in job scheduling, allocation of network resources for data transfer should be taken into consideration. However, few scheduling methods consider the communication …


Optimal Segment Size For Fixed-Sized Segment Protection In Wavelength-Routed Optical Networks, Raghunath Tewari, Byrav Ramamurthy Jan 2009

CSE Conference and Workshop Papers

Protecting a network against link failures is a major challenge faced by network operators. The protection scheme has to address two important objectives: fast recovery and minimizing the amount of backup resources needed. Every protection algorithm is a tradeoff between these two objectives.

In this paper, we study the problem of segment protection. In particular, we investigate what is the optimal segment size that obtains the best tradeoff between the time taken for recovery and minimizing the bandwidth used by the backup segments. We focus on the uniform fixed-length segment protection method, where each primary path is divided into …