Engineering | Open Access Articles | Digital Commons Network™

From Inductive To Deductive Learning, Mikhail Mayers, Brian Henson

Undergraduate Research & Mentoring Program

Using Machine Vision as a way to give information to Prolog. Using Prolog to solve deductive problems and analogical problems without having to manually enter all facts and information.

Go to article

Resizable, Scalable, Concurrent Hash Tables Via Relativistic Programming, Josh Triplett, Paul E. Mckenney, Jonathan Walpole

Computer Science Faculty Publications and Presentations

Presentation focusing on software synchronization, thread locking, transactional memory, and relativistic programming. Hash table algorithms are presented with examples of relativistic list insertion and removal, and related data structures. Existing approaches are compared to new methodologies and future work with relativistic data structures.

Go to article

Efficient Support Of Consistent Cyclic Search With Read-Copy-Update And Parallel Updates, Jonathan Walpole, Paul E. Mckenney

Computer Science Faculty Publications and Presentations

A method, system and computer program product for supporting concurrent updates to a shared data element group while preserving group integrity on behalf of one or more readers that are concurrently referencing group data elements without using locks or atomic instructions. Two or more updaters may be invoked to generate new group data elements. Each new data element created by the same up dater is assigned a new generation number that is different than a global generation number associated with the data element group and which allows a reader of the data element group to determine whether the new data …

Go to article

Generalized Construction Of Scalable Concurrent Data Structures Via Relativistic Programming, Josh Triplett, Paul E. Mckenney, Philip W. Howard, Jonathan Walpole

Computer Science Faculty Publications and Presentations

We present relativistic programming, a concurrent programming model based on shared addressing, which supports efficient, scalable operation on either uniform shared-memory or distributed shared- memory systems. Relativistic programming provides a strong causal ordering property, allowing a series of read operations to appear as an atomic transaction that occurs entirely between two ordered write operations. This preserves the simple immutable-memory programming model available via mutual exclusion or transactional memory. Furthermore, relativistic programming provides joint-access parallelism, allowing readers to run concurrently with a writer on the same data. We demonstrate a generalized construction technique for concurrent data structures based on relativistic programming, …

Go to article

The Ordering Requirements Of Relativistic And Reader-Writer Locking Approaches To Shared Data Access, Philip William Howard, Josh Triplett, Jonathan Walpole, Paul E. Mckenney

Computer Science Faculty Publications and Presentations

The semantics of reader-writer locks allow read-side concurrency. Unfortunately, the locking primitives serialize access to the lock variable to an extent that little or no concurrency is realized in practice for small critical sections. Relativistic programming is a methodology that also allows read- side concurrency. Relativistic programming uses dfferent ordering constraints than reader-writer locking. The different ordering constraints allow relativistic readers to proceed without synchronization so relativistic readers scale even for very short critical sections. In this paper we explore the diferences between the ordering constraints for reader-writer locking and relativistic programs. We show how and why the dfferent ordering …

Go to article

Scalable Load And Store Processing In Latency Tolerant Processors, Amit Vasant Gandhi

Dissertations and Theses

Memory latency-tolerant architectures support thousands of in-flight instructions without proportionate scaling of cycle-critical processor resources, and thousands of useful instructions can complete in parallel with a long-latency miss to memory. These architectures, however, require large queues to track all loads and stores executed while a long-latency miss is pending. Hierarchical designs alleviate cycle-time impact of these structures but the Content-Addressable-Memory (CAM) and search functions required to enforce memory ordering and provide data-forwarding place high demand on area and power.

Many recent proposals address the complexity of load and store queues. However, none of these proposals addresses the fundamental source of …

Go to article

The Quest For Novel Computational Paradigms And Machines, Christof Teuscher

Electrical and Computer Engineering Faculty Publications and Presentations

The quest for novel and unconventional computing machines is mainly motivated by the man-machine dichotomy and by the belief that meeting tomorrow's complex real world challenges will require new paradigms and new engineering methods to organize, train, and program such machines and to interact with them.

Go to article

Infrastructure For Performance Tuning Mpi Applications, Kathryn Marie Mohror

Dissertations and Theses

Clusters of workstations are becoming increasingly popular as a low-budget alternative for supercomputing power. In these systems,message-passing is often used to allow the separate nodes to act as a single computing machine. Programmers of such systems face a daunting challenge in understanding the performance bottlenecks of their applications. This is largely due to the vast amount of performance data that is collected, and the time and expertise necessary to use traditional parallel performance tools to analyze that data.

The goal of this project is to increase the level of performance tool support for message-passing application programmers on clusters of workstations. …

Go to article

Pperfgrid: A Grid Services-Based Tool For The Exchange Of Heterogeneous Parallel Performance Data, John Jared Hoffman

Dissertations and Theses

This thesis details the approach taken in developing PPerfGrid. Section 2 discusses other research related to this project. Section 3 provides general background on the technologies utilized in PPerfGrid, focusing on the components that make up the Grid services architecture. Section 4 provides a description of the architecture of PPerfGrid. Section 5 details the implementation of PPerfGrid. Section 6 presents tests designed to measure the overhead and scalability of the PPerfGrid application. Section 7 suggests future work, and Section 8 concludes the thesis.

Go to article

A Performance Study Of Lam And Mpich On An Smp Cluster, Brian Patrick Kearns

Dissertations and Theses

Many universities and research laboratories have developed low cost clusters, built from Commodity-Off-The-Shelf (COTS) components and running mostly free software. Research has shown that these types of systems are well-equipped to handle many problems requiring parallel processing. The primary components of clusters are hardware, networking, and system software. An important system software consideration for clusters is the choice of the message passing library.

MPI (Message Passing Interface) has arguably become the most widely used message passing library on clusters and other parallel architectures, due in part to its existence as a standard. As a standard, MPI is open for anyone …

Go to article

Querying Geographically Dispersed, Heterogeneous Data Stores: The Pperfxchange Approach, Matthew Edward Colgrove

Dissertations and Theses

This thesis details PPerfXchange’s approach for querying geographically dispersed heterogeneous data stores. While elements of PPerfXchange’s method have been implemented for other application areas, PPerfXchange shows how these elements can be applied to parallel performance analysis. The accomplishments of this thesis are:

The design of an architecture for PPerfXchange, giving a uniform method to query heterogeneous data stores;
A proof of concept prototype implementation of PPerfXchange including a partial implementation of an XQuery processor and a relational database virtual XML document; and
Evaluation of PPerfXchange using example parallel performance analysis data.

Go to article

Towards Comparative Profiling Of Parallel Applications With Pperfdb, Christian Leland Hansen

Dissertations and Theses

Due to the complex nature of parallel programming, it is difficult to diagnose and solve performance related problems. Knowledge of program behavior is obtained experimentally, with repeated runs of a slightly modified version of the application or the same code in different environments. In these circumstances, comparative performance analysis can provide meaningful insights into the subtle effects of system and code changes on parallel program behavior by highlighting the difference in performance results across executions.

I have designed and implemented modules which extend the PPerfDB performance tool to allow access to existing performance data generated by several commonly used tracing …

Go to article

Static Compaction Of Test Sequences For Synchronous Sequential Circuits, Lijie Qi

Dissertations and Theses

Today, VLSI design has progressed to a stage where it needs to incorporate methods of testing circuits. The Automatic Test Pattern Generation (ATPG) is a very attractive method and feasible on almost any combinational and sequential circuit.

Currently available automatic test pattern generators (ATPGs) generate test sets that may be excessively long. Because a cost of testing depends on the test length. compaction techniques have been used to reduce that length. The motivation for studying test compaction is twofold. Firstly, by reducing the test sequence length. the memory requirements during the test application and the test application time are reduced. …

Go to article

Scheduling Of Parallel Jobs On Dynamic, Heterogenous Networks, Dan Clark, Jeremy Casas, Steve Otto, Robert Prouty, Jonathan Walpole

Computer Science Faculty Publications and Presentations

In using a shared network of workstations for parallel processing, it is not only important to consider heterogeneity and differences in processing power between the workstations but also the dynamics of the system as a whole. In such a computing environment where the use of resources vary as other applications consume and release resources, intelligent scheduling of the parallel jobs onto the available resources is essential to maximize resource utilization. Despite this realization, however, there are few systems available that provide an infrastructure for the easy development and testing of these intelligent schedulers. In this paper, an infrastructure is presented …

Go to article

Evolution Of Emergent Computation, James P. Crutchfield, Melanie Mitchell

Computer Science Faculty Publications and Presentations

A simple evolutionary process can discover sophisticated methods for emergent information processing in decentralized spatially-extended systems. The mechanisms underlying the resulting emergent computation are explicated by a novel technique for analyzing particle-based logic embedded in pattern-forming systems. Understanding how globally-coordinated computation can emerge in evolution is relevant both for the scientific understanding of natural information processing and for engineering new forms of parallel computing systems.

Go to article

Methodology For Accurate Speedup Prediction, Aruna Chittor

Dissertations and Theses

The effective use of computational resources requires a good understanding of parallel architectures and algorithms. The effect of the parallel architecture and also the parallel application on the performance of the parallel systems becomes more complex with increasing numbers of processors. We will address this issue in this thesis, and develop a methodology to predict the overall execution time of a parallel application as a function of the system and problem size by combining simple analysis with a few experimental results. We show that runtimes and speedup can be predicted more accurately by analyzing the functional forms of the sequential …

Go to article

Multiplexed Pipelining : A Cost Effective Loop Transformation Technique, Satish Pai

Dissertations and Theses

Parallel processing has gained increasing importance over the last few years. A key aim of parallel processing is to improve the execution times of scientific programs by mapping them to many processors. Loops form an important part of most computational programs and must be processed efficiently to get superior performance in terms of execution times. Important examples of such programs include graphics algorithms, matrix operations (which are used in signal processing and image processing applications), particle simulation, and other scientific applications. Pipelining uses overlapped parallelism to efficiently reduce execution time.

Go to article

Associative Processing Implemented With Content-Addressable Memories, Luis Sergio Kida

Dissertations and Theses

The associative processing model provides an alternative solution to the von Neumann bottleneck. The memory of an associative computer takes some of the responsibility for processing. Only intermediate results are exchanged between memory and processor. This greatly reduces the amount of communication between them. Content-addressable memories are one implementation of memory for this computational model. Associative computers implemented with CAMs have reported performance improvements of three orders of magnitude, which is equivalent to the performance of the same application running in a conventional computer with clock frequencies of the order of GHz. Among the benefits of content-addressable memories to the …

Go to article

Exploiting And/Or Parallelism In Prolog, Bankim Shah

Dissertations and Theses

Logic programming languages have generated increasing interest over the last few years. Logic programming languages like Prolog are being explored for different applications. Prolog is inherently parallel. Attempts are being made to utilize this inherent parallelism. There are two kinds of parallelism present in Prolog, OR parallelism and AND parallelism. OR parallelism is relatively easy to exploit while AND parallelism poses interesting issues. One of the main issues is dependencies between literals.

It is very important to use the AND parallelism available in the language structure as not exploiting it would result in a substantial loss of parallelism. Any system …

Go to article

A Practical Parallel Algorithm For The Minimization Of KröNecker Reed-Muller Expansions, Paul John Gilliam

Dissertations and Theses

A number of recent developments has increased the desirability of using exclusive OR (XOR) gates in the synthesis of switching functions. This has, in turn, led naturally to an increased interest in algorithms for the minimization of Exclusive-Or Sum of Products (ESOP) forms. Although this is an active area of research, it is not nearly as developed as the traditional Sum of Products forms. Computer programs to find minimum ESOPs are not readily available and those that do exist are impractical to use as investigative tools because they are too slow and/or require too much memory. A practical tool would …

Go to article

Threaded Octree Structures For Fast Neighbor Voxel Processing In A Parallel Ray Tracer, B.R. Naveen Chandra

Dissertations and Theses

In the field of Computer Graphics, Ray Tracing has so far been the the best algorithm for rendering of realistic three dimensional images created by mathematical models. Ray Tracing is also known for its very large computation times, where the computation depends on the picture resolution, the number of objects and the complexity of the scene.

Go to article

Parallel Architectures For Solving Combinatorial Problems Of Logic Design, Phuong Minh Ho

Dissertations and Theses

This thesis presents a new, practical approach to solve various NP-hard combinatorial problems of logic synthesis, logic programming, graph theory and related areas. A problem to be solved is polynomially time reduced to one of several generic combinatorial problems which can be expressed in the form of the Generalized Propositional Formula (GPF) : a Boolean product of clauses, where each clause is a sum of products of negated or non-negated literals.

Go to article

Implementing Ray Tracing Algorithm In Parallel Environment, Tjah Jadi

Dissertations and Theses

Ray tracing is a very popular rendering algorithm in the field of computer graphics because it can generate highly-realistic images from three-dimensional models. Unfortunately, the computational cost is very expensive. To speed up the rendering process we present both static and dynamic scheduling (balancing) strategies for a multiprocessor system. Hence, the load balancing among the processors is the most important problem in parallel processing. The implementation of the algorithm is based on a modified octree structure.

Go to article

A New General Purpose Systolic Array For Matrix Computations, Hai Van Dinh Le

Dissertations and Theses

In this thesis, we propose a new systolic architecture which is based on the Faddeev's algorithm. Because Faddeev's algorithm is inherently general purpose, our architecture is able to perform a wide class of matrix computations. And since the architecture is systolic based, it brings massive parallelism to all of its computations. As a result, many matrix operations including addition, multiplication, inversion, LU-decomposition, transpose, and solutions to linear systems of equations can now be performed extremely fast. In addition, our design introduces several concepts which are new to systolic architectures:

- It can be re-configured during run time to perform different …

Go to article

Quadtree-Based Processing Of Digital Images, Ramin Naderi

Dissertations and Theses

Image representation plays an important role in image processing applications, which usually. contain a huge amount of data. An image is a two-dimensional array of points, and each point contains information (eg: color). A 1024 by 1024 pixel image occupies 1 mega byte of space in the main memory. In actual circumstances 2 to 3 mega bytes of space are needed to facilitate the various image processing tasks. Large amounts of secondary memory are also required to hold various data sets.

In this thesis, two different operations on the quadtree are presented.

There are, in general, two types of data …

Go to article

Two New Parallel Processors For Real Time Classification Of 3-D Moving Objects And Quad Tree Generation, Farjam Majd

Dissertations and Theses

Two related image processing problems are addressed in this thesis. First, the problem of identification of 3-D objects in real time is explored. An algorithm to solve this problem and a hardware system for parallel implementation of this algorithm are proposed. The classification scheme is based on the "Invariant Numerical Shape Modeling" (INSM) algorithm originally developed for 2-D pattern recognition such as alphanumeric characters. This algorithm is then extended to 3-D and is used for general 3-D object identification. The hardware system is an SIMD parallel processor, designed in bit slice fashion for expandability. It consists of a library of …

Go to article

Full-Text Articles in Engineering

From Inductive To Deductive Learning, Mikhail Mayers, Brian Henson

Undergraduate Research & Mentoring Program

Resizable, Scalable, Concurrent Hash Tables Via Relativistic Programming, Josh Triplett, Paul E. Mckenney, Jonathan Walpole

Computer Science Faculty Publications and Presentations

Efficient Support Of Consistent Cyclic Search With Read-Copy-Update And Parallel Updates, Jonathan Walpole, Paul E. Mckenney

Computer Science Faculty Publications and Presentations

Generalized Construction Of Scalable Concurrent Data Structures Via Relativistic Programming, Josh Triplett, Paul E. Mckenney, Philip W. Howard, Jonathan Walpole

Computer Science Faculty Publications and Presentations

The Ordering Requirements Of Relativistic And Reader-Writer Locking Approaches To Shared Data Access, Philip William Howard, Josh Triplett, Jonathan Walpole, Paul E. Mckenney

Computer Science Faculty Publications and Presentations

Scalable Load And Store Processing In Latency Tolerant Processors, Amit Vasant Gandhi

Dissertations and Theses

The Quest For Novel Computational Paradigms And Machines, Christof Teuscher

Electrical and Computer Engineering Faculty Publications and Presentations

Infrastructure For Performance Tuning Mpi Applications, Kathryn Marie Mohror

Dissertations and Theses

Pperfgrid: A Grid Services-Based Tool For The Exchange Of Heterogeneous Parallel Performance Data, John Jared Hoffman

Dissertations and Theses

A Performance Study Of Lam And Mpich On An Smp Cluster, Brian Patrick Kearns

Dissertations and Theses

Querying Geographically Dispersed, Heterogeneous Data Stores: The Pperfxchange Approach, Matthew Edward Colgrove

Dissertations and Theses

Towards Comparative Profiling Of Parallel Applications With Pperfdb, Christian Leland Hansen

Dissertations and Theses

Static Compaction Of Test Sequences For Synchronous Sequential Circuits, Lijie Qi

Dissertations and Theses

Scheduling Of Parallel Jobs On Dynamic, Heterogenous Networks, Dan Clark, Jeremy Casas, Steve Otto, Robert Prouty, Jonathan Walpole

Computer Science Faculty Publications and Presentations

Evolution Of Emergent Computation, James P. Crutchfield, Melanie Mitchell

Computer Science Faculty Publications and Presentations

Methodology For Accurate Speedup Prediction, Aruna Chittor

Dissertations and Theses

Multiplexed Pipelining : A Cost Effective Loop Transformation Technique, Satish Pai

Dissertations and Theses

Associative Processing Implemented With Content-Addressable Memories, Luis Sergio Kida

Dissertations and Theses

Exploiting And/Or Parallelism In Prolog, Bankim Shah

Dissertations and Theses

A Practical Parallel Algorithm For The Minimization Of KröNecker Reed-Muller Expansions, Paul John Gilliam

Dissertations and Theses

Threaded Octree Structures For Fast Neighbor Voxel Processing In A Parallel Ray Tracer, B.R. Naveen Chandra

Dissertations and Theses

Parallel Architectures For Solving Combinatorial Problems Of Logic Design, Phuong Minh Ho

Dissertations and Theses

Implementing Ray Tracing Algorithm In Parallel Environment, Tjah Jadi

Dissertations and Theses

A New General Purpose Systolic Array For Matrix Computations, Hai Van Dinh Le

Dissertations and Theses

Quadtree-Based Processing Of Digital Images, Ramin Naderi

Dissertations and Theses

Two New Parallel Processors For Real Time Classification Of 3-D Moving Objects And Quad Tree Generation, Farjam Majd

Dissertations and Theses