Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons


Articles 1 - 30 of 38

Full-Text Articles in Physical Sciences and Mathematics

Multilevel Combinatorial Optimization Across Quantum Architectures, Hayato Ushijima-Mwesigwa, Ruslan Shaydulin, Christian F.A. Negre, Susan M. Mniszewski, Yuri Alexeev, Ilya Safro Oct 2019


Publications

Emerging quantum processors provide an opportunity to explore new approaches for solving traditional problems in the post-Moore's-law supercomputing era. However, the limited number of qubits makes it infeasible to tackle massive real-world datasets directly in the near future, leading to new challenges in utilizing these quantum processors for practical purposes. Hybrid quantum-classical algorithms that leverage both quantum and classical devices are considered one of the main strategies for applying quantum computing to large-scale problems. In this paper, we advocate the use of multilevel frameworks for combinatorial optimization as a promising general paradigm for designing hybrid …


Hypergraph Partitioning With Embeddings, Justin Sybrandt, Ruslan Shaydulin, Ilya Safro Sep 2019


Publications

The problem of placing circuits on a chip or distributing sparse matrix operations can be modeled as the hypergraph partitioning problem. A hypergraph is a generalization of the traditional graph wherein each "hyperedge" may connect any number of nodes. Hypergraph partitioning, therefore, is the NP-Hard problem of dividing nodes into k similarly sized disjoint sets while …
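To make the objective concrete, a hypergraph can be stored as a list of hyperedges (node sets), and the quality of a k-way partition measured by the number of hyperedges that span more than one block. A minimal sketch, not taken from the paper (the names and toy data are hypothetical):

```python
def cut_size(hyperedges, part):
    """Count hyperedges whose nodes land in more than one block of the partition."""
    cut = 0
    for edge in hyperedges:
        blocks = {part[v] for v in edge}  # which blocks this hyperedge touches
        if len(blocks) > 1:
            cut += 1
    return cut

# A toy hypergraph: unlike a graph edge, a hyperedge may join any number of nodes.
hyperedges = [{0, 1, 2}, {2, 3}, {3, 4, 5}, {0, 5}]
# A balanced 2-way partition assigning each node to block 0 or block 1.
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
print(cut_size(hyperedges, part))  # hyperedges {2,3} and {0,5} are cut -> 2
```

Partitioners search for an assignment like `part` that minimizes this cut while keeping the blocks similarly sized.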


Relaxation-Based Coarsening For Multilevel Hypergraph Partitioning, Ruslan Shaydulin, Jie Chen, Ilya Safro Feb 2019


Publications

Multilevel partitioning methods that are inspired by principles of multiscaling are the most powerful practical hypergraph partitioning solvers. Hypergraph partitioning has many applications in disciplines ranging from scientific computing to data science. In this paper we introduce the concept of algebraic distance on hypergraphs and demonstrate its use as an algorithmic component in the coarsening stage of multilevel hypergraph partitioning solvers. The algebraic distance is a vertex distance measure that extends hyperedge weights for capturing the local connectivity of vertices which is critical for hypergraph coarsening schemes. The practical effectiveness of the proposed measure and corresponding coarsening scheme is demonstrated …
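The idea of algebraic distance on ordinary graphs (which the paper extends to hypergraphs) can be illustrated with a few Jacobi-style relaxation sweeps on a random test vector: strongly connected vertices pull each other's values together, so a small difference signals strong local connectivity. This is a simplified sketch under that interpretation, not the authors' implementation (no normalization, single test vector):

```python
import random

def algebraic_distance(adj, sweeps=10, alpha=0.5, seed=0):
    """Relax a random vector over the graph; |x[u] - x[v]| after a few
    sweeps approximates how weakly u and v are locally connected."""
    rng = random.Random(seed)
    x = {v: rng.random() for v in adj}
    for _ in range(sweeps):
        # Jacobi sweep: blend each value with the average of its neighbors.
        x = {v: (1 - alpha) * x[v]
                + alpha * sum(x[u] for u in adj[v]) / len(adj[v])
             for v in adj}
    return lambda u, v: abs(x[u] - x[v])

# Two triangles joined by a single bridge edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
dist = algebraic_distance(adj)
# Within-triangle distances shrink much faster than the bridge distance,
# which is what a coarsening scheme exploits when choosing what to merge.
print(dist(0, 1) < dist(2, 3))
```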


Automated Cluster Provisioning And Workflow Management For Parallel Scientific Applications In The Cloud, Brandon Posey, Christopher Gropp, Alexander Herzog, Amy Apon Nov 2017


Publications

Many commercial cloud providers and tools are available that researchers could utilize to advance computational science research. However, adoption by the research community has been slow. In this paper we describe the automated Provisioning And Workflow (PAW) management tool for parallel scientific applications in the cloud. PAW is a comprehensive resource provisioning and workflow tool that automates the steps of dynamically provisioning a large scale cluster environment in the cloud, executing a set of jobs or a custom workflow and, after the jobs have completed, de-provisioning the cluster environment in a single operation. A key characteristic of PAW is that …


Maneuverable Applications: Advancing Distributed Computing, William Clay Moody, Amy W. Apon Oct 2017


Publications

Extending the military principle of maneuver into the war-fighting domain of cyberspace, academic and military researchers have produced many theoretical and strategic works, though few have focused on researching the applications and systems that apply this principle. We present a survey of our research in developing new architectures for the enhancement of parallel and distributed applications. Specifically, we discuss our work in applying the military concept of maneuver in the cyberspace domain by creating a set of applications and systems called “maneuverable applications.” Our research investigates resource provisioning, application optimization, and cybersecurity enhancement through the modification, relocation, addition or removal …


Laplacian Distribution And Domination, Domingos M. Cardoso, David P. Jacobs, Vilmar Trevisan Sep 2017


Publications

No abstract provided.


Moliere: Automatic Biomedical Hypothesis Generation System, Justin Sybrandt, Michael Shtutman, Ilya Safro May 2017


Publications

Hypothesis generation is becoming a crucial time-saving technique which allows biomedical researchers to quickly discover implicit connections between important concepts. Typically, these systems operate on domain-specific fractions of public medical data. MOLIERE, in contrast, utilizes information from over 24.5 million documents. At the heart of our approach lies a multi-modal and multi-relational network of biomedical objects extracted from several heterogeneous datasets from the National Center for Biotechnology Information (NCBI). These objects include but are not limited to scientific papers, keywords, genes, proteins, diseases, and diagnoses. We model hypotheses using Latent Dirichlet Allocation applied on abstracts found near shortest paths discovered …


Random Access In Nondelimited Variable-Length Record Collections For Parallel Reading With Hadoop, Jason Anderson, Christopher Gropp, Linh Ngo, Amy W. Apon May 2017


Publications

The industry standard Packet CAPture (PCAP) format for storing network packet traces is normally only readable in serial due to its lack of delimiters, indexing, or blocking. This presents a challenge for parallel analysis of large networks, where packet traces can be many gigabytes in size. In this work we present RAPCAP, a novel method for random access into variable-length record collections like PCAP by identifying a record boundary within a small number of bytes of the access point. Unlike related heuristic methods that can limit scalability with a nonzero probability of error, the new method offers a correctness guarantee …


Vizspace: Interaction In The Positive Parallax Screen Plane, Oyewole Oyekoya, Emily Sassard, Tiana Johnson Apr 2017


Publications

The VizSpace is a physically situated interactive system that combines touch and hand interactions behind the screen to create the effect that users are reaching inside and interacting in a 3D virtual workspace. It extends the conventional touch table interface with hand tracking and 3D visualization to enable interaction in the positive parallax plane, where the binocular focus falls behind the screen so as not to occlude projected images. This paper covers the system design, human factors and ergonomics considerations for an interactive and immersive gesture-based visualization system. Results are presented from a preliminary user study that validates the usability …


Local Standards For Sample Size At Chi, Kelly Caine May 2016


Publications

We describe the primary ways researchers can determine the size of a sample of research participants, present the benefits and drawbacks of each of those methods, and focus on improving one method that could be useful to the CHI community: local standards. To determine local standards for sample size within the CHI community, we conducted an analysis of all manuscripts published at CHI2014. We find that sample size for manuscripts published at CHI ranges from 1 -- 916,000 and the most common sample size is 12. We also find that sample size differs based on factors such as study setting …


Performance Considerations Of Network Functions Virtualization Using Containers, Jason Anderson, Udit Agarwal, Hongda Li, Hongxin Hu, Craig Lowery, Amy W. Apon Feb 2016


Publications

The network performance of virtual machines plays a critical role in Network Functions Virtualization (NFV), and several technologies have been developed to address hardware-level virtualization shortcomings. Recent advances in operating system level virtualization and deployment platforms such as Docker have made containers an ideal candidate for high performance application encapsulation and deployment. However, Docker and other solutions typically use lower-performing networking mechanisms. In this paper, we explore the feasibility of using technologies designed to accelerate virtual machine networking with containers, in addition to quantifying the network performance of container-based VNFs compared to the state-of-the-art virtual machine solutions. Our results show …


Synthetic Data Generation For The Internet Of Things, Jason Anderson, K. E. Kennedy, Linh B. Ngo, Andre Luckow, Amy Apon Oct 2014


Publications

The concept of Internet of Things (IoT) is rapidly moving from a vision to being pervasive in our everyday lives. This can be observed in the integration of connected sensors from a multitude of devices such as mobile phones, healthcare equipment, and vehicles. There is a need for the development of infrastructure support and analytical tools to handle IoT data, which are naturally big and complex. But, research on IoT data can be constrained by concerns about the release of privately owned data. In this paper, we present the design and implementation results of a synthetic IoT data generation framework. …


Teaching Hdfs/Mapreduce Systems Concepts To Undergraduates, Linh B. Ngo, Amy W. Apon, Edward B. Duffy May 2014


Publications

This paper presents the development of a Hadoop MapReduce module that has been taught in a course in distributed computing to upper undergraduate computer science students at Clemson University. The paper describes our teaching experiences and the feedback from the students over several semesters that have helped to shape the course. We provide suggested best practices for lecture materials, the computing platform, and the teaching methods. In addition, the computing platform and teaching methods can be extended to accommodate emerging technologies and modules for related courses.
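The canonical first exercise in a MapReduce module is word count. The map, shuffle/sort, and reduce phases can be sketched in plain Python as a pedagogical stand-in for Hadoop's Java API (the function names here are illustrative, not from the course materials):

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield (word.lower(), 1)

def reducer(word, counts):
    """Reduce phase: sum all the counts emitted for a single word."""
    return (word, sum(counts))

def word_count(lines):
    # Shuffle/sort step: gather all intermediate pairs and group them by key,
    # as Hadoop does between the map and reduce phases.
    pairs = sorted(p for line in lines for p in mapper(line))
    return dict(reducer(w, (c for _, c in grp))
                for w, grp in groupby(pairs, key=itemgetter(0)))

print(word_count(["the quick brown fox", "the lazy dog"]))
```

In Hadoop the same three phases run distributed over HDFS blocks, but the data flow students must reason about is identical.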


Assessing The Effect Of High Performance Computing Capabilities On Academic Research Output, Amy Apon, Linh B. Ngo, Michael E. Payne, Paul W. Wilson Mar 2014


Publications

This paper uses nonparametric methods and some new results on hypothesis testing with nonparametric efficiency estimators and applies them to analyze the effect of locally-available high performance computing (HPC) resources on universities' efficiency in producing research and other outputs. We find that locally-available HPC resources enhance the technical efficiency of research output in Chemistry, Civil Engineering, Physics, and History, but not in Computer Science, Economics, or English; we find mixed results for Biology. Our research results provide a critical first step in a quantitative economic model for investments in HPC.


Towards A System For Controlling Client-Server Traffic In Virtual Worlds Using Sdn, Jason Anderson, Jim Martin Dec 2013


Publications

Scaling virtual worlds in the age of cloud computing is complicated by the problem of efficiently directing client-server traffic in the face of agile and dynamic compute resources. In this proposed model, Software Defined Networking and compact encoding of avatar data in packet headers are combined to make a fast, scalable, high capacity proxy server that can hide the server infrastructure while fitting well with the IaaS paradigm of modern cloud providers.


An Infrastructure To Support Data Integration And Curation For Higher Educational Research, Linh B. Ngo, Amy W. Apon, Pengfei Xuan, Kimberley Ferguson, Christin Marshall, John Mccann, Yueli Zheng Oct 2012


Publications

The recent challenges for higher education call for research that can offer a comprehensive understanding of the performance and efficiency of higher education institutions in their three primary missions: research, education, and service. In order for this to happen, it is necessary for researchers to have access to a multitude of data sources. However, due to the nature of their academic training, many higher education practitioners do not have access to expertise in working with different data sources. In this work, we describe a design and implementation for an infrastructure that will bring together the tools and the data to provide …


Accelerating Image Feature Comparisons Using Cuda On Commodity Hardware, Amy Apon, Seth Warn, Wesley Emeneker, John Gauch, Jackson Cothren Jul 2010


Publications

Given multiple images of the same scene, image registration is the process of determining the correct transformation to bring the images into a common coordinate system—i.e., how the images fit together. Feature based registration applies a transformation function to the input images before performing the correlation step. The result of that transformation, also called feature extraction, is a list of significant points in the images, and the registration process will attempt to correlate these points, rather than directly comparing the input images.


Community Funding Models For Computational Resources, Amy Apon, Jeff Pummill, Dana Brunson May 2010


Publications

As scientific research has extended far beyond the practicality and abilities of laboratory experiments, computational simulations have become the mainstay of enabling and furthering the research in a way never previously thought possible. It is becoming commonplace to model and simulate both the very large, such as black hole collisions in astrophysics, and the very small, such as subatomic particle behavior and interaction in high energy physics. In addition to the previous examples detailing extremes, practically every area of research currently utilizes and benefits from computational resources to simulate their work; financial modeling, weather forecasting, geological phenomena, geo-spatial data analysis, …


High Performance Computing Instrumentation And Research Productivity In U.S. Universities, Amy W. Apon, Linh B. Ngo, Stanley Ahalt, Vijay Dantuluri, Constantin Gurdgiev, Moez Limayem, Michael Stealey Jan 2010


Publications

This paper studies the relationship between investments in High-Performance Computing (HPC) instrumentation and research competitiveness. Measures of institutional HPC investment are computed from data that is readily available from the Top 500 list, a list that has been published twice a year since 1993 that lists the fastest 500 computers in the world at that time. Institutions that are studied include US doctoral-granting institutions that fall into the very high or high research rankings according to the Carnegie Foundation classifications and additional institutions that have had entries in the Top 500 list. Research competitiveness is derived from federal funding data, …


Accelerating Sift On Parallel Architectures, Amy Apon, Seth Warn, Wesley Emeneker, Jackson Cothren Sep 2009


Publications

SIFT is a widely-used algorithm that extracts features from images; using it to extract information from hundreds of terabytes of aerial and satellite photographs requires parallelization in order to be feasible. We explore accelerating an existing serial SIFT implementation with OpenMP parallelization and GPU execution.


A Performance And Productivity Study Using Mpi, Titanium, And Fortress, Amy Apon, Chris Bryan, Wesley Emeneker Dec 2008


Publications

The popularity of cluster computing has increased focus on usability, especially in the area of programmability. Languages and libraries that require explicit message passing have been the standard. New languages, designed for cluster computing, are coming to the forefront as a way to simplify parallel programming. Titanium and Fortress are examples of this new class of programming paradigms. This paper presents results from a productivity study of these two newcomers with MPI, the de facto standard for parallel programming.


Developing A Coherent Cyberinfrastructure From Local Campus To National Facilities: Challenges And Strategies, Amy Apon, Patrick Dreher, Vijay Agarwala, Stan Ahalt, Guy Almes, Sue Fratkin, Thomas Hauser, Jan Odegard, Jim Pepin, Craig Stewart Jul 2008


Publications

A fundamental goal of cyberinfrastructure (CI) is the integration of computing hardware, software, and network technology, along with data, information management, and human resources to advance scholarship and research. Such integration creates opportunities for researchers, educators, and learners to share ideas, expertise, tools, and facilities in new and powerful ways that cannot be realized if each of these components is applied independently. Bridging the gap between the reality of CI today and its potential in the immediate future is critical to building a balanced CI ecosystem that can support future scholarship and research. This report summarizes the observations and recommendations …


Capacity Planning Of A Commodity Cluster In An Academic Environment: A Case Study, Linh B. Ngo, Amy W. Apon, Baochuan Lu, Hung Bui, Nathan Hamm, Larry Dowdy, Doug Hoffman, Denny Brewer Apr 2008


Publications

In this paper, the design of a simulation model for evaluating two alternative supercomputer configurations in an academic environment is presented. The workload is analyzed and modeled, and its effect on the relative performance of both systems is studied. The Integrated Capacity Planning Environment (ICPE) toolkit, developed for commodity cluster capacity planning, is successfully applied to the target environment. The ICPE is a tool for workload modeling, simulation modeling, and what-if analysis. A new characterization strategy is applied to the workload to more accurately model commodity cluster workloads. Through "what-if" analysis, the sensitivity of the baseline system performance to …


Shibboleth As A Tool For Authorized Access Control To The Subversion Repository System, Linh B. Ngo, Amy W. Apon Sep 2007


Publications

Shibboleth is an architecture and protocol for allowing users to authenticate and be authorized to use a remote resource by logging into the identity management system that is maintained at their home institution. With Shibboleth, a federation of institutions can share resources among users and yet allow the administration of both the user access control to resources and the user identity and attribute information to be performed at the hosting or home institution. Subversion is a version control repository system that allows the creation of fine-grained permissions to files and directories. In this project an infrastructure, Shibbolized Subversion, has been …


Shibboleth As A Tool For Authorized Access Control To The Subversion Repository System, Amy Apon, Linh B. Ngo Sep 2007


Publications

Shibboleth is an architecture and protocol for allowing users to authenticate and be authorized to use a remote resource by logging into the identity management system that is maintained at their home institution. With Shibboleth, a federation of institutions can share resources among users and yet allow the administration of both the user access control to resources and the user identity and attribute information to be performed at the hosting or home institution. Subversion is a version control repository system that allows the creation of fine-grained permissions to files and directories. In this project an infrastructure, Shibbolized Subversion, has been …


A Case Study On Grid Performance Modeling, Amy Apon, Baochuan Lu, Larry Dowdy, Frank Robinson, Doug Hoffman, Denny Brewer Nov 2006


Publications

The purpose of this case study is to develop a performance model for an enterprise grid for performance management and capacity planning. The target environment includes grid applications such as health-care and financial services where the data is located primarily within the resources of a worldwide corporation. The approach is to build a discrete event simulation model for a representative work-flow grid. Five work-flow classes, found using a customized k-means clustering algorithm, characterize the workload of the grid. Analyzing the gap between the simulation and measurement data validates the model. The case study demonstrates that the simulation model can be …


The Great Plains Network (Gpn) Middleware Test Bed, Amy Apon, Gregory Monaco, Gordon Springer Sep 2006


Publications

GPN (Great Plains Network) is a consortium of public universities in seven mid-western states. GPN goals include regional strategic planning and the development of a collaboration environment, middleware services and a regional grid for sharing computational, storage and data resources. A major challenge is to arrive at a common authentication and authorization service, based on the set of heterogeneous identity providers at each institution. GPN has built a prototype middleware test bed that includes Shibboleth and other NMI-EDIT middleware components. The test bed includes several prototype end-user applications, and is being used to further our research into fine-grained access control …


Architectural Tradeoffs For Unifying Campus Grid Resources, Amy Apon, Bart Taylor May 2006


Publications

Most universities have a powerful collection of computing resources on campus for use in areas from high performance computing to general access student labs. However, these resources are rarely used to their full potential. Grid computing offers a way to unify these resources and to better utilize the capability they provide. The complexity of some grid tools makes learning to use them a daunting task for users not familiar with using the command line. Combining these tools together into a single web portal interface provides campus faculty and students with an easy way to access the campus resources. This paper …


Initial Starting Point Analysis For K-Means Clustering: A Case Study, Amy Apon, Frank Robinson, Denny Brewer, Larry Dowdy, Doug Hoffman, Baochuan Lu Mar 2006


Publications

Workload characterization is an important part of systems performance modeling. Clustering is a method used to find classes of jobs within workloads. K-Means is one of the most popular clustering algorithms. Initial starting point values are needed as input parameters when performing k-means clustering. This paper shows that the results of running the k-means algorithm on the same workload will vary depending on the values chosen as initial starting points. Fourteen methods of composing initial starting point values are compared in a case study. The results indicate that a synthetic method, scrambled midpoints, is an effective starting point method …
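The paper's central observation, that k-means output depends on the chosen starting points, is easy to reproduce with a tiny one-dimensional implementation. This is a generic sketch with made-up data, not the case study's workload or toolkit:

```python
def kmeans(points, centers, iters=20):
    """Plain 1-D k-means: assign each point to its nearest center, then
    move each center to the mean of the points assigned to it."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[i].append(p)
        # Empty clusters keep their previous center.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

# Workload-like data with three natural groups of job sizes.
points = [1, 2, 3, 20, 21, 22, 40, 41, 42]
good = kmeans(points, [1, 20, 40])  # one starting point near each group
bad = kmeans(points, [1, 2, 3])     # all starting points in one group
print(good)  # recovers the three group means
print(bad)   # converges to a different, worse local optimum
```

The same data with different starting points yields different final centers, which is why the choice among methods such as scrambled midpoints matters.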


Ampnet - A Highly Available Cluster Interconnection Network, Amy Apon, Larry Bilbur Apr 2003


Publications

No abstract provided.