Open Access. Powered by Scholars. Published by Universities.®

Computer Engineering Commons


Articles 1 - 18 of 18

Full-Text Articles in Computer Engineering

Gmaim: An Analytical Pipeline For Microrna Splicing Profiling Using Generative Model, Kan Liu Dec 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

MicroRNAs (miRNAs) are a class of short (~22 nt) single-strand RNA molecules predominantly found in eukaryotes. Being involved in many major biological processes, miRNAs can regulate gene expression by targeting mRNAs to facilitate their degradation or translational inhibition. The imprecise splicing of miRNAs introduces severe variability in the sequences of miRNA products and in their corresponding downstream gene expression regulation. For example, to study the biogenesis of miRNAs, biologists typically deplete a gene in the miRNA biogenesis pathway and study the resulting change in miRNA sequences, a depletion that can cause imprecise processing of miRNAs. Although high-throughput sequencing technologies such as ...


Scale-Out Algorithm For Apache Storm In Saas Environment, Ravi Kiran Puttaswamy Dec 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

The main appeal of the Cloud is its cost-effective and flexible access to computing power. Apache Storm is a data processing framework used to process streaming data. In our work we explore the possibility of offering Apache Storm as a software service. Further, we take advantage of the cgroups feature in Storm to divide the computing power of a worker machine into smaller units to be offered to users. We predict that the compute bounds placed on the cgroups could be used to approximate the state of the workflow. We discuss the limitations of the current schedulers in facilitating ...


Reducing The Tail Latency Of A Distributed Nosql Database, Jun Wu Dec 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

Request latency is an important performance metric of a distributed database, such as the popular Apache Cassandra, because of its direct impact on the user experience. Specifically, the latency of a read or write request is defined as the total time interval from the instant when a user makes the request to the instant when the user receives the response; it involves not only the actual read or write time at a specific database node, but also various types of latency introduced by the distributed mechanism of the database. Most of the current work focuses only on reducing ...
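Tail latency is conventionally summarized with high percentiles such as p99. As an illustrative sketch (not taken from the thesis), the nearest-rank percentile of a set of request latency samples can be computed as follows:

```python
def percentile(samples, p):
    """Return the p-th percentile (0-100) of a list of latency samples,
    using the nearest-rank method."""
    ordered = sorted(samples)
    # nearest-rank index: ceil(p/100 * n), here via ceiling division
    rank = max(1, -(-len(ordered) * p // 100))
    return ordered[int(rank) - 1]

# hypothetical workload: 98 fast requests and 2 slow stragglers
latencies_ms = [10] * 98 + [500, 900]
print(percentile(latencies_ms, 50))  # median: 10
print(percentile(latencies_ms, 99))  # p99: 500
```

The median looks excellent here while the p99 exposes the stragglers, which is exactly why tail latency is studied separately from average latency.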


Evoalloy: An Evolutionary Approach For Analyzing Alloy Specifications, Jianghao Wang Nov 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

Using mathematical notations and logical reasoning, formal methods precisely define a program’s specifications, from which we can instantiate valid instances of a system. With these techniques, we can perform a variety of analysis tasks to verify system dependability and rigorously prove the correctness of system properties. While there exist well-designed automated verification tools, including ones considered lightweight, they still lack strong adoption in practice. The essence of the problem is that, when applied to large real-world applications, they are neither scalable nor applicable due to the expense of a thorough verification process. In this thesis, I present a ...


Deploying, Improving And Evaluating Edge Bundling Methods For Visualizing Large Graphs, Jieting Wu Nov 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

A tremendous increase in the scale of graphs has been witnessed in a wide range of fields, which demands efficient and effective visualization techniques to assist users in better understanding large graphs. Conventional node-link diagrams are often used to visualize graphs, but excessive edge crossings can easily incur severe visual clutter in the node-link diagram of a large graph. Edge bundling can effectively remedy visual clutter and reveal high-level graph structures. Although significant efforts have been devoted to developing edge bundling, three challenging problems remain. First, edge bundling techniques are often computationally expensive and are not easy to deploy ...


Controller Evolution And Divergence: A Software Perspective, Balaji Balasubramaniam Nov 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

Successful controllers evolve as they are refined, extended, and adapted to new systems and contexts. This evolution occurs in the controller design and also in its software implementation. Model-based design and controller synthesis can help to synchronize this evolution of design and software, but such synchronization is rarely complete as software tends to also evolve in response to elements rarely present in a control model, leading to mismatches between the control design and the software.

In this thesis, we perform a first-of-its-kind study on the evolution of two popular open-source, safety-critical autopilot control software systems, ArduPilot and Paparazzi, to better understand ...


Supporting Diverse Customers And Prioritized Traffic In Next-Generation Passive Optical Networks, Naureen Hoque Nov 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

The already high demand for bandwidth has been growing rapidly. Access network traffic is usually bursty in nature, and the present traffic trend is mostly video-dominant. This motivates the need for higher transmission rates in the system. At the same time, the deployment costs and maintenance expenditures have to be reasonable. Therefore, Passive Optical Networks (PONs) are considered promising next-generation access technologies. As the existing PON standards are not suitable to support future PON services and applications, the FSAN (Full Service Access Network) group and the ITU-T (Telecommunication Standardization Sector of the International Telecommunication Union) have worked on developing ...


A Comprehensive Framework To Replicate Process-Level Concurrency Faults, Supat Rattanasuksun Nov 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

Concurrency faults are one of the most damaging types of faults that can affect the dependability of today’s computer systems. Currently, concurrency faults such as process-level races, order violations, and atomicity violations represent the largest class of faults reported to various Linux bug repositories. Clearly, existing approaches for testing such faults during the software development process are not adequate: these faults escape in-house testing efforts, are discovered only after deployment, and must then be debugged.

The main reason concurrency faults are hard to test is because the conditions that allow these to occur can be difficult to ...


Optical Wireless Data Center Networks, Abdelbaset S. Hamza Oct 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

Bandwidth- and computation-intensive Big Data applications in disciplines like social media, bio- and nano-informatics, Internet-of-Things (IoT), and real-time analytics are pushing existing access and core (backbone) networks, as well as Data Center Networks (DCNs), to their limits. Next-generation DCNs must support continuously increasing network traffic while satisfying minimum performance requirements of latency, reliability, flexibility, and scalability. Therefore, a larger number of cables (i.e., copper cables and optical fibers) may be required in conventional wired DCNs. In addition to limiting the possible topologies, a large number of cables may result in design and development problems related to wire ducting and maintenance ...


Higher-Level Consistencies: Where, When, And How Much, Robert J. Woodward Sep 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

Determining whether or not a Constraint Satisfaction Problem (CSP) has a solution is NP-complete. CSPs are solved by inference (i.e., enforcing consistency), conditioning (i.e., doing search), or, more commonly, by interleaving the two mechanisms. The most common consistency property enforced during search is Generalized Arc Consistency (GAC). In recent years, new algorithms that enforce consistency properties stronger than GAC have been proposed and shown to be necessary to solve difficult problem instances.
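As an illustration of consistency enforcement, the classic AC-3 algorithm enforces arc consistency on binary constraints; GAC generalizes the same support-checking idea to non-binary constraints. A minimal sketch (not from the thesis), where each directed arc carries a hypothetical `allowed` predicate:

```python
from collections import deque

def revise(domains, x, y, allowed):
    """Remove values of x that have no supporting value in y."""
    removed = False
    for vx in list(domains[x]):
        if not any(allowed(vx, vy) for vy in domains[y]):
            domains[x].discard(vx)
            removed = True
    return removed

def ac3(domains, constraints):
    """Enforce arc consistency; constraints maps (x, y) -> allowed predicate."""
    queue = deque(constraints)
    while queue:
        x, y = queue.popleft()
        if revise(domains, x, y, constraints[(x, y)]):
            if not domains[x]:
                return False  # domain wipe-out: no solution exists
            # re-examine arcs that point at the pruned variable x
            queue.extend((z, w) for (z, w) in constraints if w == x and z != y)
    return True

# toy CSP: x < y, both domains {1, 2, 3}
domains = {'x': {1, 2, 3}, 'y': {1, 2, 3}}
constraints = {('x', 'y'): lambda a, b: a < b,
               ('y', 'x'): lambda a, b: b < a}
ac3(domains, constraints)
print(domains)  # x loses 3, y loses 1
```

Pruning 3 from x and 1 from y shrinks the search space before any conditioning (search) takes place, which is the cost/benefit trade-off the thesis investigates for stronger consistencies.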

We frame the question of balancing the cost and the pruning effectiveness of consistency algorithms as the question of determining where, when, and how much of ...


Scaling Up An Infrastructure For Controlled Experimentation With Testing Techniques, Wayne D. Motycka Aug 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

Software testing research often involves reproducing previous experimental results. Previous work created a repository infrastructure for the containment and dissemination of testable research subjects, using a private centralized storage mechanism to host these test subject archives. While this is a good way to store the subjects, it can be inefficient when the size of a subject increases or the number of versions of a subject’s source code is large. Downloads of these large subjects from a centralized repository can be quite slow and on occasion may not succeed, requiring the user to repeat the download request. Coupled with the limited ...


Consensus Ensemble Approaches Improve De Novo Transcriptome Assemblies, Adam Voshall May 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

Accurate and comprehensive transcriptome assemblies lay the foundation for a range of analyses, such as differential gene expression analysis, metabolic pathway reconstruction, novel gene discovery, or metabolic flux analysis. With the arrival of next-generation sequencing technologies it has become possible to acquire the whole transcriptome data rapidly even from non-model organisms. However, the problem of accurately assembling the transcriptome for any given sample remains extremely challenging, especially in species with a high prevalence of recent gene or genome duplications, those with alternative splicing of transcripts, or those whose genomes are not well studied. This thesis provides a detailed overview of ...


Assessing The Quality And Stability Of Recommender Systems, David Shriver May 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

Recommender systems help users to find products they may like when lacking personal experience or facing an overwhelmingly large set of items. However, assessing the quality and stability of recommender systems can present challenges for developers. First, traditional accuracy metrics for validating the quality of recommendations, such as precision and recall, offer only a coarse, one-dimensional view of system performance. Second, assessing the stability of a recommender system requires generating new data and retraining the system, which is expensive. In this work, we present two new approaches for assessing the quality and stability of recommender systems to address these ...
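For context, the precision and recall mentioned above are straightforward to compute for a top-N recommendation list; a minimal sketch with hypothetical data (not from the thesis):

```python
def precision_recall(recommended, relevant):
    """Precision and recall of a top-N recommendation list against
    the set of items the user actually found relevant."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# hypothetical top-5 list; the user liked 4 items, 2 of which were recommended
p, r = precision_recall(recommended=[1, 2, 3, 4, 5], relevant=[2, 5, 9, 11])
print(p, r)  # 0.4 0.5
```

Note how each metric collapses the whole recommendation list into a single number, which is the "coarse, one-dimensional view" the abstract criticizes.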


Performance Evaluation Of V-Enodeb Using Virtualized Radio Resource Management, Sai Keerti Teja Boddepalli May 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

With the demand upsurge for high bandwidth services, continuous increase in the number of cellular subscriptions, adoption of Internet of Things (IoT), and marked growth in Machine-to-Machine (M2M) traffic, there is great stress exerted on cellular network infrastructure. The present wireline and wireless networking technologies are rigid in nature and heavily hardware-dependent, as a result of which the process of infrastructure upgrade to keep up with future demand is cumbersome and expensive.

Software-defined networks (SDN) hold the promise to decrease network rigidity by providing central control and flow abstraction, which in current network setups are hardware-based. The embrace of SDN ...


Application Of Cosine Similarity In Bioinformatics, Srikanth Maturu May 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

Finding similar sequences to an input query sequence (DNA or proteins) from a sequence data set is an important problem in bioinformatics. It provides researchers an intuition of what could be related or how the search space can be reduced for further tasks. An exact brute-force nearest-neighbor algorithm used for this task has complexity O(m * n) where n is the database size and m is the query size. Such an algorithm faces time-complexity issues as the database and query sizes increase. Furthermore, the use of alignment-based similarity measures such as minimum edit distance adds an additional complexity to the ...
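The brute-force O(m * n) baseline described above can be sketched as follows, here representing each sequence as a k-mer frequency vector before taking cosine similarity (the k-mer representation is an illustrative assumption, not necessarily the one used in the thesis):

```python
import math
from collections import Counter

def kmer_vector(seq, k=3):
    """Represent a sequence as a sparse k-mer frequency vector."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def cosine(u, v):
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(u[kmer] * v[kmer] for kmer in u)
    norm = math.sqrt(sum(c * c for c in u.values())) * \
           math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

def nearest(query, database, k=3):
    """Exact brute-force nearest neighbor: every query is compared
    against every database sequence."""
    qv = kmer_vector(query, k)
    return max(database, key=lambda s: cosine(qv, kmer_vector(s, k)))

db = ["ACGTACGT", "TTTTAAAA", "ACGTTGCA"]
print(nearest("ACGTACGA", db))  # ACGTACGT (shares the most 3-mers)
```

Every query sequence is scored against every database sequence, which is exactly the cost that grows unmanageably as the database and query sizes increase.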


Cost-Effective Techniques For Continuous Integration Testing, Jingjing Liang Apr 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

Continuous integration (CI) development environments allow software engineers to frequently integrate and test their code. While CI environments provide advantages, they also utilize non-trivial amounts of time and resources. To address this issue, researchers have adapted techniques for test case prioritization (TCP) and regression test selection (RTS) to CI environments.

To date, TCP techniques for CI environments have operated on test suites and have not achieved substantial improvements. In this thesis, we use a lightweight approach based on test suite failure and execution history that “continuously” prioritizes commits that are waiting for execution in response to the arrival of ...
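One simple form such history-based prioritization could take (an illustrative sketch, not the thesis's exact algorithm) is ordering waiting commits by the recent failure rate of their associated test runs:

```python
def prioritize(waiting, history):
    """Reorder waiting commits so that commits whose test runs have
    failed more often in recent history execute first.
    history[commit_id] is a list of recent outcomes (True = failure)."""
    def failure_rate(commit_id):
        outcomes = history.get(commit_id, [])
        # commits with no history get a middle priority of 0.5
        return sum(outcomes) / len(outcomes) if outcomes else 0.5
    return sorted(waiting, key=failure_rate, reverse=True)

history = {"c1": [False, False, True],   # 1/3 recent failures
           "c2": [True, True, False]}    # 2/3 recent failures
print(prioritize(["c1", "c2", "c3"], history))  # ['c2', 'c3', 'c1']
```

Because the ranking uses only bookkeeping the CI system already records, it adds negligible overhead per scheduling decision, which is what "lightweight" demands in this setting.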


Modelling And Visualizing Selected Molecular Communication Processes In Biological Organisms: A Multi-Layer Perspective, Aditya Immaneni Apr 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

Future pervasive communication and computing devices are envisioned to be tightly integrated with biological systems, i.e., the Internet of Bio-Nano Things. In particular, the study and exploitation of existing processes for biochemical information exchange and elaboration in biological systems are currently at the forefront of this research direction. Molecular Communication (MC), which studies biochemical information systems with theory and tools from computer communication engineering, has recently been proposed to model and characterize the aforementioned processes. Combined with the rapidly growing field of bioinformatics, which creates a rich profusion of biological data and tools to mine the underlying ...


Scheduling In Mapreduce Clusters, Chen He Feb 2018


Computer Science and Engineering: Theses, Dissertations, and Student Research

MapReduce is a framework proposed by Google for processing huge amounts of data in a distributed environment. The simplicity of the programming model and the fault-tolerance feature of the framework make it very popular in Big Data processing.
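The simplicity of the programming model is usually illustrated with the canonical word-count example; below is a single-machine sketch of the map, shuffle, and reduce phases (illustrative only, not Google's distributed implementation):

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in the input split."""
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group the intermediate values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

splits = ["big data big clusters", "data processing"]
pairs = chain.from_iterable(map_phase(s) for s in splits)
print(reduce_phase(shuffle(pairs)))
# {'big': 2, 'data': 2, 'clusters': 1, 'processing': 1}
```

In a real cluster each map and reduce task runs on a different node and the shuffle crosses the network, which is precisely where the framework's fault tolerance and the scheduling questions studied in this dissertation arise.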

As MapReduce clusters get popular, their scheduling becomes increasingly important. On one hand, many MapReduce applications have high performance requirements, for example, on response time and/or throughput. On the other hand, with the increasing size of MapReduce clusters, the energy-efficient scheduling of MapReduce clusters becomes inevitable. These scheduling challenges, however, have not been systematically studied.

The objective of this dissertation is ...