Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons


Articles 1 - 22 of 22

Full-Text Articles in Physical Sciences and Mathematics

Gmaim: An Analytical Pipeline For MicroRNA Splicing Profiling Using Generative Model, Kan Liu Dec 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

MicroRNAs (miRNAs) are a class of short (~22 nt) single-stranded RNA molecules predominantly found in eukaryotes. Being involved in many major biological processes, miRNAs can regulate gene expression by targeting mRNAs to facilitate their degradation or translational inhibition. The imprecise splicing of miRNAs introduces severe variability in the sequences of miRNA products and in their corresponding downstream gene expression regulation. For example, to study the biogenesis of miRNAs, biologists usually deplete a gene in the miRNA biogenesis pathway and study the change in miRNA sequences, which can cause imprecision of miRNAs. Although high-throughput sequencing technologies such as …


Scale-Out Algorithm For Apache Storm In SaaS Environment, Ravi Kiran Puttaswamy Dec 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

The main appeal of the Cloud is its cost-effective and flexible access to computing power. Apache Storm is a data processing framework used to process streaming data. In our work we explore the possibility of offering Apache Storm as a software service. Further, we take advantage of the cgroups feature in Storm to divide the computing power of a worker machine into smaller units to be offered to users. We predict that the compute bounds placed on the cgroups could be used to approximate the state of the workflow. We discuss the limitations of the current schedulers in facilitating …


Reducing The Tail Latency Of A Distributed NoSQL Database, Jun Wu Dec 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Request latency is an important performance metric of a distributed database, such as the popular Apache Cassandra, because of its direct impact on the user experience. Specifically, the latency of a read or write request is defined as the total time interval from the instant when a user makes the request to the instant when the user receives the response, and it involves not only the actual read or write time at a specific database node, but also various types of latency introduced by the distributed mechanism of the database. Most of the current work focuses only on reducing …


Deploying, Improving And Evaluating Edge Bundling Methods For Visualizing Large Graphs, Jieting Wu Nov 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

A tremendous increase in the scale of graphs has been witnessed in a wide range of fields, which demands efficient and effective visualization techniques to help users better understand large graphs. Conventional node-link diagrams are often used to visualize graphs, but excessive edge crossings can easily cause severe visual clutter in the node-link diagram of a large graph. Edge bundling can effectively remedy visual clutter and reveal high-level graph structures. Although significant efforts have been devoted to developing edge bundling, three challenging problems remain. First, edge bundling techniques are often computationally expensive and are not easy to deploy …


Controller Evolution And Divergence: A Software Perspective, Balaji Balasubramaniam Nov 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Successful controllers evolve as they are refined, extended, and adapted to new systems and contexts. This evolution occurs in the controller design and also in its software implementation. Model-based design and controller synthesis can help to synchronize this evolution of design and software, but such synchronization is rarely complete as software tends to also evolve in response to elements rarely present in a control model, leading to mismatches between the control design and the software.

In this thesis, we perform a first-of-its-kind study on the evolution of two popular open-source safety-critical autopilot control software systems, ArduPilot and Paparazzi, to better …


EvoAlloy: An Evolutionary Approach For Analyzing Alloy Specifications, Jianghao Wang Nov 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Using mathematical notations and logical reasoning, formal methods precisely define a program’s specifications, from which we can instantiate valid instances of a system. With these techniques, we can perform a variety of analysis tasks to verify system dependability and rigorously prove the correctness of system properties. While there exist well-designed automated verification tools, including ones considered lightweight, they still lack strong adoption in practice. The essence of the problem is that when applied to large real-world applications, they are neither scalable nor applicable due to the expense of a thorough verification process. In this thesis, I present a new …


A Comprehensive Framework To Replicate Process-Level Concurrency Faults, Supat Rattanasuksun Nov 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Concurrency faults are one of the most damaging types of faults that can affect the dependability of today’s computer systems. Currently, concurrency faults such as process-level races, order violations, and atomicity violations represent the largest class of faults that has been reported to various Linux bug repositories. Clearly, existing approaches for testing such faults during software development processes are not adequate as these faults escape in-house testing efforts and are discovered during deployment and must be debugged.

The main reason concurrency faults are hard to test is that the conditions that allow them to occur can be difficult to replicate, …


Supporting Diverse Customers And Prioritized Traffic In Next-Generation Passive Optical Networks, Naureen Hoque Nov 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

The already high demand for more bandwidth usage has been growing rapidly. Access network traffic is usually bursty in nature and the present traffic trend is mostly video-dominant. This motivates the need for higher transmission rates in the system. At the same time, the deployment costs and maintenance expenditures have to be reasonable. Therefore, Passive Optical Networks (PON) are considered promising next-generation access technologies. As the existing PON standards are not suitable to support future-PON services and applications, the FSAN (Full Service Access Network) group and the ITU-T (Telecommunication Standardization Sector of the International Telecommunication Union) have worked on developing …


Optical Wireless Data Center Networks, Abdelbaset S. Hamza Oct 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Bandwidth- and computation-intensive Big Data applications in disciplines like social media, bio- and nano-informatics, Internet-of-Things (IoT), and real-time analytics are pushing existing access and core (backbone) networks as well as Data Center Networks (DCNs) to their limits. Next-generation DCNs must support continuously increasing network traffic while satisfying minimum performance requirements of latency, reliability, flexibility and scalability. Therefore, a larger number of cables (i.e., copper cables and fiber optics) may be required in conventional wired DCNs. In addition to limiting the possible topologies, a large number of cables may result in design and development problems related to wire ducting and maintenance, heat …


Minutes & Seconds: The Scientists, Patrick Aievoli Sep 2018


Zea E-Books Collection

Minutes & Seconds is a captivating, intelligible read for those who strive to understand where the “what if” moment has gone. Succeeding his other captivating books, Aievoli’s deep introspective lens dials his readers in to awaken the proverbial sleeping giant inside of our consciousness. He designs an insightful, exciting romp through the surreal landscape of our society and illustrates how various pioneers have led us to a crossroads. I’m truly impressed with Aievoli’s perspicacious comprehension of where digital has taken us through the hands of these select individuals. --Sequoyah Wharton

In creating Minutes & Seconds, Aievoli has assembled an interesting …


Higher-Level Consistencies: Where, When, And How Much, Robert J. Woodward Sep 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Determining whether or not a Constraint Satisfaction Problem (CSP) has a solution is NP-complete. CSPs are solved by inference (i.e., enforcing consistency), conditioning (i.e., doing search), or, more commonly, by interleaving the two mechanisms. The most common consistency property enforced during search is Generalized Arc Consistency (GAC). In recent years, new algorithms that enforce consistency properties stronger than GAC have been proposed and shown to be necessary to solve difficult problem instances.

We frame the question of balancing the cost and the pruning effectiveness of consistency algorithms as the question of determining where, when, and how much of a higher-level …


Scaling Up An Infrastructure For Controlled Experimentation With Testing Techniques, Wayne D. Motycka Aug 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Software testing research often involves reproducing previous experimental results. Previous work created a repository infrastructure for containment and dissemination of testable research subjects using a private centralized storage mechanism for hosting these test subject archives. While this is a good way to store these subjects, it can be inefficient when the size of subjects increases or the number of versions of the subject’s source code is large. The download of these large subjects from a centralized repository can be quite large and may occasionally fail, requiring the user to repeat the download request. Coupled with the limited resources …


Data Mining Ancient Script Image Data Using Convolutional Neural Networks, Shruti Daggumati, Peter Revesz Jun 2018


CSE Conference and Workshop Papers

The recent surge in ancient scripts has resulted in huge image libraries of ancient texts. Data mining of the collected images enables the study of the evolution of these ancient scripts. In particular, the origin of the Indus Valley script is highly debated. We use convolutional neural networks to test which Phoenician alphabet letters and Brahmi symbols are closest to the Indus Valley script symbols. Surprisingly, our analysis shows that overall the Phoenician alphabet is much closer than the Brahmi script to the Indus Valley script symbols.


Application Of Cosine Similarity In Bioinformatics, Srikanth Maturu May 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Finding sequences similar to an input query sequence (DNA or proteins) in a sequence data set is an important problem in bioinformatics. It gives researchers an intuition of what could be related or how the search space can be reduced for further tasks. An exact brute-force nearest-neighbor algorithm used for this task has complexity O(m * n), where n is the database size and m is the query size. Such an algorithm faces time-complexity issues as the database and query sizes increase. Furthermore, the use of alignment-based similarity measures such as minimum edit distance adds additional complexity to the …
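
As a rough illustration of the brute-force search this abstract describes, the sketch below scores a query against every database sequence using cosine similarity. The k-mer count featurization, the function names, and the sample sequences are assumptions made for this example; the truncated abstract does not say how the thesis represents sequences.

```python
import math
from collections import Counter


def kmer_counts(seq, k=3):
    """Count overlapping k-mers in a sequence (assumed featurization)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction
    return dot / (norm_a * norm_b)


def nearest(query, database, k=3):
    """Brute force: compare the query against every database entry."""
    q = kmer_counts(query, k)
    return max(database, key=lambda s: cosine_similarity(q, kmer_counts(s, k)))


# Hypothetical toy data, not taken from the thesis.
database = ["TTTTTTTT", "ACGTACGA", "GGGGCCCC"]
best = nearest("ACGTACGT", database)
```

Every query still visits every database entry, so this sketch has the same linear scan cost that the abstract identifies as the bottleneck; it only replaces an alignment-based score with a vector comparison.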


Assessing The Quality And Stability Of Recommender Systems, David Shriver May 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Recommender systems help users find products they may like when they lack personal experience or face an overwhelmingly large set of items. However, assessing the quality and stability of recommender systems can present challenges for developers. First, traditional accuracy metrics for validating the quality of recommendations, such as precision and recall, offer only a coarse, one-dimensional view of system performance. Second, assessing the stability of a recommender system requires generating new data and retraining the system, which is expensive. In this work, we present two new approaches for assessing the quality and stability of recommender systems to address these …
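
For readers unfamiliar with the traditional metrics named in this abstract, here is a minimal sketch of precision and recall computed over the top k recommendations. The function name, the cutoff parameter, and the sample items are assumptions for illustration and do not come from the thesis.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Return (precision, recall) for the first k recommended items.

    recommended: ranked list of item ids
    relevant: set of item ids the user actually liked
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example data.
precision, recall = precision_recall_at_k(["a", "b", "c", "d"], {"a", "c", "e"}, k=2)
```

Each metric collapses a whole ranked list into a single number, which is the coarse view the abstract refers to.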


Consensus Ensemble Approaches Improve De Novo Transcriptome Assemblies, Adam Voshall May 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Accurate and comprehensive transcriptome assemblies lay the foundation for a range of analyses, such as differential gene expression analysis, metabolic pathway reconstruction, novel gene discovery, or metabolic flux analysis. With the arrival of next-generation sequencing technologies it has become possible to acquire the whole transcriptome data rapidly even from non-model organisms. However, the problem of accurately assembling the transcriptome for any given sample remains extremely challenging, especially in species with a high prevalence of recent gene or genome duplications, those with alternative splicing of transcripts, or those whose genomes are not well studied. This thesis provides a detailed overview of …


Performance Evaluation Of V-eNodeB Using Virtualized Radio Resource Management, Sai Keerti Teja Boddepalli May 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

With the upsurge in demand for high-bandwidth services, the continuous increase in the number of cellular subscriptions, the adoption of the Internet of Things (IoT), and marked growth in Machine-to-Machine (M2M) traffic, great stress is exerted on cellular network infrastructure. The present wireline and wireless networking technologies are rigid in nature and heavily hardware-dependent, so upgrading the infrastructure to keep up with future demand is cumbersome and expensive.

Software-defined networks (SDN) hold the promise to decrease network rigidity by providing central control and flow abstraction, which in current network setups are hardware-based. The embrace of SDN …


Cost-Effective Techniques For Continuous Integration Testing, Jingjing Liang Apr 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Continuous integration (CI) development environments allow software engineers to frequently integrate and test their code. While CI environments provide advantages, they also utilize non-trivial amounts of time and resources. To address this issue, researchers have adapted techniques for test case prioritization (TCP) and regression test selection (RTS) to CI environments.

To date, TCP techniques under CI environments have operated on test suites and have not achieved substantial improvements. In this thesis, we use a lightweight approach based on test suite failure and execution history that “continuously” prioritizes commits that are waiting for execution in response to the arrival of …


Modelling And Visualizing Selected Molecular Communication Processes In Biological Organisms: A Multi-Layer Perspective, Aditya Immaneni Apr 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Future pervasive communication and computing devices are envisioned to be tightly integrated with biological systems, i.e., the Internet of Bio-Nano Things. In particular, the study and exploitation of existing processes for biochemical information exchange and elaboration in biological systems are currently at the forefront of this research direction. Molecular Communication (MC), which studies biochemical information systems with theory and tools from computer communication engineering, has recently been proposed to model and characterize these processes. Combined with the rapidly growing field of bio-informatics, which creates a rich profusion of biological data and tools to mine the underlying information, …


Internet Of Underground Things: Sensing And Communications On The Field For Precision Agriculture, Mehmet C. Vuran, Abdul Salam, Rigoberto Wong, Suat Irmak Feb 2018


CSE Conference and Workshop Papers

The projected increases in world population and the need for food have recently motivated the adoption of information technology solutions in crop fields within precision agriculture approaches. The Internet of underground things (IOUT), which consists of sensors and communication devices partly or completely buried underground for real-time soil sensing and monitoring, emerges from this need. This new paradigm facilitates seamless integration of underground sensors, machinery, and irrigation systems with the complex social network of growers, agronomists, crop consultants, and advisors. In this paper, state-of-the-art communication architectures are reviewed, and underlying sensing technology and communication mechanisms for IOUT are presented. Recent advances in the …


Scheduling In MapReduce Clusters, Chen He Feb 2018


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

MapReduce is a framework proposed by Google for processing huge amounts of data in a distributed environment. The simplicity of the programming model and the fault-tolerance feature of the framework make it very popular in Big Data processing.

As MapReduce clusters get popular, their scheduling becomes increasingly important. On one hand, many MapReduce applications have high performance requirements, for example, on response time and/or throughput. On the other hand, with the increasing size of MapReduce clusters, the energy-efficient scheduling of MapReduce clusters becomes inevitable. These scheduling challenges, however, have not been systematically studied.

The objective of this dissertation is to …
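
The programming model mentioned at the start of this abstract can be sketched on a single machine as a word count, the customary introductory example. This is a toy illustration under stated assumptions: the function names are hypothetical, nothing here is distributed or fault-tolerant, and it does not touch the scheduling questions the dissertation studies.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values collected for one key."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over all documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input documents.
counts = word_count(["a b a", "b c"])
```

In a real cluster the map and reduce calls run as separate tasks on different nodes, and deciding where and when those tasks run is the scheduling problem the abstract refers to.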


Cyber Security And Risk Society: Estonian Discourse On Cyber Risk And Security Strategy, Lauren Kook Jan 2018


Copyright, Fair Use, Scholarly Communication, etc.

The main aim of this thesis is to call for a new analysis of cyber security which departs from traditional security theory. I argue that the cyber domain is inherently different in nature, in that it lacks traditional boundaries and is reflexive. Policy-makers are aware of these characteristics, and in turn this awareness changes the way that national cyber security strategy is handled and understood. These changes cannot be adequately understood through a traditional understanding of security, as they often are, without missing significant details. Rather, examining these changes through the lens of Ulrich Beck’s risk …