Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 22 of 22
Full-Text Articles in Physical Sciences and Mathematics
Gmaim: An Analytical Pipeline For Microrna Splicing Profiling Using Generative Model, Kan Liu
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
MicroRNAs (miRNAs) are a class of short (~22 nt) single-strand RNA molecules predominantly found in eukaryotes. Being involved in many major biological processes, miRNAs can regulate gene expression by targeting mRNAs to facilitate their degradation or translational inhibition. Imprecise splicing of miRNAs introduces severe variability in the sequences of miRNA products and in their corresponding downstream regulation of gene expression. For example, to study the biogenesis of miRNAs, biologists typically deplete a gene in the miRNA biogenesis pathway and study the change in miRNA sequences, which can cause imprecision of miRNAs. Although high-throughput sequencing technologies such as …
Scale-Out Algorithm For Apache Storm In Saas Environment, Ravi Kiran Puttaswamy
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
The main appeal of the Cloud is its cost-effective and flexible access to computing power. Apache Storm is a data processing framework used to process streaming data. In our work we explore the possibility of offering Apache Storm as a software service. Further, we take advantage of the cgroups feature in Storm to divide the computing power of a worker machine into smaller units to be offered to users. We predict that the compute bounds placed on the cgroups could be used to approximate the state of the workflow. We discuss the limitations of the current schedulers in facilitating …
Reducing The Tail Latency Of A Distributed Nosql Database, Jun Wu
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
The request latency is an important performance metric of a distributed database, such as the popular Apache Cassandra, because of its direct impact on the user experience. Specifically, the latency of a read or write request is defined as the total time interval from the instant when a user makes the request to the instant when the user receives the response, and it involves not only the actual read or write time at a specific database node, but also various types of latency introduced by the distributed mechanism of the database. Most of the current work focuses only on reducing …
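Tail latency of the kind this thesis targets is usually summarized with high percentiles of the request-latency distribution. As a minimal illustration (the latency samples and function name below are hypothetical, not from the thesis), a nearest-rank percentile can be computed as:

```python
import math

def percentile(latencies_ms, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p% of all samples are less than or equal to it."""
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(p / 100 * len(ordered)))  # 1-based nearest rank
    return ordered[rank - 1]

# Hypothetical per-request read latencies in milliseconds
samples = [3, 4, 4, 5, 5, 5, 6, 7, 9, 120]
print(percentile(samples, 50))  # median: 5 ms
print(percentile(samples, 99))  # p99 tail: 120 ms, set by a single slow request
```

Note how one straggler dominates the 99th percentile while leaving the median untouched, which is why reducing median latency alone does little for the tail.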
Deploying, Improving And Evaluating Edge Bundling Methods For Visualizing Large Graphs, Jieting Wu
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
A tremendous increase in the scale of graphs has been witnessed in a wide range of fields, which demands efficient and effective visualization techniques to assist users in better understanding large graphs. Conventional node-link diagrams are often used to visualize graphs, but excessive edge crossings can easily incur severe visual clutter in the node-link diagram of a large graph. Edge bundling can effectively remedy visual clutter and reveal high-level graph structures. Although significant efforts have been devoted to developing edge bundling, three challenging problems remain. First, edge bundling techniques are often computationally expensive and are not easy to deploy …
Controller Evolution And Divergence: A Software Perspective, Balaji Balasubramaniam
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
Successful controllers evolve as they are refined, extended, and adapted to new systems and contexts. This evolution occurs in the controller design and also in its software implementation. Model-based design and controller synthesis can help to synchronize this evolution of design and software, but such synchronization is rarely complete as software tends to also evolve in response to elements rarely present in a control model, leading to mismatches between the control design and the software.
In this thesis, we perform a first-of-its-kind study on the evolution of two popular open-source safety-critical autopilot control software systems -- ArduPilot and Paparazzi -- to better …
Evoalloy: An Evolutionary Approach For Analyzing Alloy Specifications, Jianghao Wang
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
Using mathematical notations and logical reasoning, formal methods precisely define a program’s specifications, from which we can instantiate valid instances of a system. With these techniques, we can perform a variety of analysis tasks to verify system dependability and rigorously prove the correctness of system properties. While well-designed automated verification tools exist, including ones considered lightweight, they still lack strong adoption in practice. The essence of the problem is that, when applied to large real-world applications, they are neither scalable nor applicable due to the expense of the thorough verification process. In this thesis, I present a new …
A Comprehensive Framework To Replicate Process-Level Concurrency Faults, Supat Rattanasuksun
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
Concurrency faults are one of the most damaging types of faults that can affect the dependability of today’s computer systems. Currently, concurrency faults such as process-level races, order violations, and atomicity violations represent the largest class of faults that has been reported to various Linux bug repositories. Clearly, existing approaches for testing such faults during software development processes are not adequate as these faults escape in-house testing efforts and are discovered during deployment and must be debugged.
The main reason concurrency faults are hard to test is that the conditions that allow these faults to occur can be difficult to replicate, …
Supporting Diverse Customers And Prioritized Traffic In Next-Generation Passive Optical Networks, Naureen Hoque
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
The already high demand for bandwidth has been growing rapidly. Access network traffic is usually bursty in nature, and the present traffic trend is mostly video-dominant. This motivates the need for higher transmission rates in the system. At the same time, the deployment costs and maintenance expenditures have to be reasonable. Therefore, Passive Optical Networks (PON) are considered promising next-generation access technologies. As the existing PON standards are not suitable to support future-PON services and applications, the FSAN (Full Service Access Network) group and the ITU-T (Telecommunication Standardization Sector of the International Telecommunication Union) have worked on developing …
Optical Wireless Data Center Networks, Abdelbaset S. Hamza
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
Bandwidth and computation-intensive Big Data applications in disciplines like social media, bio- and nano-informatics, Internet-of-Things (IoT), and real-time analytics are pushing existing access and core (backbone) networks as well as Data Center Networks (DCNs) to their limits. Next-generation DCNs must support continuously increasing network traffic while satisfying minimum performance requirements of latency, reliability, flexibility and scalability. Therefore, a larger number of cables (i.e., copper cables and fiber optics) may be required in conventional wired DCNs. In addition to limiting the possible topologies, a large number of cables may result in design and development problems related to wire ducting and maintenance, heat …
Minutes & Seconds: The Scientists, Patrick Aievoli
Zea E-Books Collection
Minutes & Seconds is a captivating, intelligible read for those who strive to understand where the “what if” moment has gone. Succeeding his other captivating books, Aievoli’s deep introspective lens dials his readers in to awaken the proverbial sleeping giant inside our consciousness. He designs an insightful, exciting romp through the surreal landscape of our society and illustrates how various pioneers have led us to a crossroads. I’m truly impressed with Aievoli’s perspicacious comprehension of where digital has taken us through the hands of these select individuals. --Sequoyah Wharton
In creating Minutes & Seconds, Aievoli has assembled an interesting …
Higher-Level Consistencies: Where, When, And How Much, Robert J. Woodward
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
Determining whether or not a Constraint Satisfaction Problem (CSP) has a solution is NP-complete. CSPs are solved by inference (i.e., enforcing consistency), conditioning (i.e., doing search), or, more commonly, by interleaving the two mechanisms. The most common consistency property enforced during search is Generalized Arc Consistency (GAC). In recent years, new algorithms that enforce consistency properties stronger than GAC have been proposed and shown to be necessary to solve difficult problem instances.
We frame the question of balancing the cost and the pruning effectiveness of consistency algorithms as the question of determining where, when, and how much of a higher-level …
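As background for the consistency properties discussed above, arc consistency (the binary special case of GAC) can be enforced with the classic AC-3 scheme. The sketch below is illustrative only: the toy CSP, variable names, and constraint encoding are invented for this example and are not taken from the dissertation.

```python
from collections import deque

def revise(domains, x, y, allowed):
    """Delete values of x that have no supporting value in y's domain."""
    removed = False
    for vx in list(domains[x]):
        if not any(allowed(vx, vy) for vy in domains[y]):
            domains[x].discard(vx)
            removed = True
    return removed

def ac3(domains, constraints):
    """Enforce arc consistency. constraints maps each directed arc (x, y)
    to a predicate allowed(vx, vy); returns False on a domain wipe-out."""
    queue = deque(constraints)
    while queue:
        x, y = queue.popleft()
        if revise(domains, x, y, constraints[(x, y)]):
            if not domains[x]:
                return False  # some variable has no value left: no solution
            # x's domain shrank, so every arc pointing into x must be rechecked
            queue.extend((z, w) for (z, w) in constraints if w == x and z != y)
    return True

# Toy binary CSP: X < Y with both domains {1, 2, 3}
doms = {"X": {1, 2, 3}, "Y": {1, 2, 3}}
cons = {("X", "Y"): lambda a, b: a < b, ("Y", "X"): lambda a, b: a > b}
ac3(doms, cons)
print(doms)  # X keeps {1, 2}, Y keeps {2, 3}
```

Enforcing a higher-level consistency than this prunes more values but costs more per call, which is exactly the cost/effectiveness trade-off the dissertation frames as where, when, and how much.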
Scaling Up An Infrastructure For Controlled Experimentation With Testing Techniques, Wayne D. Motycka
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
Software testing research often involves reproducing previous experimental results. Previous work created a repository infrastructure for containment and dissemination of testable research subjects, using a private centralized storage mechanism for hosting these test subject archives. While this is a good way to store these subjects, it can be inefficient when the size of subjects increases or the number of versions of a subject’s source code is large. Downloads of these large subjects from a centralized repository can be quite large and on occasion may not succeed, requiring the user to repeat the download request. Coupled with the limited resources …
Data Mining Ancient Script Image Data Using Convolutional Neural Networks, Shruti Daggumati, Peter Revesz
CSE Conference and Workshop Papers
The recent surge of interest in ancient scripts has resulted in huge image libraries of ancient texts. Data mining of the collected images enables the study of the evolution of these ancient scripts. In particular, the origin of the Indus Valley script is highly debated. We use convolutional neural networks to test which Phoenician alphabet letters and Brahmi symbols are closest to the Indus Valley script symbols. Surprisingly, our analysis shows that overall the Phoenician alphabet is much closer than the Brahmi script to the Indus Valley script symbols.
Application Of Cosine Similarity In Bioinformatics, Srikanth Maturu
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
Finding similar sequences to an input query sequence (DNA or proteins) in a sequence data set is an important problem in bioinformatics. It provides researchers with an intuition of what could be related and of how the search space can be reduced for further tasks. An exact brute-force nearest-neighbor algorithm used for this task has complexity O(m * n), where n is the database size and m is the query size. Such an algorithm faces time-complexity issues as the database and query sizes increase. Furthermore, the use of alignment-based similarity measures such as minimum edit distance adds an additional complexity to the …
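The exact brute-force baseline mentioned above can be sketched in a few lines. This is a minimal illustration, not the thesis's pipeline: the function names are invented, and the vectors stand in for some fixed-length encoding of sequences (e.g. k-mer frequency counts), which is an assumption of this sketch.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbor(query, database):
    """Exact brute-force scan: O(n * m) for n database vectors of length m."""
    return max(database, key=lambda v: cosine_similarity(query, v))

# Hypothetical k-mer frequency vectors standing in for encoded sequences
db = [[0.0, 3.0, 1.0], [2.0, 0.0, 4.0]]
print(nearest_neighbor([1.0, 0.0, 2.0], db))
```

Every query touches every database vector, which is exactly the scaling problem that motivates the approximate methods the thesis pursues.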
Assessing The Quality And Stability Of Recommender Systems, David Shriver
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
Recommender systems help users to find products they may like when lacking personal experience or facing an overwhelmingly large set of items. However, assessing the quality and stability of recommender systems can present challenges for developers. First, traditional accuracy metrics for validating the quality of recommendations, such as precision and recall, offer only a coarse, one-dimensional view of system performance. Second, assessing the stability of a recommender system requires generating new data and retraining the system, which is expensive. In this work, we present two new approaches for assessing the quality and stability of recommender systems to address these …
Consensus Ensemble Approaches Improve De Novo Transcriptome Assemblies, Adam Voshall
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
Accurate and comprehensive transcriptome assemblies lay the foundation for a range of analyses, such as differential gene expression analysis, metabolic pathway reconstruction, novel gene discovery, or metabolic flux analysis. With the arrival of next-generation sequencing technologies it has become possible to acquire the whole transcriptome data rapidly even from non-model organisms. However, the problem of accurately assembling the transcriptome for any given sample remains extremely challenging, especially in species with a high prevalence of recent gene or genome duplications, those with alternative splicing of transcripts, or those whose genomes are not well studied. This thesis provides a detailed overview of …
Performance Evaluation Of V-Enodeb Using Virtualized Radio Resource Management, Sai Keerti Teja Boddepalli
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
With the demand upsurge for high bandwidth services, continuous increase in the number of cellular subscriptions, adoption of Internet of Things (IoT), and marked growth in Machine-to-Machine (M2M) traffic, there is great stress exerted on cellular network infrastructure. The present wireline and wireless networking technologies are rigid in nature and heavily hardware-dependent, as a result of which the process of infrastructure upgrade to keep up with future demand is cumbersome and expensive.
Software-defined networks (SDN) hold the promise to decrease network rigidity by providing central control and flow abstraction, which in current network setups are hardware-based. The embrace of SDN …
Cost-Effective Techniques For Continuous Integration Testing, Jingjing Liang
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
Continuous integration (CI) development environments allow software engineers to frequently integrate and test their code. While CI environments provide advantages, they also utilize non-trivial amounts of time and resources. To address this issue, researchers have adapted techniques for test case prioritization (TCP) and regression test selection (RTS) to CI environments.
To date, TCP techniques in CI environments have operated on test suites and have not achieved substantial improvements. In this thesis, we use a lightweight approach, based on test suite failure and execution history, that “continuously” prioritizes commits that are waiting for execution in response to the arrival of …
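To make the idea of history-based prioritization concrete, here is one possible scoring scheme: rank suites by their recent failures, weighting newer outcomes more heavily. This is a hypothetical sketch for illustration; the scoring function, window size, and suite names are invented and are not the thesis's actual algorithm.

```python
def priority(history, window=5):
    """Score a test suite by recent failures. history is a list of booleans,
    newest outcome last; True means the suite failed on that run."""
    recent = history[-window:]
    # Weight newer outcomes more heavily: weights 1..len(recent), newest largest
    return sum(w for w, failed in enumerate(recent, start=1) if failed)

def prioritize(suites):
    """Order (name, history) pairs so likely-failing suites run first."""
    return sorted(suites, key=lambda pair: -priority(pair[1]))

suites = [
    ("unit",        [False, False, False, False, False]),
    ("integration", [False, True, False, True, True]),
    ("ui",          [True, False, False, False, False]),
]
print([name for name, _ in prioritize(suites)])  # ['integration', 'ui', 'unit']
```

Because the score uses only bookkeeping the CI system already records, such a scheme adds almost no overhead per run, which is the "lightweight" property the thesis emphasizes.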
Modelling And Visualizing Selected Molecular Communication Processes In Biological Organisms: A Multi-Layer Perspective, Aditya Immaneni
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
The future pervasive communication and computing devices are envisioned to be tightly integrated with biological systems, i.e., the Internet of Bio-Nano Things. In particular, the study and exploitation of existing processes for the biochemical information exchange and elaboration in biological systems are currently at the forefront of this research direction. Molecular Communication (MC), which studies biochemical information systems with theory and tools from computer communication engineering, has been recently proposed to model and characterize the aforementioned processes. Combined with the rapidly growing field of bio-informatics, which creates a rich profusion of biological data and tools to mine the underlying information, …
Internet Of Underground Things: Sensing And Communications On The Field For Precision Agriculture, Mehmet C. Vuran, Abdul Salam, Rigoberto Wong, Suat Irmak
CSE Conference and Workshop Papers
The projected increases in world population and the need for food have recently motivated the adoption of information technology solutions in crop fields within precision agriculture approaches. The Internet of underground things (IOUT), which consists of sensors and communication devices partly or completely buried underground for real-time soil sensing and monitoring, emerges from this need. This new paradigm facilitates seamless integration of underground sensors, machinery, and irrigation systems with the complex social network of growers, agronomists, crop consultants, and advisors. In this paper, state-of-the-art communication architectures are reviewed, and the underlying sensing technology and communication mechanisms for IOUT are presented. Recent advances in the …
Scheduling In Mapreduce Clusters, Chen He
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
MapReduce is a framework proposed by Google for processing huge amounts of data in a distributed environment. The simplicity of the programming model and the fault-tolerance feature of the framework make it very popular in Big Data processing.
As MapReduce clusters become popular, their scheduling becomes increasingly important. On the one hand, many MapReduce applications have high performance requirements, for example, on response time and/or throughput. On the other hand, with the increasing size of MapReduce clusters, energy-efficient scheduling of MapReduce clusters becomes essential. These scheduling challenges, however, have not been systematically studied.
The objective of this dissertation is to …
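The programming model whose simplicity the abstract credits can be shown with the canonical word-count example. The sketch below emulates the map, shuffle, and reduce phases in-process; a real cluster distributes each phase across machines, and the fault-tolerance machinery is omitted entirely.

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    """Mapper: emit an intermediate (word, 1) pair for every word in a split."""
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    """Group intermediate pairs by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reducer: fold each key's list of values into a final count."""
    return {key: sum(values) for key, values in groups.items()}

# Two input splits, as a cluster would assign them to different mappers
splits = ["big data big clusters", "data processing"]
counts = reduce_phase(shuffle(chain.from_iterable(map_phase(s) for s in splits)))
print(counts)  # {'big': 2, 'data': 2, 'clusters': 1, 'processing': 1}
```

Because mappers and reducers are independent per split and per key, a scheduler is free to place, reorder, and restart them, which is what makes the placement and energy questions studied in this dissertation tractable in the first place.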
Cyber Security And Risk Society: Estonian Discourse On Cyber Risk And Security Strategy, Lauren Kook
Copyright, Fair Use, Scholarly Communication, etc.
The main aim of this thesis is to call for a new analysis of cyber security which departs from traditional security theory. I argue that the cyber domain is inherently different in nature, in that it lacks traditional boundaries and is reflexive in nature. Policy-makers are aware of these characteristics, and in turn this awareness changes the way that national cyber security strategy is handled and understood. These changes cannot be adequately understood through the traditional understanding of security, as they often are, without missing significant details. Rather, examining these changes through the lens of Ulrich Beck’s risk …