Open Access. Powered by Scholars. Published by Universities.®
Articles 1 - 28 of 28
Full-Text Articles in Entire DC Network
Smart Applications And Resource Management In Internet Of Things, Zeinab Akhavan
Computer Science ETDs
Internet of Things (IoT) technologies are currently the principal solutions driving smart cities. New technologies such as Cyber-Physical Systems, 5G, and data analytics have emerged to address various city infrastructure issues ranging from transportation and energy management to healthcare systems. An IoT setting primarily consists of a wide range of users and devices forming a massive network that interacts with different layers of the city infrastructure, generating a sheer volume of data to enable smart city services. The goal of smart city services is to create value for the entire ecosystem, whether this is health, education, transportation, energy, …
Computational Study Of The Effect Of Geometry On Molecular Interactions, Sarika Kumar
Computer Science ETDs
The specificity and predictability of DNA make it an excellent programmable material and have allowed bio-programmers to build sophisticated molecular circuits. These molecular devices should be precise, correct, and function as intended. In order to implement these circuits, the challenge is to build a robust, reliable, and scalable logic circuit with, ideally, minimal unwanted signal release. Performing experiments is expensive and time-consuming, so modeling and analyzing these bio-molecular systems becomes crucial in designing molecular circuits. This dissertation aimed to develop algorithms and build computational tools for automated analysis of molecular circuits that incorporate the molecular geometry of nanostructures. Molecular circuits …
Roadside Lidar Data Processing For Intelligent Transportation System, Md Parvez Mollah
Computer Science ETDs
Roadside LiDAR (Light Detection and Ranging) sensors have recently been explored for Intelligent Transportation Systems, aiming at safer and faster traffic management and vehicular operations. However, massive data volume, occlusion, and limited viewing angles are significant obstacles to the widespread use of roadside LiDARs. In this dissertation, we address three major challenges to enable applications of Intelligent Transportation Systems through roadside LiDAR data: (i) real-time transmission of the massive point-cloud data from the roadside LiDAR devices to the cloud using a 5G network, (ii) mitigating the sensor occlusion problem to increase coverage and detect events that occur in occluded regions of a sensor, …
Domain Specific Feature Representation Learning For Diverse Temporal Data, Farhan Asif Chowdhury
Computer Science ETDs
Humans can leverage domain context to recognize novel patterns and categories based on limited known examples. In contrast, computational learning methods are not adept at exploiting context and require sufficient labeled examples to achieve similar accuracy. Many temporal data domains, for example, seismic signals and oil mining sensor data, require domain expert annotation, which is both costly and time-consuming. The dependency on training data limits the applicability of machine learning algorithms for domains with limited labeled data. This dissertation aims to address this gap by developing temporal mining algorithms that exploit domain context to learn discriminative feature representation from limited …
Improving Human-Automation Collaboration In Motion Planning, Torin J. Adamson
Computer Science ETDs
Human-automation collaboration is becoming a part of everyday life as AI helps us drive, make decisions, and solve a variety of other tasks. However, safe and effective collaboration systems depend on factors in trust, communication, and more. Existing studies to explore these are typically carried out in laboratory settings, providing robust data under tight environmental control. However, human behavior evolves over time, driven by external factors that cannot be fully captured in single participation sessions. These factors form the "human context", contextualizing the behavioral data for a more complete understanding. In this thesis, video game adaptations upon conventional subject studies …
Machine Learning Methods For Computational Phenotyping Using Patient Healthcare Data With Noisy Labels, Praveen Kumar
Computer Science ETDs
Positive and Unlabeled (PU) learning problems abound in many real-world applications. In healthcare informatics, diagnosed patients are considered labeled positive for a specific disease, but being undiagnosed does not mean they can be labeled negative. PU learning can improve classification performance and estimate the positive fraction, α, among unlabeled samples. However, algorithms based on the Selected Completely At Random (SCAR) assumption are inadequate when the SCAR assumption fails (e.g., severe cases overrepresented), and when class imbalance is substantial. This dissertation presents and evaluates new algorithms to overcome these limitations. The proposed methods outperform the state of the art for α-estimation, enhance classification performance, …
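The relationship between the labeled share and α under SCAR can be made concrete with a small worked example. This is only a sketch of the textbook SCAR identity, not the dissertation's algorithms: it assumes the label frequency c = P(labeled | positive) is already known, which in practice it is not, and the function name and parameters are hypothetical.

```python
def positive_fraction_among_unlabeled(n_labeled: int, n_total: int,
                                      label_frequency: float) -> float:
    """Estimate alpha, the share of positives hidden among unlabeled samples.

    Assumes SCAR: every positive is labeled with the same probability
    `label_frequency` (an assumed, known constant for this illustration).
    """
    if not 0.0 < label_frequency <= 1.0:
        raise ValueError("label_frequency must be in (0, 1]")
    if not 0 <= n_labeled < n_total:
        raise ValueError("need 0 <= n_labeled < n_total")

    labeled_share = n_labeled / n_total            # P(labeled)
    class_prior = labeled_share / label_frequency  # P(positive) under SCAR
    if class_prior > 1.0:
        raise ValueError("inputs imply more positives than samples")

    # Positives that were not labeled, as a share of all unlabeled samples.
    return (class_prior - labeled_share) / (1.0 - labeled_share)
```

For instance, with 200 of 1000 patients diagnosed and an assumed label frequency of 0.5, the identity gives a class prior of 0.4 and α of about 0.25. When selection depends on severity, c is no longer a single constant and this identity breaks down, which is the failure case the abstract describes.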
Cognizant Composites: Seamless Integration Of Circuitry And Sensors Into Structural Composites, Reuben Fresquez
Computer Science ETDs
This thesis describes a set of novel techniques for embedding sensors, circuitry, and electronics into structural composites. I leverage recent developments in human computer interaction to create sensors and circuitry that are seamlessly incorporated into structural composites. I fabricate bend and compression sensors, along with circuitry, from textiles, which enables me to add electronic capabilities without impacting the composite’s structural integrity. I describe the construction of these “cognizant composites” and demonstrate their functionality. I also explore techniques for embedding standard electronic components, including microcontrollers, into structural composites. Potential applications of this technology include buildings that can warn occupants if load-bearing …
Learning Intermediate Representations For Question Answering Systems, Zakery T. Clarke
Computer Science ETDs
Question answering systems are models that can perform natural language processing (NLP) on a question, retrieve an answer from a data source, and communicate it to a user. In question answering systems, it is important for the system to learn an underlying representation for a piece of text. There are many systems that have achieved incredible accuracy on question answering datasets such as the Stanford Question Answering Dataset (SQuAD), but these systems often encode their knowledge in a manner that is impossible to verify. Many current models would benefit more from verifiability than from marginal accuracy improvements.
We propose a method …
Implementation Of Uniform Interpolation Algorithms, Jose A. Castellanos Joo
Computer Science ETDs
This thesis discusses algorithms for the uniform interpolation problem and presents their implementation for the following theories: (quantifier-free) equality with uninterpreted functions (EUF), unit two-variable per inequality (UTVPI), and theoretic aspects for the combination of the two previous theories. The uniform interpolation algorithms implemented in this thesis were originally proposed in \cite{KAPUR2017}. Refutational proof-based solutions are the usual approach of many interpolation algorithms \cite{10.1007/978-3-642-00768-2_34, mcmillan2011interpolants, 10.1007/978-3-540-24730-2_2}. The approach taken in \cite{KAPUR2017} relies on quantifier-elimination heuristics to construct a uniform interpolant using one of the two formulas involved in the interpolation problem. The latter makes it possible to study the complexity …
Statistical Modeling Of HPC Performance Variability And Communication, Jered B. Dominguez-Trujillo
Computer Science ETDs
Understanding the performance of parallel and distributed programs remains a focal point in determining how compute systems can be optimized to achieve exascale performance. Lightweight, statistical models allow developers to both characterize and predict performance trade-offs, especially as HPC systems become more heterogeneous with many-core CPUs and GPUs. This thesis presents a lightweight, statistical modeling approach of performance variation which leverages extreme value theory by focusing on the maximum length of distributed workload intervals. This approach was implemented in MPI and evaluated on several HPC systems and workloads. I then present a performance model of partitioned communication which also uses …
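To illustrate why the maximum interval length matters, the following sketch compares a closed-form expected maximum with a simulation. It assumes, purely for illustration, that per-rank interval lengths are independent and exponentially distributed; that distribution, the function names, and the parameters are assumptions and not the model used in the thesis.

```python
import random


def harmonic(n: int) -> float:
    """Return the n-th harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


def expected_max_exponential(n_ranks: int, mean: float) -> float:
    """Expected maximum of n_ranks i.i.d. exponential intervals: mean * H_n."""
    return mean * harmonic(n_ranks)


def simulated_mean_max(n_ranks: int, mean: float,
                       trials: int = 2000, seed: int = 0) -> float:
    """Average, over many trials, of the slowest rank's interval length."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # A synchronizing step finishes only when the slowest rank does.
        total += max(rng.expovariate(1.0 / mean) for _ in range(n_ranks))
    return total / trials
```

Under these assumptions the expected slowest interval grows roughly logarithmically with the number of ranks even though the per-rank mean stays fixed, which is why the tail of the distribution, rather than its average, dominates at scale.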
Sybil Defense Using Efficient Resource Burning, Diksha Gupta
Computer Science ETDs
In 1993, Dwork and Naor proposed using computational puzzles, a resource burning mechanism, to combat spam email. In the ensuing three decades, resource burning has broadened to include communication capacity, computer memory, and human effort. It has become a well-established tool in distributed security. Due to the cost attached to using resource burning mechanisms, they have not been popularized in domains apart from cryptocurrency.
In this dissertation, we design efficient resource burning based Sybil defense techniques for permissionless systems. As a first step, we identify existing resource burning mechanisms in literature in Chapter 2. Additionally, we enumerate numerous open problems …
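A computational puzzle of the kind mentioned above can be sketched with a hash-based search. This is a generic hash-preimage puzzle chosen for illustration, not the original Dwork and Naor construction and not the dissertation's defense; the function names, the 8-byte nonce encoding, and the difficulty parameter are all assumptions.

```python
import hashlib
from itertools import count


def _digest_value(challenge: bytes, nonce: int) -> int:
    """Interpret SHA-256(challenge || nonce) as a 256-bit integer."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def solve_puzzle(challenge: bytes, difficulty_bits: int) -> int:
    """Search for a nonce whose digest has `difficulty_bits` leading zero bits.

    The expected work is about 2**difficulty_bits hash evaluations.
    """
    target = 1 << (256 - difficulty_bits)
    for nonce in count():
        if _digest_value(challenge, nonce) < target:
            return nonce


def verify_puzzle(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Check a claimed solution with a single hash evaluation."""
    return _digest_value(challenge, nonce) < (1 << (256 - difficulty_bits))
```

The asymmetry is the point: the solver burns many hash evaluations while the verifier spends one, so an identity that wants to join pays a cost that is cheap to check. Each extra difficulty bit doubles the expected solving cost.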
Data Mining Of Chinese Social Networks: Factors That Indicate Post Deletion, Meisam Navaki Arefi
Computer Science ETDs
Widespread Chinese social media applications such as Sina Weibo (Chinese Twitter), the most popular social network in China, are widely known for monitoring and deleting posts to conform to Chinese government requirements. Censorship of Chinese social media is a complex process that involves many factors. There are multiple stakeholders and many different interests: economic, political, legal, personal, etc., which means that there is not a single strategy dictated by a single government authority. Moreover, sometimes Chinese social media do not follow the directives of government, out of concern that they are more strictly censoring than their competitors.
One crucial question …
Promotional Campaigns In The Era Of Social Platforms, Noor E. Abu-El-Rub
Computer Science ETDs
The rise of social media has facilitated the diffusion of information to more easily reach millions of users. While some users connect with friends and organically share information and opinions on social media, others have exploited these platforms to gain influence and profit through promotional campaigns and advertising. The existence of promotional campaigns contributes to the spread of misleading information, spam, and fake news. Thus, these campaigns affect the trustworthiness and reliability of social media and render it as a crowd advertising platform. This dissertation studies the existence of promotional campaigns in social media and explores different ways users and …
Shared-Environment Call-By-Need, George W. Stelle
Computer Science ETDs
Call-by-need semantics formalizes the wisdom that work should be done at most once. It frees programmers to focus more on the correctness of their code, and less on the operational details. Because of this property, programmers of lazy functional languages rely heavily on their compiler to both preserve correctness and generate high-performance code for high-level abstractions. In this dissertation I present a novel technique for compiling call-by-need semantics by using shared environments to share results of computation. I show how the approach enables a compiler that generates high-performance code, while staying simple enough to lend itself to formal reasoning. …
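The "at most once" property can be illustrated with a memoizing thunk. This is a minimal sketch of call-by-need evaluation in general, not the shared-environment compilation technique the dissertation presents; the `Thunk` class and its `force` method are hypothetical names.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class Thunk(Generic[T]):
    """A suspended computation that runs at most once (call-by-need)."""

    def __init__(self, compute: Callable[[], T]) -> None:
        self._compute: Optional[Callable[[], T]] = compute
        self._forced = False
        self._value: Optional[T] = None

    def force(self) -> T:
        """Run the computation on first use, then return the cached result."""
        if not self._forced:
            assert self._compute is not None
            self._value = self._compute()
            self._forced = True
            self._compute = None  # release the closure once it has run
        return self._value  # type: ignore[return-value]
```

A thunk that is never forced does no work at all (the laziness), and one that is forced many times does the work once (the sharing). Call-by-name would recompute on every use, and call-by-value would compute even when the result is never needed.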
Adaptive Parallelism For Coupled, Multithreaded Message-Passing Programs, Samuel K. Gutiérrez
Computer Science ETDs
Hybrid parallel programming models that combine message passing (MP) and shared-memory multithreading (MT) are becoming more popular, especially with applications requiring higher degrees of parallelism and scalability. Consequently, coupled parallel programs, those built via the integration of independently developed and optimized software libraries linked into a single application, increasingly comprise message-passing libraries with differing preferred degrees of threading, resulting in thread-level heterogeneity. Retroactively matching threading levels between independently developed and maintained libraries is difficult, and the challenge is exacerbated because contemporary middleware services provide only static scheduling policies over entire program executions, necessitating suboptimal, over-subscribed or under-subscribed, configurations. In …
Multi-Resolution Analysis Of Large Molecular Structures And Interactions, Kasra Manavi
Computer Science ETDs
Simulation of large molecular structures and their interactions has become a major component of modern biomolecular research. Methods to simulate these types of molecules span a wide array of resolutions, from all-atom molecular dynamics to model interaction energetics to systems of linear equations to evaluate population kinetics. In recent years, there has been an acceleration of molecular structural information production, primarily from X-ray crystallography and electron microscopy. This data has provided modelers the ability to produce better representations of these molecular structures. The purpose of this research is to take advantage of this information to develop multi-resolution models for …
Criticality Assessments For Improving Algorithmic Robustness, Thomas B. Jones
Computer Science ETDs
Though computational models typically assume all program steps execute flawlessly, that does not imply all steps are equally important if a failure should occur. In the "Constrained Reliability Allocation" problem, sufficient resources are guaranteed for operations that prompt eventual program termination on failure, but those operations that only cause output errors are given a limited budget of some vital resource, insufficient to ensure correct operation for each of them.
In this dissertation, I present a novel representation of failures based on their timing and location, combined with criticality assessments, a method used to predict the behavior of systems …
Mining Temporal Activity Patterns On Social Media, Nikan Chavoshi
Computer Science ETDs
Social media provide communication networks for their users to easily create and share content. Automated accounts, called bots, abuse these platforms by engaging in suspicious and/or illegal activities. Bots push spam content and participate in sponsored activities to expand their audience. The prevalence of bot accounts in social media can harm the usability of these platforms, and decrease the level of trustworthiness in them. The main goal of this dissertation is to show that temporal analysis facilitates detecting bots in social media. I introduce new bot detection techniques which exploit temporal information. Since automated accounts are controlled by computer programs, …
Automatic Conversation Review For Intelligent Virtual Assistants, Ian R. Beaver
Computer Science ETDs
When reviewing the performance of Intelligent Virtual Assistants (IVAs), it is desirable to prioritize conversations involving misunderstood human inputs. These conversations uncover errors in natural language understanding and help prioritize and expedite improvements to the IVA. As human reviewer time is valuable and manual analysis is time-consuming, prioritizing the conversations where misunderstanding has likely occurred reduces costs and speeds improvement. A system for measuring the post hoc risk of missed intent associated with a single human input is presented. Numerous indicators of risk are explored and implemented. These indicators are combined using various means and evaluated on real-world data. …
Next Generation TCP/IP Side Channels, Xu Zhang
Computer Science ETDs
Side channel techniques have been developed in recent years to fulfill various tasks in modern computer network measurements. However, due to their nature, these techniques are typically limited in terms of both fidelity and their ability to be used on the real Internet without raising ethical concerns because of packet rates. I propose the next generation of TCP/IP side channel techniques that exploit information flow in modern systems’ network stacks to overcome weaknesses in previous techniques. The proposed work is novel, non-intrusive, and can carry out measurements with high fidelity. I achieved this by deeply understanding the behaviors of modern …
Measuring Decentralization Of Chinese Censorship In Three Industry Segments, Jeffrey Knockel
Computer Science ETDs
What is forbidden to talk about using Chinese apps? Companies operating in China face a complex array of regulations and are liable for content voiced using their platforms. Previous work studying Chinese censorship uses (1) sample testing or (2) measurements of content deletion; however, these techniques produce an incomplete picture biased toward (1) the tested samples or (2) whichever topics were trending.
In this dissertation, I use reverse engineering to study the code that applications use to determine whether to censor content. In doing so, I can provide a more complete and unbiased view of Chinese Internet censorship. I reverse engineer …
Improving HPC Communication Library Performance On Modern Architectures, Matthew G. F. Dosanjh
Computer Science ETDs
As high-performance computing (HPC) systems advance towards exascale (10^18 operations per second), they must leverage increasing levels of parallelism to achieve their performance goals. In addition to increased parallelism, machines of that scale will have strict power limitations placed on them. One direction currently being explored to alleviate those issues is many-core processors such as Intel’s Xeon Phi line. Many-core processors sacrifice clock speed and core complexity, such as out-of-order pipelining, to increase the number of cores on a die. While this increases floating point throughput, it can reduce the performance of serialized, synchronized, and latency-sensitive code …
Distributed Knowledge Discovery For Diverse Data, Hossein Hamooni
Computer Science ETDs
In the era of new technologies, computer scientists deal with massive datasets hundreds of terabytes in size. Smart cities, social networks, health care systems, large sensor networks, etc. are constantly generating new data. It is non-trivial to extract knowledge from big datasets because traditional data mining algorithms are impractical to run on them. Distributed systems help with this problem but introduce new challenges in designing scalable algorithms. The transition from traditional algorithms to ones that can be run on a distributed platform should be done carefully. Researchers should design the modern distributed algorithms based on the …
Spam, Fraud, And Bots: Improving The Integrity Of Online Social Media Data, Amanda Jean Minnich
Computer Science ETDs
Online data contains a wealth of information, but as with most user-generated content, it is full of noise, fraud, and automated behavior. The prevalence of "junk" and fraudulent text affects users, businesses, and researchers alike. To make matters worse, there is a lack of ground truth data for these types of text, and the appearance of the text is constantly changing as fraudsters adapt to pressures from hosting sites. The goal of my dissertation is therefore to extract high-quality content from and identify fraudulent and automated behavior in large, complex social media datasets in the absence of ground truth data. …
Search In T Cell And Robot Swarms: Balancing Extent And Intensity, George M. Fricke
Computer Science ETDs
This work investigates effective search and resource collection algorithms for swarms. Deterministic spiral algorithms and Lévy search processes have been shown to be optimal for single searchers. We extend these strategies to swarms of robots and populations of T cells and measure performance under a variety of conditions.
Search extent and intensity lie on a continuum: more intensive patterns search thoroughly in the local area, while extensive patterns cover more area but may miss targets nearby. We show that the most efficient trade-off between search intensity and extent for swarms depends strongly on the distribution of targets, swarm size …
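The trade-off between extent and intensity can be seen in how a Lévy search draws its step lengths. The sketch below samples power-law step lengths by inverse-transform sampling and walks them in uniformly random directions. It is a generic single-searcher Lévy walk, not the swarm strategies evaluated in this work, and the function names, the exponent `mu`, and the minimum step length are assumptions.

```python
import math
import random
from typing import List, Tuple


def levy_step_length(rng: random.Random, mu: float,
                     min_length: float = 1.0) -> float:
    """Draw a step length with density proportional to length**(-mu).

    Uses inverse-transform sampling of a power law on [min_length, inf).
    """
    if mu <= 1.0:
        raise ValueError("mu must be greater than 1")
    u = rng.random()  # uniform in [0, 1), so 1 - u is in (0, 1]
    return min_length * (1.0 - u) ** (-1.0 / (mu - 1.0))


def levy_walk(n_steps: int, mu: float, seed: int = 0) -> List[Tuple[float, float]]:
    """Return the visited positions of a 2-D walk with power-law steps."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        length = levy_step_length(rng, mu)
        heading = rng.uniform(0.0, 2.0 * math.pi)
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        path.append((x, y))
    return path
```

In this sketch a smaller `mu` produces more frequent long relocations (a more extensive search), while a larger `mu` keeps the searcher near its current position (a more intensive search).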
Characterizing And Improving Power And Performance In HPC Networks, Taylor L. Groves
Computer Science ETDs
Networks are the backbone of modern HPC systems. They serve as a critical piece of infrastructure, tying together applications, analytics, storage and visualization. Despite this importance, we have not fully explored how evolving communication paradigms and network design will impact scientific workloads. As networks expand in the race towards Exascale (1×10^18 floating point operations a second), we need to reexamine this relationship so that the HPC community better understands (1) characteristics and trends in HPC communication; (2) how to best design HPC networks to save power or enhance the performance; (3) how to facilitate scalable, informed, and dynamic decisions within …
Theory And Practice Of Computing With Excitable Dynamics, Alireza Goudarzi
Computer Science ETDs
Reservoir computing (RC) is a promising paradigm for time series processing. In this paradigm, the desired output is computed by combining measurements of an excitable system that responds to time-dependent exogenous stimuli. The excitable system is called a reservoir and measurements of its state are combined using a readout layer to produce a target output. The power of RC is attributed to an emergent short-term memory in dynamical systems and has been analyzed mathematically for both linear and nonlinear dynamical systems. The theory of RC treats only the macroscopic properties of the reservoir, without reference to the underlying medium it …
Towards Deeper Understanding In Neuroimaging, Rex Devon Hjelm
Computer Science ETDs
Neuroimaging is a growing domain of research, with advances in machine learning having tremendous potential to expand understanding in neuroscience and improve public health. Deep neural networks have recently and rapidly achieved historic success in numerous domains, and as a consequence have completely redefined the landscape of automated learners, giving promise of significant advances in numerous domains of research. Despite recent advances and advantages over traditional machine learning methods, deep neural networks have yet to permeate significantly into neuroscience studies, particularly as a tool for discovery. This dissertation presents well-established and novel tools for unsupervised learning which aid in …