Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Computer Sciences

Articles 1 - 30 of 1259

Full-Text Articles in Entire DC Network

Inexact Tensor Methods And Their Application To Stochastic Convex Optimization, Artem Agafonov, Dmitry Kamzolov, Pavel Dvurechensky, Alexander Gasnikov, Martin Takac Dec 2020

Machine Learning Faculty Publications

We propose general non-accelerated and accelerated tensor methods under inexact information on the derivatives of the objective and analyze their convergence rates. Further, we provide conditions on the inexactness in each derivative that are sufficient for each algorithm to achieve a desired accuracy. As a corollary, we propose stochastic tensor methods for convex optimization and obtain sufficient mini-batch sizes for each derivative. © 2020, CC BY.


Sensitivity Analysis Of An Agent-Based Simulation Model Using Reconstructability Analysis, Andey M. Nunes, Martin Zwick, Wayne Wakeland Dec 2020

Systems Science Faculty Publications and Presentations

Reconstructability analysis, a methodology based on information theory and graph theory, was used to perform a sensitivity analysis of an agent-based model. The NetLogo BehaviorSpace tool was employed to do a full 2^k factorial parameter sweep on Uri Wilensky’s Wealth Distribution NetLogo model, to which a Gini-coefficient convergence condition was added. The analysis identified the most influential predictors (parameters and their interactions) of the Gini coefficient wealth inequality outcome. Implications of this type of analysis for building and testing agent-based simulation models are discussed.
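The two ingredients named in this abstract, a full two-level factorial sweep and the Gini coefficient, can be illustrated with a minimal Python sketch. It is not the authors' NetLogo or BehaviorSpace setup: the parameter names and levels below are hypothetical placeholders, and the Gini function uses the standard rank-weighted formula on a plain list of wealth values.

```python
from itertools import product


def full_factorial(levels):
    """Return every combination of the given parameter levels.

    With two levels per parameter and k parameters this yields 2**k runs.
    """
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[name] for name in names))]


def gini(values):
    """Gini coefficient of non-negative values (0 means perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n,
    # where i is the 1-based rank of x_i in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical parameters and levels, not the model's real settings.
design = full_factorial({
    "param_a": [1, 10],
    "param_b": [0.1, 0.9],
    "param_c": [5, 50],
})
print(len(design))           # three two-level parameters give 2**3 = 8 runs
print(gini([0, 0, 0, 10]))   # one agent holds all the wealth
```

In a sweep like the one described, each dictionary in `design` would configure one simulation run, and the Gini coefficient of the resulting wealth distribution would be the outcome to analyze.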


PyXtal_FF: A Python Library For Automated Force Field Generation, Howard Yanxon, David Zagaceta, Binh Tang, David S. Matteson, Qiang Zhu Dec 2020

Physics & Astronomy Faculty Research

We present PyXtal_FF, a package based on the Python programming language, for developing machine learning potentials (MLPs). The aim of PyXtal_FF is to promote the application of atomistic simulations by providing several choices of atom-centered descriptors and machine learning regressions in one platform. Based on the given choice of descriptors (including the atom-centered symmetry functions, embedded atom density, SO4 bispectrum, and smooth SO3 power spectrum), PyXtal_FF can train MLPs with either generalized linear regression or neural network models, by simultaneously minimizing the errors of energy/forces/stress tensors in comparison with the data from ab-initio simulations. The trained MLP model from PyXtal_FF is interfaced with …


Data: The Good, The Bad And The Ethical, John D. Kelleher, Filipe Cabral Pinto, Luis M. Cortesao Dec 2020

Articles

It is often the case with new technologies that it is very hard to predict their long-term impacts and as a result, although new technology may be beneficial in the short term, it can still cause problems in the longer term. This is what happened with oil by-products in different areas: the use of plastic as a disposable material did not take into account the hundreds of years necessary for its decomposition and its related long-term environmental damage. Data is said to be the new oil. The message to be conveyed is associated with its intrinsic value. But as in …


Image Spam Classification With Deep Neural Networks, Ajay Pal Singh, Katerina Potika Dec 2020

Faculty Publications, Computer Science

Image classification is a fundamental problem of computer vision and pattern recognition. We focus on images that contain spam. Spam is unwanted bulk content, and image spam is unwanted content embedded inside images. Image spam potentially threatens the credibility of any email-based communication system. While many machine learning techniques are successful in detecting text-based spam, this is not the case for image spam, which can easily evade these textual-spam detection systems. In our work, we explore and evaluate four deep learning techniques that detect image spam. First, we train deep neural networks using …


Extending Import Detection Algorithms For Concept Import From Two To Three Biomedical Terminologies, Vipina K. Keloth, James Geller, Yan Chen, Julia Xu Dec 2020

Publications and Research

Background: While enrichment of terminologies can be achieved in different ways, filling gaps in the IS-A hierarchy backbone of a terminology appears especially promising. To avoid difficult manual inspection, we started a research program in 2014, investigating terminology densities, where the comparison of terminologies leads to the algorithmic discovery of potentially missing concepts in a target terminology. While candidate concepts have to be approved for import by an expert, the human effort is greatly reduced by algorithmic generation of candidates. In previous studies, a single source terminology was used with one target terminology.

Methods: In this paper, we are extending …


Finding All ε-Good Arms In Stochastic Bandits, Blake Mason, Lalit Jain, Ardhendu S. Tripathy, Robert Nowak Dec 2020

Computer Science Faculty Research & Creative Works

The pure-exploration problem in stochastic multi-armed bandits aims to find one or more arms with the largest (or near largest) means. Examples include finding an ε-good arm, best-arm identification, top-k arm identification, and finding all arms with means above a specified threshold. However, the problem of finding all ε-good arms has been overlooked in past work, although arguably this may be the most natural objective in many applications. For example, a virologist may conduct preliminary laboratory experiments on a large candidate set of treatments and move all ε-good treatments into more expensive clinical trials. Since the ultimate clinical efficacy is …
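As a rough illustration of the target set only, the Python sketch below assumes the common additive definition, in which an arm is ε-good when its mean is within ε of the best mean, and assumes the true means are already known. That second assumption removes the actual difficulty of the bandit problem, which is estimating the means from noisy samples, so this is not the algorithm studied in the paper. The function name and the example values are hypothetical.

```python
def epsilon_good_arms(means, epsilon):
    """Indices of arms whose mean is within epsilon of the best mean.

    Assumes the additive definition and known means; a real bandit
    algorithm would have to estimate the means by sampling the arms.
    """
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    best = max(means)
    return [i for i, mean in enumerate(means) if mean >= best - epsilon]


# Hypothetical treatment success rates for four candidate arms.
candidates = epsilon_good_arms([0.90, 0.85, 0.50, 0.88], epsilon=0.1)
print(candidates)
```

With ε set to 0 the set shrinks to the best arm (plus any exact ties), which is the best-arm identification objective the abstract mentions.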


THaW Publications, Carl Landwehr, David Kotz Dec 2020

Computer Science Technical Reports

In 2013, the National Science Foundation's Secure and Trustworthy Cyberspace program awarded a Frontier grant to a consortium of four institutions, led by Dartmouth College, to enable trustworthy cybersystems for health and wellness. As of this writing, the Trustworthy Health and Wellness (THaW) project's bibliography includes more than 130 significant publications produced with support from the THaW grant; these publications document the progress made on many fronts by the THaW research team. The collection includes dissertations, theses, journal papers, conference papers, workshop contributions and more. The bibliography is organized as a Zotero library, which provides ready access to citation materials …


How Live Streaming And Twitch Have Changed The Gaming Industry, Krystal Ruiz Dec 2020

ART 108: Introduction to Games Studies

Live streaming has become a booming industry whose content comes from “streamers” who broadcast numerous events and real-time interactions live while chatting with viewers, drawing huge and growing audiences (Adamovich). Twitch has especially excelled at garnering attention as one of the most popular live streaming platforms focused on broadcasting and viewing video game content (Adamovich). Twitch has grown rapidly within the last few years, asserting its dominance as one of the major forces in the games industry and becoming a multi-billion-dollar business (Adamovich). For example, according to Descrier, in 2016 there were approximately 292 …


Distributed De Novo Assembler For Large-Scale Long-Read Datasets, Sayan Goswami, Kisung Lee, Seung Jong Park Dec 2020

Computer Science Faculty Research & Creative Works

Third-generation DNA sequencing technologies such as single-molecule real-time sequencing (SMRT) and nanopore sequencing have the potential to fill the gaps in the existing genome databases since the raw sequences produced by these machines are much longer than those of previous generations and therefore result in more contiguous assemblies. However, these long reads have a high error rate, which makes the assembly process computationally challenging. Moreover, since existing long-read assemblers are designed to run on a single machine, they either take days to complete or run out of memory on even moderate-sized datasets. In this paper, we present a distributed long-read …


On Improving The Memorability Of System-Assigned Recognition-Based Passwords, Mahdi Nasrullah Al-Ameen, Sonali T. Marne, Kanis Fatema, Matthew Wright, Shannon Scielzo Dec 2020

Computer Science Faculty and Staff Publications

User-chosen passwords reflecting common strategies and patterns ease memorization but offer uncertain and often weak security, while system-assigned passwords provide a higher security guarantee but suffer from poor memorability. We thus examine a technique to enhance password memorability that incorporates a scientific understanding of long-term memory. In particular, we examine the efficacy of providing users with verbal cues: real-life facts corresponding to system-assigned keywords. We also explore the usability gain of including images related to the keywords along with verbal cues. In our multi-session lab study with 52 participants, a textual recognition-based scheme offering verbal cues had a significantly higher login success …


The Impact Of Shigeru Miyamoto On The Game Design Industry, Luan Tran Dec 2020

ART 108: Introduction to Games Studies

Nintendo started as a small company in the 1970s that sold playing cards. Having seen the exemplary gift in his son, Miyamoto's father arranged for an interview with the president of Nintendo Hiroshi Yamauchi. Consequently, Miyamoto got a position in 1977 as an apprentice in the planning department after showing his toy creations to the president. He became the first Nintendo artist as he helped create the art for the first original coin-operated arcade game. The approach demonstrated his innate abilities that would help him become the ultimate guru in the industry. Through individual discovery, Miyamoto has managed to produce …


Survey On Deep Neural Networks In Speech And Vision Systems, M. Alam, Manar D. Samad, Lasitha Vidyaratne, Alexander Glandon, Khan M. Iftekharuddin Dec 2020

Computer Science Faculty Research

This survey presents a review of state-of-the-art deep neural network architectures, algorithms, and systems in speech and vision applications. Recent advances in deep artificial neural network algorithms and architectures have spurred rapid innovation and development of intelligent speech and vision systems. With the availability of vast amounts of sensor data and cloud computing for processing and training of deep neural networks, and with increased sophistication in mobile and embedded technology, next-generation intelligent systems are poised to revolutionize personal and commercial computing. This survey begins by providing the background and evolution of some of the most successful deep learning models for intelligent …


A Novel Spatiotemporal Prediction Method Of Cumulative Covid-19 Cases, Junzhe Cai Dec 2020

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Prediction methods are important for many applications. In particular, an accurate prediction of the total number of cases for pandemics such as the Covid-19 pandemic could aid medical preparedness by ensuring a timely and sufficient supply of testing kits, hospital beds and medical personnel. This thesis experimentally compares the accuracy of ten prediction methods for the cumulative number of Covid-19 pandemic cases. These ten methods include two types of neural networks and extrapolation methods based on best fit linear, best fit quadratic, best fit cubic and Lagrange interpolation, as well as an extrapolation method from Revesz. We also consider the …


Suffix Tree, Minwise Hashing And Streaming Algorithms For Big Data Analysis In Bioinformatics, Sairam Behera Dec 2020

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

In this dissertation, we worked on several algorithmic problems in bioinformatics using mainly three approaches: (a) a streaming model, (b) suffix-tree based indexing, and (c) minwise-hashing (minhash) and locality-sensitive hashing (LSH). The streaming models are useful for large data problems where a good approximation needs to be achieved with limited space usage. We developed an approximation algorithm (Kmer-Estimate) using the streaming approach to obtain a better estimation of the frequency of k-mer counts. A k-mer, a subsequence of length k, plays an important role in many bioinformatics analyses such as genome distance estimation. We also developed new methods that use …
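To make the k-mer definition concrete, here is a minimal Python sketch that counts every contiguous length-k window of a sequence exactly. It is only a baseline illustration: it keeps all counts in memory, so it is not the streaming approximation (Kmer-Estimate) that the dissertation describes, and the function name and example sequence are hypothetical.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Exact counts of all contiguous length-k substrings of a sequence."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence of length n has n - k + 1 windows (none if n < k).
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Hypothetical toy sequence; real inputs would be sequencing reads.
counts = count_kmers("ACGTACGT", 3)
for kmer, freq in sorted(counts.items()):
    print(kmer, freq)
```

Exact counting like this needs memory proportional to the number of distinct k-mers, which is the cost that limited-space streaming approaches aim to avoid.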


Semiotic Aggregation In Deep Learning, Bogdan Muşat, Răzvan Andonie Dec 2020

All Faculty Scholarship for the College of the Sciences

Convolutional neural networks utilize a hierarchy of neural network layers. The statistical aspects of information concentration in successive layers can provide insight into the feature abstraction process. We analyze the saliency maps of these layers from the perspective of semiotics, also known as the study of signs and sign-using behavior. In computational semiotics, this aggregation operation (known as superization) is accompanied by a decrease of spatial entropy: signs are aggregated into supersigns. Using spatial entropy, we compute the information content of the saliency maps and study the superization processes which take place between successive layers of the network. In …


Data Science In The Time Of Covid-19, Tony Breitzman Dec 2020

Faculty Scholarship for the College of Science & Mathematics

No abstract provided.


Representational Learning Approach For Predicting Developer Expertise Using Eye Movements, Sumeet Maan Dec 2020

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

The thesis analyzes an existing eye-tracking dataset collected while software developers were solving bug fixing tasks in an open-source system. The analysis is performed using a representational learning approach, namely a Multi-layer Perceptron (MLP). The novel aspect of the analysis is the introduction of a new feature engineering method based on the eye-tracking data. This is then used to predict developer expertise on the data. The dataset used in this thesis is inherently more complex because it is collected in a very dynamic environment, i.e., the Eclipse IDE using an eye-tracking plugin, iTrace. Previous work in this area only worked on …


Spatial Frequency Implications For Global And Local Processing In Autistic Children, Riya Mody, Ayra Tusneem, Louanne Boyd, Vincent Berardi Dec 2020

Student Scholar Symposium Abstracts and Posters

Visual processing in humans is done by integrating and updating multiple streams of global and local sensory input. Interaction between these two systems can be disrupted in individuals with ASD and other learning disabilities. When this integration is not done smoothly, it becomes difficult to see the “big picture”, which has been found to have implications for emotion recognition, social skills, and conversation skills. An example of this phenomenon is local interference, in which local details are prioritized over global features. Previous research in this field has aimed to decrease local interference by developing and evaluating a filter …


Factors Affecting Computer Science Research Productivity And Impact In Nigeria: A Bibliometric Evidence, Azubuike Ezenwoke Dec 2020

Library Philosophy and Practice (e-journal)

Computer science is a burgeoning research field and has the potential to accelerate the rate of industrialisation and subsequently, economic development. Using bibliometric data obtained from Scopus, this study employed a 15-year bibliometric analysis to highlight Nigeria’s productivity and impact trends in the computer science research landscape. Our findings are summarised as follows: First, Nigeria’s computer science research contribution and citations are meager in comparison to the global output. Secondly, international collaboration is generally weak as most collaborations are national in scope. Third, Nigeria’s computer science-related research is published in low-quality outlets, as Scopus has discontinued the indexing of most …


Building Postsecondary Pathways For Latinx Students In Computing: Lessons From Hispanic-Serving Institutions, Anne-Marie Núñez, David S. Knight, Sanga Kim Dec 2020

Departmental Technical Reports (CS)

While the COVID-19 pandemic has transformed the use of technology in education and the workforce, a shortage of computer scientists continues, and computing remains one of the least diverse STEM disciplines. Efforts to diversify the computing industry often focus on the most selective postsecondary institutions, which are predominantly White. We highlight the role of Hispanic-Serving Institutions (HSIs) in graduating large numbers of STEM graduates of color, particularly Latinx students. HSIs are uniquely positioned to leverage asset-based approaches that value students’ cultural background. We describe the practices educators use in the Computing Alliance for Hispanic-Serving Institutions, a network of 40 HSIs …


Deep Learning For Screening Covid-19 Using Chest X-Ray Images, Sanhita Basu, Sushmita Mitra, Nilanjan Saha Dec 2020

ISI Best Publications

With the ever increasing demand for screening millions of prospective 'novel coronavirus' or COVID-19 cases, and due to the emergence of high false negatives in the commonly used PCR tests, the necessity for probing an alternative simple screening mechanism of COVID-19 using radiological images (like chest X-Rays) assumes importance. In this scenario, machine learning (ML) and deep learning (DL) offer fast, automated, effective strategies to detect abnormalities and extract key features of the altered lung parenchyma, which may be related to specific signatures of the COVID-19 virus. However, the available COVID-19 datasets are inadequate to train deep neural networks. Therefore, …


Evaluating The Reproducibility Of Physiological Stress Detection Models, Varun Mishra, Sougata Sen, Grace Chen, Tian Hao, Jeffrey Rogers, Ching-Hua Chen, David Kotz Dec 2020

Dartmouth Scholarship

Recent advances in wearable sensor technologies have led to a variety of approaches for detecting physiological stress. Even with over a decade of research in the domain, there still exist many significant challenges, including a near-total lack of reproducibility across studies. Researchers often use some physiological sensors (custom-made or off-the-shelf), conduct a study to collect data, and build machine-learning models to detect stress. There is little effort to test the applicability of the model with similar physiological data collected from different devices, or the efficacy of the model on data collected from different studies, populations, or demographics.

This paper takes …


IS2020 A Competency Model For Undergraduate Programs In Information Systems: The Joint ACM/AIS IS2020 Task Force, Paul Leidig, Hannu Salmela Dec 2020

Peer-Reviewed Publications

The IS2020 report is the latest in a series of model curricula recommendations and guidelines for undergraduate degrees in Information Systems (IS). The report builds on the foundations developed in previous model curricula reports to develop a major revision of the model curriculum with the inclusion of significant new characteristics. Specifically, the IS2020 report does not directly prescribe a degree structure that targets a specific context or environment. Rather, the IS2020 report provides guidance regarding the core content of the curriculum that should be present but also provides flexibility to customize curricula according to local institutional needs.


On The Generation, Structure, And Semantics Of Grammar Patterns In Source Code Identifiers, Christian D. Newman, Reem S. Alsuhaibani, Michael J. Decker, Anthony Peruma, Dishant Kaushik, Mohamed Wiem Mkaouer, Emily Hill Dec 2020

Articles

Identifier names are the atoms of program comprehension. Weak identifier names decrease developer productivity and degrade the performance of automated approaches that leverage identifier names in source code analysis, threatening many of the advantages which stand to be gained from advances in artificial intelligence and machine learning. Therefore, it is vital to support developers in naming and renaming identifiers. In this paper, we extend our prior work, which studies the primary method through which names evolve: rename refactorings. In our prior work, we contextualize rename changes by examining commit messages and other refactorings. In this extension, we further consider data …


Language-Driven Region Pointer Advancement For Controllable Image Captioning, Annika Lindh, Robert J. Ross, John D. Kelleher Dec 2020

Conference papers

Controllable Image Captioning is a recent sub-field in the multi-modal task of Image Captioning wherein constraints are placed on which regions in an image should be described in the generated natural language caption. This puts a stronger focus on producing more detailed descriptions, and opens the door for more end-user control over results. A vital component of the Controllable Image Captioning architecture is the mechanism that decides the timing of attending to each region through the advancement of a region pointer. In this paper, we propose a novel method for predicting the timing of region pointer advancement by treating the …


Improving Binary Classification Using Filtering Based On k-NN Proximity Graphs, Maher Ala’Raj, Munir Majdalawieh, Maysam F. Abbod Dec 2020

All Works

© 2020, The Author(s). One way to increase recognition ability in classification problems is to remove outlier entries as well as redundant and unnecessary features from the training set. Filtering and feature selection can have a large impact on classifier accuracy and area under the curve (AUC), as noisy data can confuse the classifier and lead it to learn wrong patterns from the training data. The common approach in data filtering is to use proximity graphs. However, the problem of selecting the optimal filtering parameters is still insufficiently researched. In this paper, a filtering procedure based on a k-nearest neighbours proximity graph was used. Filtering parameters …


Human Parsing Based Texture Transfer From Single Image To 3D Human Via Cross-View Consistency, Fang Zhao, Shengcai Liao, Kaihao Zhang, Ling Shao Dec 2020

Machine Learning Faculty Publications

This paper proposes a human parsing based texture transfer model via cross-view consistency learning to generate the texture of a 3D human body from a single image. We use the semantic parsing of the human body as input to provide both shape and pose information, which reduces the appearance variation of human images and preserves the spatial distribution of semantic parts. Meanwhile, in order to improve the prediction of textures for invisible parts, we explicitly enforce consistency across different views of the same subject by exchanging the textures predicted by two views to render images during training. The perceptual loss …


Differential Privacy Protection Over Deep Learning: An Investigation Of Its Impacted Factors, Ying Lin, Ling-Yan Bao, Ze-Minghui Li, Shu-Sheng Si, Chao-Hsien Chu Dec 2020

Research Collection School Of Computing and Information Systems

Deep learning (DL) has been widely applied to achieve promising results in many fields, but various privacy concerns and issues remain. Applying differential privacy (DP) to DL models is an effective way to ensure privacy-preserving training and classification. In this paper, we revisit the DP stochastic gradient descent (DP-SGD) method, which has been used by several algorithms and systems and has achieved good privacy protection. However, several factors, such as the sequence of adding noise, the models used, etc., may impact its performance to varying degrees. We empirically show that adding noise first and clipping second will not only …


Interventional Few-Shot Learning, Zhongqi Yue, Zhang Hanwang, Qianru Sun, Xian-Sheng Hua Dec 2020

Research Collection School Of Computing and Information Systems

We uncover an ever-overlooked deficiency in the prevailing Few-Shot Learning (FSL) methods: the pre-trained knowledge is indeed a confounder that limits the performance. This finding is rooted in our causal assumption: a Structural Causal Model (SCM) for the causalities among the pre-trained knowledge, sample features, and labels. Thanks to it, we propose a novel FSL paradigm: Interventional Few-Shot Learning (IFSL). Specifically, we develop three effective IFSL algorithmic implementations based on the backdoor adjustment, which is essentially a causal intervention towards the SCM of many-shot learning: the upper-bound of FSL in a causal view. It is worth noting that the contribution …