Full-Text Articles in Programming Languages and Compilers

On The Generation, Structure, And Semantics Of Grammar Patterns In Source Code Identifiers, Christian D. Newman, Reem S. Alsuhaibani, Michael J. Decker, Anthony Peruma, Dishant Kaushik, Mohamed Wiem Mkaouer, Emily Hill Dec 2020

Articles

Identifier names are the atoms of program comprehension. Weak identifier names decrease developer productivity and degrade the performance of automated approaches that leverage identifier names in source code analysis, threatening many of the advantages which stand to be gained from advances in artificial intelligence and machine learning. Therefore, it is vital to support developers in naming and renaming identifiers. In this paper, we extend our prior work, which studies the primary method through which names evolve: rename refactorings. In our prior work, we contextualize rename changes by examining commit messages and other refactorings. In this extension, we further consider data …


A BERT-Based Dual Embedding Model For Chinese Idiom Prediction, Minghuan Tan, Jing Jiang Dec 2020

Research Collection School Of Computing and Information Systems

Chinese idioms are special fixed phrases usually derived from ancient stories, whose meanings are oftentimes highly idiomatic and non-compositional. The Chinese idiom prediction task is to select the correct idiom from a set of candidate idioms given a context with a blank. We propose a BERT-based dual embedding model to encode the contextual words as well as to learn dual embeddings of the idioms. Specifically, we first match the embedding of each candidate idiom with the hidden representation corresponding to the blank in the context. We then match the embedding of each candidate idiom with the hidden representations of all …


Actor Concurrency Bugs: A Comprehensive Study On Symptoms, Root Causes, API Usages, And Differences, Mehdi Bagherzadeh, Nicholas Fireman, Anas Shawesh, Raffi T. Khatchadourian Nov 2020

Publications and Research

Actor concurrency is becoming increasingly important in the development of real-world software systems. Although actor concurrency may be less susceptible to some multithreaded concurrency bugs, such as low-level data races and deadlocks, it comes with its own bugs that may be different. However, the fundamental characteristics of actor concurrency bugs, including their symptoms, root causes, API usages, examples, and differences when they come from different sources are still largely unknown. Actor software development can significantly benefit from a comprehensive qualitative and quantitative understanding of these characteristics, which is the focus of this work, to foster better API documentations, development practices, …


CSCI 49380/79526: Fundamentals Of Reactive Programming - Assignment 5, Raffi T. Khatchadourian Nov 2020

Open Educational Resources

No abstract provided.


Cross-Thought For Sentence Encoder Pre-Training, Shuohang Wang, Yuwei Fang, Siqi Sun, Zhe Gan, Yu Cheng, Jingjing Liu, Jing Jiang Nov 2020

Research Collection School Of Computing and Information Systems

In this paper, we propose Cross-Thought, a novel approach to pre-training sequence encoder, which is instrumental in building reusable sequence embeddings for large-scale NLP tasks such as question answering. Instead of using the original signals of full sentences, we train a Transformer-based sequence encoder over a large set of short sequences, which allows the model to automatically select the most useful information for predicting masked words. Experiments on question answering and textual entailment tasks demonstrate that our pre-trained encoder can outperform state-of-the-art encoders trained with continuous sentence signals as well as traditional masked language modeling baselines. Our proposed approach also …


Collections In Scala, Raffi T. Khatchadourian Oct 2020

Open Educational Resources

No abstract provided.


Espade: An Efficient And Semantically Secure Shortest Path Discovery For Outsourced Location-Based Services, Bharath K. Samanthula, Divyadharshini Karthikeyan, Boxiang Dong, K. Anitha Kumari Oct 2020

Department of Computer Science Faculty Scholarship and Creative Works

With the rapid growth of smart devices and technological advancements in tracking geospatial data, the demand for Location-Based Services (LBS) is facing a constant rise in several domains, including military, healthcare and transportation. It is a natural step to migrate LBS to a cloud environment to achieve on-demand scalability and increased resiliency. Nonetheless, outsourcing sensitive location data to a third-party cloud provider raises a host of privacy concerns as the data owners have reduced visibility and control over the outsourced data. In this paper, we consider outsourced LBS where users want to retrieve map directions without disclosing their location information. …


CSCI 49380/79526: Fundamentals Of Reactive Programming - Assignment 4, Raffi T. Khatchadourian Oct 2020

Open Educational Resources

No abstract provided.


CSCI 49380/79526: Fundamentals Of Reactive Programming - Assignment 3, Raffi T. Khatchadourian Oct 2020

Open Educational Resources

No abstract provided.


CSCI 49380/79526: Fundamentals Of Reactive Programming - Assignment 1, Raffi T. Khatchadourian Oct 2020

Open Educational Resources

No abstract provided.


CSCI 49380/79526: Fundamentals Of Reactive Programming - Syllabus, Raffi T. Khatchadourian Oct 2020

Open Educational Resources

No abstract provided.


CSCI 49380/79526: Fundamentals Of Reactive Programming - Assignment 2, Raffi T. Khatchadourian Oct 2020

Open Educational Resources

No abstract provided.


Inheritance Details In Scala, Raffi T. Khatchadourian Sep 2020

Open Educational Resources

No abstract provided.


Evaluating Performance Of OpenMP Tasks In A Seismic Stencil Application, Eric Raut, Jie Meng, Mauricio Araya-Polo, Barbara Chapman Sep 2020

Department of Applied Mathematics & Statistics Faculty Publications

Simulations based on stencil computations (widely used in geosciences) have been dominated by the MPI+OpenMP programming model paradigm. Little effort has been devoted to experimenting with task-based parallelism in this context. We address this by introducing OpenMP task parallelism into the kernel of an industrial seismic modeling code, Minimod. We observe that even for these highly regular stencil computations, taskified kernels are competitive with traditional OpenMP-augmented loops, and in some experiments tasks even outperform loop parallelism.

This promising result sets the stage for more complex computational patterns. Simulations involve more than just the stencil calculation: a collection of kernels is …


A Fortran-Keras Deep Learning Bridge For Scientific Computing, Jordan Ott, Mike Pritchard, Natalie Best, Erik Linstead, Milan Curcic, Pierre Baldi Aug 2020

Engineering Faculty Articles and Research

Implementing artificial neural networks is commonly achieved via high-level programming languages such as Python and easy-to-use deep learning libraries such as Keras. These software libraries come preloaded with a variety of network architectures, provide autodifferentiation, and support GPUs for fast and efficient computation. As a result, a deep learning practitioner will favor training a neural network model in Python, where these tools are readily available. However, many large-scale scientific computation projects are written in Fortran, making it difficult to integrate with modern deep learning methods. To alleviate this problem, we introduce a software library, the Fortran-Keras Bridge (FKB). This two-way …


An Empirical Study Of Refactorings And Technical Debt In Machine Learning Systems, Yiming Tang, Raffi T. Khatchadourian, Mehdi Bagherzadeh, Rhia Singh, Ajani Stewart, Anita Raja Aug 2020

Publications and Research

Machine Learning (ML), including Deep Learning (DL), systems, i.e., those with ML capabilities, are pervasive in today's data-driven society. Such systems are complex; they are comprised of ML models and many subsystems that support learning processes. As with other complex systems, ML systems are prone to classic technical debt issues, especially when such systems are long-lived, but they also exhibit debt specific to these systems. Unfortunately, there is a gap of knowledge in how ML systems actually evolve and are maintained. In this paper, we fill this gap by studying refactorings, i.e., source-to-source semantics-preserving program transformations, performed in real-world, open-source …


VisualOCV: Refined Dataflow Programming Interface For OpenCV, John Boggess Aug 2020

MS in Computer Science Project Reports

OpenCV is a popular tool for developing computer vision algorithms; however, prototyping OpenCV-based algorithms is a time consuming and iterative process. VisualOCV is an open source tool to help users better understand and create computer vision algorithms. A user can see how data is processed at each step in their algorithm, and the results of any changes to the algorithm will be displayed to the user immediately. This can allow the user to easily experiment with various computer vision methods and their parameters. EyeCalc 1.0 uses the Microsoft Foundation Class Library, an old GUI framework by Microsoft, and contains various …


Automated Synthesis Of Local Time Requirement For Service Composition, Étienne André, Tian Huat Tan, Manman Chen, Shuang Liu, Jun Sun, Yang Liu, Jin Song Dong Jul 2020

Research Collection School Of Computing and Information Systems

Service composition aims at achieving a business goal by composing existing service-based applications or components. The response time of a service is crucial, especially in time-critical business environments, which is often stated as a clause in service-level agreements between service providers and service users. To meet the guaranteed response time requirement of a composite service, it is important to select a feasible set of component services such that their response time will collectively satisfy the response time requirement of the composite service. In this work, we use the BPEL modeling language that aims at specifying Web services. We extend it …


What Was Written Vs. Who Read It: News Media Profiling Using Text Analysis And Social Media Context, Ramy Baly, Georgi Karadzhov, Jisun An, Haewoon Kwak, Yoan Dinkov, Ahmed Ali, James Glass, Preslav Nakov Jul 2020

Research Collection School Of Computing and Information Systems

Predicting the political bias and the factuality of reporting of entire news outlets are critical elements of media profiling, which is an understudied but an increasingly important research direction. The present level of proliferation of fake, biased, and propagandistic content online has made it impossible to fact-check every single suspicious claim, either manually or automatically. Thus, it has been proposed to profile entire news outlets and to look for those that are likely to publish fake or biased content. This makes it possible to detect likely “fake news” the moment they are published, by simply checking the reliability of their …


An Empirical Study On The Use And Misuse Of Java 8 Streams, Raffi T. Khatchadourian, Yiming Tang, Mehdi Bagherzadeh, Baishakhi Ray Apr 2020

Publications and Research

Streaming APIs allow for big data processing of native data structures by providing MapReduce-like operations over these structures. However, unlike traditional big data systems, these data structures typically reside in shared memory accessed by multiple cores. Although popular, this emerging hybrid paradigm opens the door to possibly detrimental behavior, such as thread contention and bugs related to non-execution and non-determinism. This study explores the use and misuse of a popular streaming API, namely, Java 8 Streams. The focus is on how developers decide whether or not to run these operations sequentially or in parallel and bugs both specific and tangential …


Achieving Obfuscation Through Self-Modifying Code: A Theoretical Model, Heidi Waddell Apr 2020

Senior Honors Theses

With the extreme amount of data and software available on networks, the protection of online information is one of the most important tasks of this technological age. There is no such thing as safe computing, and it is inevitable that security breaches will occur. Thus, security professionals and practices focus on two areas: security, preventing a breach from occurring, and resiliency, minimizing the damages once a breach has occurred. One of the most important practices for adding resiliency to source code is through obfuscation, a method of re-writing the code to a form that is virtually unreadable. …


Storage Management Strategy In Mobile Phones For Photo Crowdsensing, En Wang, Zhengdao Qu, Xinyao Liang, Xiangyu Meng, Yongjian Yang, Dawei Li, Weibin Meng Apr 2020

Department of Computer Science Faculty Scholarship and Creative Works

In mobile crowdsensing, some users jointly finish a sensing task through the sensors equipped in their intelligent terminals. In particular, the photo crowdsensing based on Mobile Edge Computing (MEC) collects pictures for some specific targets or events and uploads them to nearby edge servers, which leads to richer data content and more efficient data storage compared with the common mobile crowdsensing; hence, it has attracted an important amount of attention recently. However, the mobile users prefer uploading the photos through Wifi APs (PoIs) rather than cellular networks. Therefore, photos stored in mobile phones are exchanged among users, in order to …


Gradual Program Analysis, Samuel Estep Apr 2020

Senior Honors Theses

Dataflow analysis and gradual typing are both well-studied methods to gain information about computer programs in a finite amount of time. The gradual program analysis project seeks to combine those two techniques in order to gain the benefits of both. This thesis explores the background information necessary to understand gradual program analysis, and then briefly discusses the research itself, with reference to publication of work done so far. The background topics include essential aspects of programming language theory, such as syntax, semantics, and static typing; dataflow analysis concepts, such as abstract interpretation, semilattices, and fixpoint computations; and gradual typing theory, …


Incorporating Digital Ethics Throughout The Software Development Process, Michael Collins, Damian Gordon, Anna Becevel, William O'Mahony Mar 2020

Conference papers

The media is reporting scandals associated with computer companies with increasing regularity; whether it is the misuse of user data, breach of privacy concerns, the use of biased artificial intelligence, or the problems of automated vehicles. Because of these complex issues, there is a growing need to equip computer science students with a deep appreciation of ethics, and to ensure that in the future they will develop computer systems that are ethically-based. One particularly useful strand of their education to incorporate ethics into is when teaching them about the formal approaches to developing computer systems.

There are a number of …


Securing Bring-Your-Own-Device (BYOD) Programming Exams, Oka Kurniawan, Norman Tiong Seng Lee, Christopher M. Poskitt Mar 2020

Research Collection School Of Computing and Information Systems

Traditional pen and paper exams are inadequate for modern university programming courses as they are misaligned with pedagogies and learning objectives that target practical coding ability. Unfortunately, many institutions lack the resources or space to be able to run assessments in dedicated computer labs. This has motivated the development of bring-your-own-device (BYOD) exam formats, allowing students to program in a similar environment to how they learnt, but presenting instructors with significant additional challenges in preventing plagiarism and cheating. In this paper, we describe a BYOD exam solution based on lockdown browsers, software which temporarily turns students' laptops into secure workstations …


McDPC: Multi‐Center Density Peak Clustering, Yizhang Wang, Di Wang, Xiaofeng Zhang, Wei Pang, Chunyan Miao, Ah-Hwee Tan, You Zhou Feb 2020

Research Collection School Of Computing and Information Systems

Density peak clustering (DPC) is a recently developed density-based clustering algorithm that achieves competitive performance in a non-iterative manner. DPC is capable of effectively handling clusters with single density peak (single center), i.e., based on DPC’s hypothesis, one and only one data point is chosen as the center of any cluster. However, DPC may fail to identify clusters with multiple density peaks (multi-centers) and may not be able to identify natural clusters whose centers have relatively lower local density. To address these limitations, we propose a novel clustering algorithm based on a hierarchical approach, named multi-center density peak clustering (McDPC). …


Arduino Microcontrollers In The Classroom: Teaching How To Phrase Effective Science Questions And How To Answer Them With Original Data, Tony Dinsmore Jan 2020

Science and Engineering Saturday Seminars

Arduino microcontrollers in the classroom: teaching how to phrase effective science questions and how to answer them with original data. Prof. Tony Dinsmore, UMass Physics. This workshop will develop course modules that address a challenge in the science curriculum: how do we teach basic problem-solving and curiosity-based research skills in a classroom setting? The standard science curriculum teaches concepts and theory quite well but leaves rather little opportunity for students to take the lead in designing and implementing their own investigations. The workshop will use the Arduino, an inexpensive microcontroller that is simple to set up. A huge range of …


Safe Automated Refactoring For Intelligent Parallelization Of Java 8 Streams, Raffi T. Khatchadourian, Yiming Tang, Mehdi Bagherzadeh Jan 2020

Publications and Research

Streaming APIs are becoming more pervasive in mainstream Object-Oriented programming languages and platforms. For example, the Stream API introduced in Java 8 allows for functional-like, MapReduce-style operations in processing both finite, e.g., collections, and infinite data structures. However, using this API efficiently involves subtle considerations such as determining when it is best for stream operations to run in parallel, when running operations in parallel can be less efficient, and when it is safe to run in parallel due to possible lambda expression side-effects. […] semantics-preserving fashion. The approach, based on a novel data ordering and typestate analysis, consists of preconditions and …


Building Something With The Raspberry Pi, Richard Kordel Jan 2020

Presidential Research Grants

In 2017 Ryan Korn and I submitted a grant proposal in the annual Harrisburg University President’s Grant process. Our proposal was to partner with a local high school to install a classroom of 20 Raspberry Pi’s, along with the requisite peripherals. In that classroom students would be challenged to design something that combined programming with physical computing. In our presentation to the school we suggested that this project would give students the opportunity to be “amazing.”

As part of the grant, the top three students would be given scholarships to HU and the top five finalists would all be permitted …