Articles 1 - 30 of 41

Full-Text Articles in Programming Languages and Compilers

A RESTful Framework For Writing, Running, And Evaluating Code In Multiple Academic Settings, Christopher Ban Dec 2017

MS in Computer Science Project Reports

In academia, students and professors want a well-structured, well-implemented framework for writing and running code in both testing and learning environments. The limitations of the paper-and-pencil medium have led to the creation of many different online grading systems. However, no known system provides all of the essential features our client is interested in. Our system, developed in conjunction with Doctor Halterman, offers the ability to build modules from flat files, compile and run code in the browser, provide users with immediate feedback, support multiple languages, and offer a module designed specifically for an …


What Do Developers Search For On The Web?, Xin Xia, Lingfeng Bao, David Lo, Pavneet Singh Kochhar, Ahmed E. Hassan, Zhenchang Xing Dec 2017

Research Collection School Of Computing and Information Systems

Developers commonly use a web search engine such as Google to locate online resources that improve their productivity. A better understanding of what developers search for could help us understand their behaviors and the problems that they encounter during the software development process. Unfortunately, we have a limited understanding of what developers frequently search for and of the search tasks that they often find challenging. To address this gap, we collected search queries from 60 developers and surveyed 235 software engineers from more than 21 countries across five continents. In particular, we asked our survey participants to rate the …


Sudoku App: Model-Driven Development Of Android Apps Using OCL?, Yoonsik Cheon, Aditi Barua Nov 2017

Departmental Technical Reports (CS)

Model-driven development (MDD) shifts the focus of software development from writing code to building models by developing an application as a series of transformations on models, including eventual code generation. Can the key ideas of MDD be applied to the development of apps for Android, one of the most popular mobile platforms today? To answer this question, we perform a small case study of developing an Android app for playing Sudoku puzzles. We use the Object Constraint Language (OCL) as the notation for creating precise models and translate OCL constraints to Android Java code. Our findings are mixed in …


On Locating Malicious Code In Piggybacked Android Apps, Li Li, Daoyuan Li, Tegawende F. Bissyande, Jacques Klein, Haipeng Cai, David Lo, Yves Le Traon Nov 2017

Research Collection School Of Computing and Information Systems

To devise efficient approaches and tools for detecting malicious packages in the Android ecosystem, researchers are increasingly required to have a deep understanding of malware. There is thus a need to provide a framework for dissecting malware and locating malicious program fragments within app code in order to build a comprehensive dataset of malicious samples. Towards addressing this need, we propose in this work a tool-based approach called HookRanker, which provides ranked lists of potentially malicious packages based on the way malware behaviour code is triggered. With experiments on a ground truth of piggybacked apps, we are able to automatically …


A Semantics Comparison Workbench For A Concurrent, Asynchronous, Distributed Programming Language, Claudio Corrodi, Alexander Heußner, Christopher M. Poskitt Nov 2017

Research Collection School Of Computing and Information Systems

A number of high-level languages and libraries have been proposed that offer novel and simple-to-use abstractions for concurrent, asynchronous, and distributed programming. The execution models that realise them, however, often change over time (whether to improve performance or to extend them to new language features), potentially affecting behavioural and safety properties of existing programs. This is exemplified by SCOOP, a message-passing approach to concurrent object-oriented programming that has seen multiple changes proposed and implemented, with demonstrable consequences for an idiomatic usage of its core abstraction. We propose a semantics comparison workbench for SCOOP with fully and semi-automatic tools for analysing …


The Impact Of Coverage On Bug Density In A Large Industrial Software Project, Thomas Bach, Artur Andrzejak, Ralf Pannemans, David Lo Nov 2017

Research Collection School Of Computing and Information Systems

Measuring the quality of test suites is one of the major challenges of software testing. Code coverage identifies tested and untested parts of code and is frequently used to approximate test suite quality. Multiple previous studies have investigated the relationship between coverage ratio and test suite quality, without a clear consensus in the results. In this work we study whether covered code contains a smaller number of future bugs than uncovered code (assuming appropriate scaling). If this correlation holds and bug density is lower in covered code, coverage can be regarded as a meaningful metric to estimate the adequacy of testing. …
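
The comparison of bug density in covered and uncovered code can be pictured with a small calculation. The following is a minimal sketch, not the study's method: it assumes that "appropriate scaling" means dividing bug counts by lines of code, and the class name, method names, and sample figures are hypothetical.

```java
// Hypothetical helper, not from the paper: scales bug counts by code size
// so that covered and uncovered code can be compared on equal terms.
final class BugDensity {
    private BugDensity() {}

    // Bugs per 1000 lines of code (KLOC). Scaling by KLOC is an assumption.
    static double perKloc(int bugs, int linesOfCode) {
        if (linesOfCode <= 0) {
            throw new IllegalArgumentException("linesOfCode must be positive");
        }
        return bugs * 1000.0 / linesOfCode;
    }

    // True when covered code shows a strictly lower density than uncovered code.
    static boolean coveredIsCleaner(int coveredBugs, int coveredLoc,
                                    int uncoveredBugs, int uncoveredLoc) {
        return perKloc(coveredBugs, coveredLoc) < perKloc(uncoveredBugs, uncoveredLoc);
    }
}
```

With made-up figures of 5 bugs in 10,000 covered lines and 12 bugs in 8,000 uncovered lines, `coveredIsCleaner(5, 10000, 12, 8000)` compares 0.5 against 1.5 bugs per KLOC.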


Defaultification Refactoring: A Tool For Automatically Converting Java Methods To Default, Raffi T. Khatchadourian, Hidehiko Masuhara Oct 2017

Publications and Research

Enabling interfaces to declare (instance) method implementations, Java 8 default methods can be used as a substitute for the ubiquitous skeletal implementation software design pattern. Performing this transformation on legacy software manually, though, may be non-trivial. The refactoring requires analyzing complex type hierarchies, resolving multiple implementation inheritance issues, reconciling differences between class and interface methods, and analyzing tie-breakers (dispatch precedence) with overriding class methods. All of this is necessary to preserve type-correctness and confirm semantics preservation. We demonstrate an automated refactoring tool called Migrate Skeletal Implementation to Interface for transforming legacy Java code to use the new default construct. The …
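
For readers unfamiliar with the construct, here is a minimal sketch of a Java 8 default method standing in for a skeletal implementation class. The `Shape`, `Square`, and `Circle` types are hypothetical illustrations and are not taken from the paper or its tool.

```java
// Hypothetical interface for illustration only.
interface Shape {
    double area();

    // Default method: implementers inherit this body unless they override it.
    // Before Java 8, shared code like this would typically live in an abstract
    // skeletal class (for example, an AbstractShape) that implementers extend.
    default String describe() {
        return "shape with area " + area();
    }
}

// Inherits describe() directly from the interface; no skeletal superclass needed.
class Square implements Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    public double area() {
        return side * side;
    }
}

// Overrides the default; a method declared in the class takes precedence
// over the interface default when the call is dispatched.
class Circle implements Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    public double area() {
        return Math.PI * radius * radius;
    }

    @Override
    public String describe() {
        return "circle";
    }
}
```

Because `Square` no longer has to extend a skeletal class, it remains free to extend some other class, which is one motivation for the migration the abstract describes.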


Fastshrinkage: Perceptually-Aware Retargeting Toward Mobile Platforms, Zhenguang Liu, Zepeng Wang, Luming Zhang, Rajiv Ratn Shah, Yingjie Xia, Yi Yang, Wei Liu Oct 2017

Research Collection School Of Computing and Information Systems

Retargeting aims at adapting an original high-resolution photo/video to a low-resolution screen with an arbitrary aspect ratio. Conventional approaches are generally based on desktop PCs, since the computation might be intolerable for mobile platforms (especially when retargeting videos). Moreover, typically only low-level visual features are exploited, whereas human visual perception is not well encoded. In this paper, we propose a novel retargeting framework that quickly shrinks a photo/video by leveraging human gaze behavior. Specifically, we first derive a geometry-preserved graph ranking algorithm, which efficiently selects a few salient object patches to mimic the human gaze shifting path (GSP) when viewing each scene. …


JoanAudit: A Tool For Auditing Common Injection Vulnerabilities, Julian Thome, Lwin Khin Shar, Domenico Bianculli, Lionel Briand Sep 2017

Research Collection School Of Computing and Information Systems

JoanAudit is a static analysis tool to assist security auditors in auditing Web applications and Web services for common injection vulnerabilities during software development. It automatically identifies parts of the program code that are relevant for security and generates an HTML report to guide security auditors in auditing the source code in a scalable way. JoanAudit is configured with various security-sensitive input sources and sinks relevant to injection vulnerabilities and standard sanitization procedures that prevent these vulnerabilities. It can also automatically fix some cases of vulnerabilities in source code, namely cases where inputs are directly used in sinks without any form …


Evaluating Student Perceptions And Learning Outcomes: Differences Between SLA-aBLe And Non-SLA-aBLe Introductory Programming Courses, Christina M. Frederick, Matthew B. Pierce, Andrew Griggs, Lulu Sun Sep 2017

Publications

Knowledge of engineering, computer science, and consequently programming languages is an increasingly vital skill in today’s workforce. First-year engineering students are introduced to programming in addition to rigorous course loads. Second Language Acquisition (SLA) has been applied to programming course content delivery and has shown promise as an effective means of better educating new students. Results will be presented from an NSF-funded study conducted over the past two years. SLA was applied to an introductory engineering course that teaches basic programming skills in a Blended learning environment (SLA-aBLe). This study examined four semesters’ worth …


Code Coverage And Postrelease Defects: A Large-Scale Study On Open Source Projects, Pavneet Singh Kochhar, David Lo, Julia Lawall, Nachiappan Nagappan Sep 2017

Research Collection School Of Computing and Information Systems

Testing is a pivotal activity in ensuring the quality of software. Code coverage is a common metric used as a yardstick to measure the efficacy and adequacy of testing. However, does higher coverage actually lead to a decline in postrelease bugs? Do files that have higher test coverage actually have fewer bug reports? The direct relationship between code coverage and actual bug reports has not yet been analyzed via a comprehensive empirical study on real bugs. Past studies only involve a few software systems or artificially injected bugs (mutants). In this empirical study, we examine these questions in the context …


Automated Android Application Permission Recommendation, Lingfeng Bao, David Lo, Xin Xia, Shanping Li Sep 2017

Research Collection School Of Computing and Information Systems

The number of Android applications has increased rapidly as Android is becoming the dominant platform in the smartphone market. Security and privacy are key factors for an Android application to be successful. Android provides a permission mechanism to ensure security and privacy. This permission mechanism requires that developers declare the sensitive resources required by their applications. On installation or during runtime, users are required to agree with the permission request. However, in practice, there are numerous popular permission misuses, despite Android introducing official documents stating how to use these permissions properly. Some data mining techniques (e.g., association rule mining) have …


EvoPass: Evolvable Graphical Password Against Shoulder-Surfing Attacks, Xingjie Yu, Zhan Wang, Yingjiu Li, Liang Li, Wen Tao Zhu, Li Song Sep 2017

Research Collection School Of Computing and Information Systems

The passwords for authenticating users are susceptible to shoulder-surfing attacks in which attackers learn users' passwords through direct observations without any technical support. A straightforward solution to defend against such attacks is to change passwords periodically or even constantly, making the previously observed passwords useless. However, this may lead to a situation in which users run out of strong passwords they can remember, or they are forced to choose passwords that are weak, correlated, or difficult to memorize. To achieve both security and usability in user authentication, we propose EvoPass, the first evolvable graphical password authentication system. EvoPass transforms a …


Loopster: Static Loop Termination Analysis, Xiaofei Xie, Bihuan Chen, Liang Zou, Shang-Wei Lin, Yang Liu, Xiaohong Li Sep 2017

Research Collection School Of Computing and Information Systems

Loop termination is an important problem for proving the correctness of a system and ensuring that the system always reacts. Existing loop termination analysis techniques mainly depend on the synthesis of ranking functions, which is often expensive. In this paper, we present a novel approach, named Loopster, which performs an efficient static analysis to decide the termination of loops based on path termination analysis and path dependency reasoning. Loopster adopts a divide-and-conquer approach: (1) we extract individual paths from a target multi-path loop and analyze the termination of each path, (2) analyze the dependencies between each pair of paths, and then …


ANCR—An Adaptive Network Coding Routing Scheme For WSNs With Different-Success-Rate Links, Xiang Ji, Anwen Wang, Chunyu Li, Chun Ma, Yao Peng, Dajin Wang, Qingyi Hua, Feng Chen, Dingyi Fang Aug 2017

Department of Computer Science Faculty Scholarship and Creative Works

As the underlying infrastructure of the Internet of Things (IoT), wireless sensor networks (WSNs) have been widely used in many applications. Network coding is a technique in WSNs that combines multiple channels of data in one transmission, wherever possible, to save node energy as well as increase network throughput. So far, most works on network coding are based on two assumptions to determine coding opportunities: (1) all the links in the network have the same transmission success rate; (2) each link is bidirectional and has the same transmission success rate in both directions. However, these assumptions may not be …


Can Syntax Help? Improving An LSTM-Based Sentence Compression Model For New Domains, Liangguo Wang, Jing Jiang, Hai Leong Chieu, Chen Hui Ong, Dandan Song, Lejian Liao Aug 2017

Research Collection School Of Computing and Information Systems

In this paper, we study how to improve the domain adaptability of a deletion-based Long Short-Term Memory (LSTM) neural network model for sentence compression. We hypothesize that syntactic information helps in making such models more robust across domains. We propose two major changes to the model: using explicit syntactic features and introducing syntactic constraints through Integer Linear Programming (ILP). Our evaluation shows that the proposed model works better than the original model as well as a traditional non-neural-network-based model in a cross-domain setting.


iUpdater: Low Cost RSS Fingerprints Updating For Device-Free Localization, Liqiong Chang, Jie Xiong, Yu Wang, Xiaojiang Chen, Junhao Hu, Dingyi Fang Jul 2017

Research Collection School Of Computing and Information Systems

While most existing indoor localization techniques are device-based, many emerging applications such as intruder detection and elderly monitoring drive the need for device-free localization, in which the target can be localized without any device attached. Among the diverse techniques, received signal strength (RSS) fingerprint-based methods are popular because of the wide availability of RSS readings in most commodity hardware. However, current fingerprint-based systems suffer from the high human labor cost of updating the fingerprint database and from low accuracy due to the large degree of RSS variations. In this paper, we propose a fingerprint-based device-free localization system named iUpdater to significantly reduce …


Auditing Anti-Malware Tools By Evolving Android Malware And Dynamic Loading Technique, Yinxing Xue, Guozhu Meng, Yang Liu, Tian Huat Tan, Hongxu Chen, Jun Sun, Jie Zhang Jul 2017

Research Collection School Of Computing and Information Systems

Although a previous paper shows that existing anti-malware tools (AMTs) may have a high detection rate, the report is based on existing malware and thus does not imply that AMTs can effectively deal with future malware. It is desirable to have an alternative way of auditing AMTs. In our previous paper, we used malware samples from the Android malware collection GENOME to summarize a malware meta-model for modularizing the common attack behaviors and evasion techniques into reusable features. We then combined different features with an evolutionary algorithm to evolve malware variants. Previous results have shown that the …


Cloud-Based Query Evaluation For Energy-Efficient Mobile Sensing, Tianli Mo, Lipyeow Lim, Sougata Sen, Archan Misra, Rajesh Krishna Balan, Youngki Lee Jul 2017

Research Collection School Of Computing and Information Systems

In this paper, we reduce the energy overheads of continuous mobile sensing, specifically for the case of context-aware applications that are interested in collective context or events, i.e., events expressed as a set of complex predicates over sensor data from multiple smartphones. We propose a cloud-based query management and optimization framework, called CloQue, that can support thousands of such concurrent queries, executing over a large number of individual smartphones. Our central insight is that the contexts of different individuals and groups are often significantly correlated, and that this correlation can be learned through standard association rule mining on historical data. …
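
The association rule mining mentioned above rests on two standard measures, support and confidence. The following is a minimal sketch of those textbook definitions only; it is not CloQue's implementation, and the class name, method names, and the idea of modelling each historical observation as a set of string labels are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper showing the textbook support/confidence measures.
final class RuleMining {
    private RuleMining() {}

    // Fraction of transactions that contain every item in `items`.
    static double support(List<Set<String>> transactions, Set<String> items) {
        if (transactions.isEmpty()) {
            return 0.0;
        }
        long hits = transactions.stream()
                .filter(t -> t.containsAll(items))
                .count();
        return (double) hits / transactions.size();
    }

    // Confidence of the rule antecedent -> consequent:
    // support(antecedent and consequent together) / support(antecedent).
    static double confidence(List<Set<String>> transactions,
                             Set<String> antecedent,
                             Set<String> consequent) {
        Set<String> both = new HashSet<>(antecedent);
        both.addAll(consequent);
        double base = support(transactions, antecedent);
        return base == 0.0 ? 0.0 : support(transactions, both) / base;
    }
}
```

A rule with high confidence suggests that observing the antecedent context makes the consequent context likely, which is the kind of correlation the abstract says can be learned from historical data.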


Levity Polymorphism, Richard A. Eisenberg, Simon Peyton Jones Jun 2017

Computer Science Faculty Research and Scholarship

Parametric polymorphism is one of the linchpins of modern typed programming, but it comes with a real performance penalty. We describe this penalty; offer a principled way to reason about it (kinds as calling conventions); and propose levity polymorphism. This new form of polymorphism allows abstractions over calling conventions; we detail and verify restrictions that are necessary in order to compile levity-polymorphic functions. Levity polymorphism has created new opportunities in Haskell, including the ability to generalize nearly half of the type classes in GHC's standard library.


The Introduction Of Informal Cooperative Learning Into Our Programming Laboratories, Guity Ravai, Ludmila Nunes, Ronald Erdei Jun 2017

IMPACT Presentations

Presented at the Women in Engineering ProActive Network (WEPAN) Change Leader Forum: Creating a Mindset for Action in Westminster, CO, USA


RACK: Code Search In The IDE Using Crowdsourced Knowledge, Mohammad Masudur Rahman, Chanchal K. Roy, David Lo Jun 2017

Research Collection School Of Computing and Information Systems

Traditional code search engines often do not perform well with natural language queries since they mostly apply keyword matching. These engines thus require carefully designed queries containing information about programming APIs for code search. Unfortunately, existing studies suggest that preparing an effective query for code search is both challenging and time-consuming for developers. In this paper, we propose a novel code search tool, RACK, that returns relevant source code for a given code search query written in natural language text. The tool first translates the query into a list of relevant API classes by mining keyword-API associations from the crowdsourced …


Bug Characteristics In Blockchain Systems: A Large-Scale Empirical Study, Zhiyuan Wan, David Lo, Xin Xia, Liang Cai Jun 2017

Research Collection School Of Computing and Information Systems

Bugs severely hurt blockchain system dependability. A thorough understanding of blockchain bug characteristics is required to design effective tools for preventing, detecting and mitigating bugs. We perform an empirical study on bug characteristics in eight representative open source blockchain systems. First, we manually examine 1,108 bug reports to understand the nature of the reported bugs. Second, we leverage card sorting to label the bug reports, and obtain ten bug categories in blockchain systems. We further investigate the frequency distribution of bug categories across projects and programming languages. Finally, we study the relationship between bug categories and bug fixing time. The …


Cataloging GitHub Repositories, Abhishek Sharma, Ferdian Thung, Pavneet Singh Kochhar, Agus Sulistya, David Lo Jun 2017

Research Collection School Of Computing and Information Systems

GitHub is one of the largest and most popular repository hosting services today, having about 14 million users and more than 54 million repositories as of March 2017. This makes it an excellent platform to find projects that developers are interested in exploring. GitHub showcases its most popular projects by cataloging them manually into categories such as DevOps tools, web application frameworks, and game engines. We propose that such cataloging should not be limited only to popular projects. We explore the possibility of developing such a cataloging system by automatically extracting functionality-descriptive text segments from readme files of GitHub repositories. …


Automated Refactoring Of Legacy Java Software To Default Methods, Raffi T. Khatchadourian, Hidehiko Masuhara May 2017

Publications and Research

Java 8 default methods, which allow interfaces to contain (instance) method implementations, are useful for the skeletal implementation software design pattern. However, it is not easy to transform existing software to exploit default methods as it requires analyzing complex type hierarchies, resolving multiple implementation inheritance issues, reconciling differences between class and interface methods, and analyzing tie-breakers (dispatch precedence) with overriding class methods to preserve type-correctness and confirm semantics preservation. In this paper, we present an efficient, fully-automated, type constraint-based refactoring approach that assists developers in taking advantage of enhanced interfaces for their legacy Java software. The approach features an extensive …


Automated Refactoring Of Legacy Java Software To Default Methods, Raffi T. Khatchadourian, Hidehiko Masuhara May 2017

Publications and Research

Java 8 introduces enhanced interfaces, allowing for default (instance) methods that implementers will inherit if none are provided [3]. Default methods can be used [2] as a replacement for the skeletal implementation pattern [1], which creates abstract skeletal implementation classes that implementers extend. Migrating legacy code using the skeletal implementation pattern to instead use default methods can require significant manual effort due to subtle language and semantic restrictions. It requires preserving type-correctness by analyzing complex type hierarchies, resolving issues arising from multiple inheritance, reconciling differences between class and interface methods, and ensuring tie-breakers with overriding class methods do not alter …


Comparing TensorFlow Deep Learning Performance Using CPUs, GPUs, Local PCs And Cloud, John Lawrence, Jonas Malmsten, Andrey Rybka, Daniel A. Sabol, Ken Triplin May 2017

Publications and Research

Deep learning is a very computationally intensive task. Traditionally, GPUs have been used to speed up computations by several orders of magnitude. TensorFlow is a deep learning framework designed to improve performance further by running on multiple nodes in a distributed system. While TensorFlow has only been available for a little over a year, it has quickly become the most popular open source machine learning project on GitHub. The open source version of TensorFlow was originally only capable of running on a single node, while only Google’s proprietary version was capable of leveraging distributed systems. This has now changed. In this …


Experiences With Scala Across The College-Level Curriculum, Konstantin Läufer, George K. Thiruvathukal, Mark C. Lewis Apr 2017

Emerging Technologies Laboratory

Various hybrid-functional languages, designed to balance compile-time error detection, conciseness, and performance, have emerged. Scala, e.g., is interoperable with Java and has become an early leader in adoption, especially in the start-up and open-source spaces.

As educators, we have recognized Scala’s value as a teaching language across the CS curriculum. In CS1, the read-eval-print loop and simple, uniform syntax aid programming in the small. In CS2, higher-order methods allow concise, efficient manipulation of collections. In a programming languages course, advanced constructs facilitate the separation of concerns, program representation and interpretation, and concurrent programming. In advanced applied courses, language mechanisms and …


CST1101–Problem Solving With Computer Programming, Syllabus, Elena Filatova Apr 2017

Open Educational Resources

No abstract provided.


Clustering Classes In Packages For Program Comprehension, Xiaobing Sun, Xiangyue Liu, Bin Li, Bixin Li, David Lo, Lingzhi Liao Apr 2017

Research Collection School Of Computing and Information Systems

During software maintenance and evolution, one of the important tasks faced by developers is to understand a system quickly and accurately. With the increasing size and complexity of an evolving system, program comprehension becomes an increasingly difficult activity. Given a target system for comprehension, developers may first focus on package comprehension. The packages in the system are of different sizes. Developers can easily comprehend small packages, but large packages are difficult to understand. In this article, we focus on understanding these large-sized packages and propose a novel program comprehension approach for large-sized …