Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Open Access. Powered by Scholars. Published by Universities.®

Programming Languages and Compilers

Research Collection School Of Computing and Information Systems

Code search

Publication Year

Articles 1 - 5 of 5

Full-Text Articles in Physical Sciences and Mathematics

Codematcher: Searching Code Based On Sequential Semantics Of Important Query Words, Chao Liu, Xin Xia, David Lo, Zhiwei Liu, Ahmed E. Hassan, Shanping Li Jan 2022

Codematcher: Searching Code Based On Sequential Semantics Of Important Query Words, Chao Liu, Xin Xia, David Lo, Zhiwei Liu, Ahmed E. Hassan, Shanping Li

Research Collection School Of Computing and Information Systems

To accelerate software development, developers frequently search and reuse existing code snippets from a large-scale codebase, e.g., GitHub. Over the years, researchers proposed many information retrieval (IR)-based models for code search, but they fail to connect the semantic gap between query and code. An early successful deep learning (DL)-based model DeepCS solved this issue by learning the relationship between pairs of code methods and corresponding natural language descriptions. Two major advantages of DeepCS are the capability of understanding irrelevant/noisy keywords and capturing sequential relationships between words in query and code. In this article, we proposed an IR-based model CodeMatcher that …


Automatic Query Reformulation For Code Search Using Crowdsourced Knowledge, Mohammad M. Rahman, Chanchal K. Roy, David Lo Jan 2019

Automatic Query Reformulation For Code Search Using Crowdsourced Knowledge, Mohammad M. Rahman, Chanchal K. Roy, David Lo

Research Collection School Of Computing and Information Systems

Traditional code search engines (e.g., Krugle) often do not perform well with natural language queries. They mostly apply keyword matching between query and source code. Hence, they need carefully designed queries containing references to relevant APIs for the code search. Unfortunately, preparing an effective search query is not only challenging but also time-consuming for the developers according to existing studies. In this article, we propose a novel query reformulation technique–RACK–that suggests a list of relevant API classes for a natural language query intended for code search. Our technique offers such suggestions by exploiting keyword-API associations from the questions and answers …


Automatic Query Reformulation For Code Search Using Crowdsourced Knowledge, Mohammad M. Rahman, Chanchal K. Roy, David Lo Jan 2019

Automatic Query Reformulation For Code Search Using Crowdsourced Knowledge, Mohammad M. Rahman, Chanchal K. Roy, David Lo

Research Collection School Of Computing and Information Systems

Traditional code search engines (e.g., Krugle) often do not perform well with natural language queries. They mostly apply keyword matching between query and source code. Hence, they need carefully designed queries containing references to relevant APIs for the code search. Unfortunately, preparing an effective search query is not only challenging but also time-consuming for the developers according to existing studies. In this article, we propose a novel query reformulation technique–RACK–that suggests a list of relevant API classes for a natural language query intended for code search. Our technique offers such suggestions by exploiting keyword-API associations from the questions and answers …


Augmenting And Structuring User Queries To Support Efficient Free-Form Code Search, Raphael Sirres, Tegawendé F. Bissyande, Dongsun Kim, David Lo, Jacques Klein, Kisub Kim, Yves Le Traon Oct 2018

Augmenting And Structuring User Queries To Support Efficient Free-Form Code Search, Raphael Sirres, Tegawendé F. Bissyande, Dongsun Kim, David Lo, Jacques Klein, Kisub Kim, Yves Le Traon

Research Collection School Of Computing and Information Systems

Source code terms such as method names and variable types are often different from conceptual words mentioned in a search query. This vocabulary mismatch problem can make code search inefficient. In this paper, we present COde voCABUlary (CoCaBu), an approach to resolving the vocabulary mismatch problem when dealing with free-form code search queries. Our approach leverages common developer questions and the associated expert answers to augment user queries with the relevant, but missing, structural code entities in order to improve the performance of matching relevant code examples within large code repositories. To instantiate this approach, we build GitSearch, a code …


Codehow: Effective Code Search Based On Api Understanding And Extended Boolean Model (E), Fei Lv, Jian-Guang Lou, Shaowei Wang, Dongmei Zhang, Jainjun Zhao Nov 2015

Codehow: Effective Code Search Based On Api Understanding And Extended Boolean Model (E), Fei Lv, Jian-Guang Lou, Shaowei Wang, Dongmei Zhang, Jainjun Zhao

Research Collection School Of Computing and Information Systems

Over the years of software development, a vast amount of source code has been accumulated. Many code search tools were proposed to help programmers reuse previously-written code by performing free-text queries over a large-scale codebase. Our experience shows that the accuracy of these code search tools are often unsatisfactory. One major reason is that existing tools lack of query understanding ability. In this paper, we propose CodeHow, a code search technique that can recognize potential APIs a user query refers to. Having understood the potentially relevant APIs, CodeHow expands the query with the APIs and performs code retrieval by applying …