Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

Articles

2019

Code generation

Full-Text Articles in Physical Sciences and Mathematics

Automatic Acquisition Of Annotated Training Corpora For Test-Code Generation, Magdalena Kacmajor, John D. Kelleher, Feb 2019

Articles

Open software repositories make large amounts of source code publicly available. Potentially, this source code could be used as training data to develop new, machine learning-based programming tools. For many applications, however, raw code scraped from online repositories does not constitute an adequate training dataset. Building on the recent and rapid improvements in machine translation (MT), one potentially interesting application is code generation from natural language descriptions. One of the bottlenecks in developing these MT-inspired systems is the acquisition of the parallel text-code corpora required for training code-generative models. This paper addresses the problem of automatically synthesizing parallel text-code corpora …