Facebook AI researchers have released a new benchmark and data set for evaluating neural code search models. According to the recently published paper, Neural Code Search Evaluation Dataset, the evaluation data set comprises 287 Stack Overflow question-and-answer pairs and draws on information from both GitHub and Stack Overflow.
The evaluation data set contains natural language queries and relevant code snippet answers from Stack Overflow. It also includes code snippet examples from the search corpus (public repositories from GitHub) that correctly answer each query. The team expects this data set “to serve as a benchmark in order to evaluate performance across various code search models.”
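To make the structure of such an entry concrete, here is a minimal sketch of how one query with its ground-truth answers could be represented. The field names, the sample question, and the snippets are illustrative assumptions, not the data set's actual schema or contents.

```python
# Illustrative only: hypothetical representation of one evaluation entry.
from dataclasses import dataclass, field
from typing import List


@dataclass
class EvaluationEntry:
    """One natural language query paired with its ground-truth answers."""
    question: str                       # natural language query from Stack Overflow
    answer_snippet: str                 # relevant code snippet answer from Stack Overflow
    corpus_snippets: List[str] = field(default_factory=list)  # matching code from the GitHub search corpus


# Hypothetical example entry (not taken from the released data set).
example = EvaluationEntry(
    question="How do I convert a Bitmap to a byte array in Android?",
    answer_snippet=(
        "ByteArrayOutputStream stream = new ByteArrayOutputStream();\n"
        "bitmap.compress(Bitmap.CompressFormat.PNG, 100, stream);\n"
        "byte[] bytes = stream.toByteArray();"
    ),
    corpus_snippets=["public byte[] bitmapToBytes(Bitmap bitmap) { ... }"],
)
```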
Notably, the search corpus contains more than 24,000 of the most popular Android repositories on GitHub and is indexed using the more than 4.7 million method bodies parsed from these repositories.
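As a rough illustration of what indexing millions of method bodies for retrieval involves, the sketch below builds a simple keyword inverted index over method bodies. Facebook's models use learned representations rather than keyword matching; this generic Python example only shows the basic shape of an index over a corpus of methods.

```python
# Generic illustration, not Facebook's indexing pipeline.
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(code: str) -> List[str]:
    """Split a method body into lowercase word tokens."""
    return [t.lower() for t in re.findall(r"[A-Za-z]+", code)]


def build_index(method_bodies: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each token to the set of method ids whose bodies contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for method_id, body in method_bodies.items():
        for token in tokenize(body):
            index[token].add(method_id)
    return index


def search(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Return method ids containing all query tokens (simple AND search)."""
    token_sets = [index.get(t, set()) for t in tokenize(query)]
    if not token_sets:
        return []
    return sorted(set.intersection(*token_sets))
```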
The release also includes a score sheet for the evaluation data set, produced using two models from the team's recent work.
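For readers unfamiliar with how such a score sheet is typically produced, the following sketch scores a retrieval model against query/answer pairs using two common retrieval metrics, success@k and mean reciprocal rank. The actual score sheet may report different measures, and the `search` callable here is a hypothetical model interface assumed for illustration.

```python
# Illustrative evaluation loop; metric choices and interfaces are assumptions.
from typing import Callable, Dict, List, Sequence, Set


def evaluate(
    queries: Sequence[str],
    relevant_ids: Sequence[Set[str]],            # ground-truth corpus ids per query
    search: Callable[[str, int], List[str]],     # hypothetical model: returns ranked corpus ids
    k: int = 10,
) -> Dict[str, float]:
    """Compute success@k and mean reciprocal rank over the evaluation queries."""
    hits = 0
    reciprocal_ranks: List[float] = []
    for query, gold in zip(queries, relevant_ids):
        ranked = search(query, k)
        # Rank (1-based) of the first relevant result, if any appears in the top k.
        first_hit = next((i for i, r in enumerate(ranked, 1) if r in gold), None)
        if first_hit is not None:
            hits += 1
            reciprocal_ranks.append(1.0 / first_hit)
        else:
            reciprocal_ranks.append(0.0)
    n = len(queries)
    return {f"success@{k}": hits / n, "mrr": sum(reciprocal_ranks) / n}
```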