Overview
APG – an ABNF Parser Generator – was originally designed to generate recursive-descent parsers directly from the ABNF grammar defining the sentences or phrases to be parsed. The approach is to recognize that ABNF defines a tree with seven types of nodes and that each node represents an operation that can guide a depth-first traversal of the tree – that is, a recursive-descent parse of the tree.
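The idea of a tree of node operations guiding a depth-first traversal can be sketched in a few lines of C. This is a minimal illustration, not APG's implementation: the seven node labels (`ALT`, `CAT`, `REP`, `RNM`, `TRG`, `TBS`, `TLS`), the `Node` layout, and the functions `parse` and `match_greeting` are all assumptions made up for this example, as is the tiny grammar it builds.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical labels for seven ABNF node types; not taken from APG's source. */
typedef enum { ALT, CAT, REP, RNM, TRG, TBS, TLS } NodeType;

typedef struct Node {
    NodeType type;
    const struct Node *const *kids; /* ALT/CAT: all children; REP/RNM: kids[0] */
    size_t nkids;
    size_t min, max;                /* REP: repetition bounds; TRG: character range */
    const char *text;               /* TBS/TLS: the literal to match */
} Node;

#define NO_MATCH ((size_t)-1)

/* Depth-first traversal: returns the number of characters matched at
   in[pos], or NO_MATCH if the node's operation fails there. */
size_t parse(const Node *n, const char *in, size_t len, size_t pos)
{
    size_t i, m, c, tlen, total = 0, count = 0;

    assert(n != NULL && pos <= len);
    switch (n->type) {
    case ALT: /* alternation: children tried in order, first success is kept */
        for (i = 0; i < n->nkids; i++)
            if ((m = parse(n->kids[i], in, len, pos)) != NO_MATCH)
                return m;
        return NO_MATCH;
    case CAT: /* concatenation: every child must match, one after another */
        for (i = 0; i < n->nkids; i++) {
            if ((m = parse(n->kids[i], in, len, pos + total)) == NO_MATCH)
                return NO_MATCH;
            total += m;
        }
        return total;
    case REP: /* repetition: match the child as many times as allowed */
        while (count < n->max) {
            m = parse(n->kids[0], in, len, pos + total);
            if (m == NO_MATCH)
                break;
            total += m;
            count = (m == 0) ? n->max : count + 1; /* empty match: stop looping */
        }
        return count >= n->min ? total : NO_MATCH;
    case RNM: /* rule name: descend into the named rule's subtree */
        return parse(n->kids[0], in, len, pos);
    case TRG: /* terminal range: one character in [min, max] */
        if (pos >= len)
            return NO_MATCH;
        c = (unsigned char)in[pos];
        return (c >= n->min && c <= n->max) ? 1 : NO_MATCH;
    case TBS: /* terminal binary string: exact match */
    case TLS: /* terminal literal string: case-insensitive match */
        tlen = strlen(n->text);
        if (tlen > len - pos)
            return NO_MATCH;
        for (i = 0; i < tlen; i++) {
            unsigned char a = (unsigned char)in[pos + i];
            unsigned char b = (unsigned char)n->text[i];
            if (n->type == TLS ? tolower(a) != tolower(b) : a != b)
                return NO_MATCH;
        }
        return tlen;
    }
    return NO_MATCH;
}

/* Example grammar (hypothetical):
     greeting = "hi " word
     word     = 1*%x61-7A        ; one or more of "a".."z" */
size_t match_greeting(const char *s)
{
    const Node lower = { TRG, NULL, 0, 'a', 'z', NULL };
    const Node *const word_kids[] = { &lower };
    const Node word = { REP, word_kids, 1, 1, SIZE_MAX, NULL };
    const Node hi = { TLS, NULL, 0, 0, 0, "hi " };
    const Node *const cat_kids[] = { &hi, &word };
    const Node greeting = { CAT, cat_kids, 2, 0, 0, NULL };

    return parse(&greeting, s, strlen(s), 0);
}
```

With this sketch, `match_greeting("Hi bob!")` matches the six characters `Hi bob`, while `match_greeting("hi 42")` fails because the repetition needs at least one lowercase letter. Each `case` is one node operation, and the recursion in `parse` is the depth-first walk of the grammar tree.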
However, APG has since evolved, in a number of significant ways, beyond parsing the strictly context-free languages described by ABNF.
The first is through disambiguation. A sentence of a context-free language may have more than one parse tree that correctly parses it, that is, more than one way of phrasing the same sentence. APG disambiguates, always selecting a single, well-defined parse tree from the multitude.
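To make the ambiguity concrete, the following sketch uses one common disambiguation policy, prioritized choice, in which the alternatives of a rule are tried in order and the first one that matches is kept. The text above does not spell out the exact rule APG applies, so treat the policy, the toy grammar, and the function names `first_match` and `parse_s` as assumptions made for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Prioritized choice (assumed policy): return the index of the first
   alternative that is a prefix of s, or -1 if none is. Later alternatives
   are not considered once an earlier one has matched. */
int first_match(const char *s, const char *const alts[], int n)
{
    int i;

    for (i = 0; i < n; i++)
        if (strncmp(s, alts[i], strlen(alts[i])) == 0)
            return i;
    return -1;
}

/* Hypothetical ambiguous grammar:
     S = A B
     A = "a" / "ab"
     B = "bc" / "c"
   The sentence "abc" has two parse trees: ("a")("bc") and ("ab")("c").
   Returns 1 if s is matched in full, and reports which alternative of
   A and of B was chosen (-1 where no alternative was chosen). */
int parse_s(const char *s, int *a_choice, int *b_choice)
{
    static const char *const a_alts[] = { "a", "ab" };
    static const char *const b_alts[] = { "bc", "c" };
    const char *rest;

    *a_choice = -1;
    *b_choice = -1;

    *a_choice = first_match(s, a_alts, 2);
    if (*a_choice < 0)
        return 0;
    rest = s + strlen(a_alts[*a_choice]);

    *b_choice = first_match(rest, b_alts, 2);
    if (*b_choice < 0)
        return 0;
    return rest[strlen(b_alts[*b_choice])] == '\0';
}
```

For the input `abc`, this sketch always reports the tree `("a")("bc")`, with both choices at index 0. The second tree, `("ab")("c")`, is a valid derivation under the grammar but is never produced, which is the sense in which a single, well-defined parse tree is selected.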
From here it was quickly realized that this method of defining a tree of node operations did not in any way require that the nodes correspond to the ABNF-defined tree. They could be expanded and enhanced in any way that might be convenient for the problem at hand. The first expansion was to add the “look ahead” nodes: operations that look ahead for a specific phrase and then continue or not, depending on whether the phrase is found. Next, nodes with user-defined operations were introduced: node operations that are hand-written by the user for a specific phrase-matching problem. Finally, to develop an ABNF-based pattern-matching engine similar to regular expressions (regex), a number of new node operations were added: look behind, back referencing, and begin-of-string and end-of-string anchors.
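A look-ahead operation and a hand-written operation can be sketched as ordinary C functions. The sketch below is illustrative only: the `Matcher` type, the function names, and the `&`/`!` notation in the comments are assumptions for this example rather than APG's actual API or syntax.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define NO_MATCH ((size_t)-1)

/* A phrase matcher returns the number of characters matched at s,
   or NO_MATCH if the phrase is not found there. */
typedef size_t (*Matcher)(const char *s);

/* Positive look-ahead: an empty match if the phrase is next, else failure.
   No input is consumed either way. */
size_t look_ahead(Matcher phrase, const char *s)
{
    return phrase(s) != NO_MATCH ? 0 : NO_MATCH;
}

/* Negative look-ahead: an empty match if the phrase is NOT next. */
size_t look_ahead_not(Matcher phrase, const char *s)
{
    return phrase(s) == NO_MATCH ? 0 : NO_MATCH;
}

/* A hand-written ("user-defined") operation: one or more letters. */
size_t word(const char *s)
{
    size_t n = 0;

    while (isalpha((unsigned char)s[n]))
        n++;
    return n > 0 ? n : NO_MATCH;
}

/* A literal phrase: an opening parenthesis. */
size_t open_paren(const char *s)
{
    return s[0] == '(' ? 1 : NO_MATCH;
}

/* function_name = word &"("   ; a word only if "(" follows (not consumed) */
size_t function_name(const char *s)
{
    size_t n = word(s);

    if (n == NO_MATCH || look_ahead(open_paren, s + n) == NO_MATCH)
        return NO_MATCH;
    return n;
}

/* variable_name = word !"("   ; a word only if "(" does not follow */
size_t variable_name(const char *s)
{
    size_t n = word(s);

    if (n == NO_MATCH || look_ahead_not(open_paren, s + n) == NO_MATCH)
        return NO_MATCH;
    return n;
}
```

Here `function_name("sum(1, 2)")` matches the three characters `sum` and leaves the parenthesis unconsumed, while `variable_name("sum(1, 2)")` fails. The look-ahead functions decide whether parsing continues but contribute nothing to the matched phrase.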
These additional node operations enhance the original ABNF set but do not change it. Rather, they form a superset of ABNF, referred to here as SABNF.
Today, APG is a versatile, well-tested generator of parsers. Because it is based on ABNF, it is especially well suited to parsing the languages of many Internet technical specifications. In fact, it is the parser of choice for several large telecom companies. Previous versions of APG have been developed to generate parsers in C/C++, Java, and JavaScript.
Version 7.0 is a complete rewrite in C that adds a large number of new features.