Java APG is APG, an ABNF Parser Generator, written in the Java language.

The "apg" package contains the parser generator and the runtime library of functions required by all parsers that it generates.

The "examples" package contains a driver function to run any and all of the included tests. The tests themselves are in the examples sub-packages.
These examples demonstrate a variety of ways to use Java APG and its new UDT feature.

A summary of Java APG's new features is:

  • Both APG and the parsers it generates are written entirely in Java.
  • A new operator, the User-Defined Terminal (UDT), is introduced which puts semantic actions on a nearly equal footing with the other ABNF terminal phrases such as the literal string. UDTs allow the user to write phrase recognition functions, including non-Context-Free phrases, and convert them to parser operators.
  • The AST is optionally available in XML format, freeing the user to develop a translator with the XML parser of their choice.
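
To make the idea of a hand-written phrase recognition function concrete, here is a minimal sketch of a recognizer for the non-context-free phrase a^n b^n c^n, n > 0. The class name AnBnCn and the convention of returning the matched length (or -1 on failure) are assumptions made for illustration only; they are not the Java APG callback API, which is documented in the "apg" package and the examples.

```java
/** Hypothetical sketch: a hand-written recognizer for a^n b^n c^n, n > 0. */
public class AnBnCn {

    /**
     * Returns the length of the phrase matched at offset,
     * or -1 if no phrase of the form a^n b^n c^n (n > 0) begins there.
     */
    public static int match(String input, int offset) {
        int i = offset;

        // Count the leading run of 'a' characters.
        int a = 0;
        while (i < input.length() && input.charAt(i) == 'a') { i++; a++; }
        if (a == 0) return -1; // n must be greater than zero

        // Require exactly as many 'b' characters as 'a' characters.
        int b = 0;
        while (i < input.length() && input.charAt(i) == 'b' && b < a) { i++; b++; }
        if (b != a) return -1;

        // Require exactly as many 'c' characters as 'a' characters.
        int c = 0;
        while (i < input.length() && input.charAt(i) == 'c' && c < a) { i++; c++; }
        if (c != a) return -1;

        return i - offset;
    }

    public static void main(String[] args) {
        System.out.println(match("aabbcc", 0)); // prints 6
        System.out.println(match("aabbc", 0));  // prints -1
    }
}
```

A counting loop like this is the kind of logic a context-free grammar cannot express, which is why it is a natural candidate for a UDT.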

APG was originally written to fulfill a need for a parser generator that would generate parsers directly from ABNF grammars as defined by the IETF in RFC 5234. Since then the grammar syntax for APG has evolved from that standard 1) to generate unambiguous parsers and 2) to add capabilities beyond the class of context-free languages. Because of 2) the APG grammars are called SABNF (Superset ABNF). ABNF and SABNF will often be used interchangeably in this document. The differences between RFC 5234 grammars and SABNF are summarized here:
  • The <prose-val> element is not supported.
  • APG accepts case-sensitive literal strings enclosed in single quotes.
  • APG accepts the syntactic predicate operators AND(&) and NOT(!).
  • Java APG accepts User-Defined Terminals (UDTs). UDTs appear in rule expressions just like rule names, except that the names must begin with u_ or e_ (* see below). The underscore ensures that there will never be a name conflict with a rule name. No rule name definition is given for the UDTs.
  • Prioritized-choice is used to disambiguate the grammars. That is, alternates are tried as they appear in the grammar from left to right. The first alternate to successfully match a phrase is accepted and all other alternates are ignored.
  • Repetitions always consume the longest string possible with no alternative being considered.
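
The consequences of prioritized choice and longest-match repetition are easiest to see on small inputs. The following is a minimal sketch of these semantics using hand-rolled matcher combinators. The names SabnfSemantics, Matcher, literal, choice, repeat, sequence, and, and not are all hypothetical and chosen for this illustration; this is not how Java APG is implemented, only a model of the behavior described in the list above.

```java
/** Hypothetical model of the SABNF matching rules; not the Java APG implementation. */
public class SabnfSemantics {

    /** A matcher returns the matched length at offset, or -1 on failure. */
    public interface Matcher {
        int match(String s, int offset);
    }

    /** Case-sensitive literal, like a single-quoted SABNF string. */
    public static Matcher literal(String text) {
        return (s, off) -> s.startsWith(text, off) ? text.length() : -1;
    }

    /** Prioritized choice: alternates are tried left to right, first success wins. */
    public static Matcher choice(Matcher... alternates) {
        return (s, off) -> {
            for (Matcher m : alternates) {
                int len = m.match(s, off);
                if (len >= 0) return len; // remaining alternates are ignored
            }
            return -1;
        };
    }

    /** Zero-or-more repetition: consumes as much as possible and never gives any back. */
    public static Matcher repeat(Matcher m) {
        return (s, off) -> {
            int total = 0;
            while (true) {
                int len = m.match(s, off + total);
                if (len <= 0) break; // stop on failure or on an empty match
                total += len;
            }
            return total;
        };
    }

    /** Concatenation: every part must match in order. */
    public static Matcher sequence(Matcher... parts) {
        return (s, off) -> {
            int total = 0;
            for (Matcher m : parts) {
                int len = m.match(s, off + total);
                if (len < 0) return -1;
                total += len;
            }
            return total;
        };
    }

    /** Syntactic predicate AND(&): succeeds without consuming input if m matches. */
    public static Matcher and(Matcher m) {
        return (s, off) -> m.match(s, off) >= 0 ? 0 : -1;
    }

    /** Syntactic predicate NOT(!): succeeds without consuming input if m fails. */
    public static Matcher not(Matcher m) {
        return (s, off) -> m.match(s, off) >= 0 ? -1 : 0;
    }

    public static void main(String[] args) {
        // 'a' / 'ab' on "ab": the first alternate wins, so only "a" is matched.
        System.out.println(choice(literal("a"), literal("ab")).match("ab", 0)); // prints 1

        // *'a' 'a' on "aaa": the repetition consumes all three characters,
        // so the trailing 'a' has nothing left to match.
        System.out.println(sequence(repeat(literal("a")), literal("a")).match("aaa", 0)); // prints -1
    }
}
```

In practice this means the order of alternates matters: placing the longer alternate first ('ab' / 'a') lets the choice above match both characters.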

(*) The reason for the two designations u_ and e_ is a subtle but serious difference as to whether the UDT will accept empty strings or not. e_ indicates that the UDT will accept empty strings. u_ indicates that the UDT will not accept empty strings. The Parser enforces this distinction. If a UDT named u_my-udt, for example, returns an empty string the Parser will throw an exception. The reason for this is that left-recursive grammars will put the Parser into an infinite loop that will cause a stack overflow. The check for left recursion in the GeneratorAttributes class relies on knowing whether the rule name operators (and UDT operators as well) accept empty strings. The general properties of the UDTs are unknown to the Generator because they are user written. This naming convention, and the Parser's enforcement of it, is the only way to prevent a possible stack overflow due to a hidden left recursion.
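
The enforcement described above can be sketched as a small wrapper around the user's recognizer. Everything in this block is an assumption made for illustration: the class UdtNameCheck, the Recognizer interface, and the choice of IllegalStateException are hypothetical and do not reflect the actual Parser code or the exception type it throws.

```java
/** Hypothetical sketch of the u_/e_ naming check; not the actual Java APG Parser code. */
public class UdtNameCheck {

    /** Assumed callback shape: matched length at offset, or -1 on failure. */
    public interface Recognizer {
        int match(String input, int offset);
    }

    /** Returns true if the UDT name declares that empty matches are allowed. */
    public static boolean mayBeEmpty(String udtName) {
        if (udtName.startsWith("e_")) return true;
        if (udtName.startsWith("u_")) return false;
        throw new IllegalArgumentException("not a UDT name: " + udtName);
    }

    /** Calls the recognizer and rejects an empty match from a u_ UDT. */
    public static int call(String udtName, Recognizer r, String input, int offset) {
        boolean emptyAllowed = mayBeEmpty(udtName);
        int len = r.match(input, offset);
        if (len == 0 && !emptyAllowed) {
            // A u_ UDT promised never to match the empty string; breaking that
            // promise could hide a left recursion from the generator's checks.
            throw new IllegalStateException(udtName + " matched an empty string");
        }
        return len;
    }

    public static void main(String[] args) {
        Recognizer alwaysEmpty = (input, offset) -> 0;
        System.out.println(call("e_optional", alwaysEmpty, "abc", 0)); // prints 0
        try {
            call("u_my-udt", alwaysEmpty, "abc", 0);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because the check happens at call time, a UDT whose name understates its behavior fails fast instead of silently exposing the parser to unbounded recursion.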

For more information about APG, UDTs, versions, and downloads, visit the APG web site. For information on how to use Java APG, consult the examples in this document and the source code.

Packages

Package               Description
apg                   This package contains APG, the parser generator, and the runtime library required by the generated parsers.
examples              Contains the main driver function for all of the following Java APG test examples.
examples.anbn         A comparison of timing and node hit statistics between the CFG and UDT parsers for the a^n b^n, n > 0, grammar.
examples.anbncn       A comparison of timing and node hit statistics between the CFG and UDT parsers for the a^n b^n c^n, n > 0, grammar.
examples.demo         Some demonstrations of using the main features of Java APG and some timing tests using UDTs.
examples.expressions  A comparison of timing and node hit statistics between the CFG and UDT parsers for the expressions grammar.
examples.inifile      A comparison of timing and node hit statistics between the CFG and UDT parsers for the "ini" file grammar.
examples.mailbox      A comparison of timing and node hit statistics between the CFG and UDT parsers for an email address grammar.
examples.testudtlib   A comparison of timing and node hit statistics between the CFG and UDT parsers for the suite of UdtLib UDTs.