Version 7.0
Copyright © 2021 Lowell D. Thomas
APG
… an ABNF Parser Generator
Session Initiation Protocol – a Telecom Example

The Session Initiation Protocol (SIP) is described in RFC 3261. From the abstract:

"... Session Initiation Protocol (SIP), [is] an
application-layer control (signaling) protocol for creating,
modifying, and terminating sessions with one or more participants.
These sessions include Internet telephone calls, multimedia
distribution, and multimedia conferences."

SIP has been chosen for this example because:

  • it is a large grammar (300+ rules)
  • it is a grammar of significant commercial interest
  • it has a large set of well-designed and well-explained test cases

Full disclosure - I am not a SIP or telecom expert. All of the examples and discussion here deal strictly with the syntax and parsing of the ABNF grammar. The ABNF grammar used here has been slightly modified from the exact grammar extracted from RFC 3261 to work correctly with the disambiguation rules of APG;

The test messages* used here, the so-called "torture tests", have been extracted from RFC 4475 using the base64, gzip-compressed TAR archive of files included there. Those files have been separated into three directories, ./tests/valid (section 3.1.1), ./tests/invalid (section 3.1.2) and ./tests/semantics (sections 3.2, 3.3 and 3.4). Additionally, for each of these data files a text file of the same base name has been created which contains a copy of the discussion of that message from the RFC as well as my own comments about the parsing of the message. For example,

  • ./valid/dblreq.dat
  • ./valid/dblreq.txt.

Again, my comments in the *.txt files are directed strictly to the syntax. Nothing in them is intended to contradict or critique the authors.

Case 2 of this example reads all of the files from all three of the directories and collects them into a single JSON file, ./sip-tests.json. This JSON file is then used as the data source for all of the remaining test cases.

* Note that these test message are copyrighted by The Internet Society (2006).

Timing Tests
Comparison of the timing* from the various configurations is shown in the table below.

AppPPPTUDTmsec/msgfactor
APG 7.0nono0.02811.00
APG 7.0noyes0.01162.41
APG 7.0yesno0.01511.86
APG 7.0yesyes0.01172.40
APG 6.3nono0.0713.394

Notes:

  • UDT "yes" indicates a grammar with 14 UDTs replacing the 14 ABNF grammar rules with the highest number of node hits.
  • PPPT "no" indicates that PPPTs were not used, "yes" indicates that PPPTs were used.
  • factor is the base value/test value, i.e. a factor of 2 means that the test ran 2 times faster than the base.
  • msec is for the time to parse all 51 tests messages a 1000 times each.
  • Both the PPPT and UDT tests by themselves round to a factor of 2.0.
  • While the UDT test is as effective as the PPPT test, selecting, writing and testing the UDTs is very laborius.
  • Combining PPPTs and UDTs does not lead to any improvement.
  • Best to just use PPPTs and save the UDT writing for complex phrases that are difficult or impossible to describe with SABNF.
  • APG 7.0 with no PPPTs or UDTs is a factor of 2.5 faster than APG 6.3. While the APG 6.3 test did have a UDT grammar that gave a significantly greater improvement, it was a very aggressive grammar and that effort has not been undertaken here.

* OS: Ubuntu 20.04, processor: Intel i7-10710U, 6 cores, RAM: 64GB

Statistics Tests
Comparison of the total number of parse tree node hits from the various configurations is shown in the table below. The hit total count is the cumulative total from parsing all 51 of the test messages.

AppPPPTUDThits/msgfactor
APG 7.0nono29531.00
APG 7.0noyes10242.88
APG 7.0yesno5725.16
APG 7.0yesyes5575.30

Notes:

  • While PPPTs reduce the number of node hits by a factor of 5 they only speed the parser by a factor of roughly 2. This is indicative of the table look up times. Profiling also confirms this.
  • In terms of node hits, PPPTs and UDTs actually compete against one another. While UDTs reduce the number of node hits significantly, there is little further improvement due to the fact that all UDT table values are "ACTIVE", meaning that they require a full parse.

Application Requirements

  • application code must include header files:
    • <limits.h>
    • <time.h>
    • <dirent.h>
    • ../../utilities/utilities.h
    • ../../json/json.h
    • ./sip-0.h
    • ./sip-1.h
    • ./udtlib.h
  • application compilation must include source code from the directories:
    • ../../library
    • ../../utilities
    • ../../json
  • application compilation must define macros:
    • APG_TRACE
    • APG_STATS
    • APG_NO_PPPT (optional, for comparison with and without PPPTs)

The compiled example will execute the following cases. Run the application with no arguments for application usage.

  • case 1: Display application information. (type names, type sizes and defined macros)
  • case 2: Build the JSON composite file of all SIP torture test messages.
  • case 3: Parse and trace all valid SIP messages, with and without UDTs.
  • case 4: Parse and trace all invalid SIP messages, with and without UDTs.
  • case 5: Parse and trace all semantically invalid SIP messages, with and without UDTs.
  • case 6: Parse all SIP messages and measure the times, with and without UDTs.
  • case 7: Parse all SIP messages and display the node-hit statistics, with and without UDTs.
APG Version 7.0 is licensed under the 2-Clause BSD License,
an Open Source Initiative Approved License.