The driver for the SIP tests.
This is an in-depth study of using user-defined terminals (UDTs) to significantly speed up parsing without altering the language being parsed.
- Why SIP? The SIP (RFC 3261) grammar has been selected for this study because it is a substantial (300+ rule names) and commercially significant grammar.
- What messages have been used for the study? Conveniently, there is a published set of SIP messages designed specifically to stress the parser. These have been published as RFC 4475. The published messages use a mark-up to indicate such things as long lines, long repeating strings and non-printing characters. The marked-up text for the 13 published valid SIP messages is in the file TortureTests.txt. The function uiTortureTestTranslator() reads this file and translates all of the mark-up into binary, 8-bit character codes ready for parsing.
- How was the study done? The improvements were done in stages. At each stage the rule name statistics were used to see which rules were getting parsed most often. Those rules were then converted to UDTs and optimized by handwriting the parsing of those particular phrases. The optimizations were done in six stages, including the baseline stage with no UDTs.
- What to look for in the results? The test first runs each of the six stage parsers to collect statistics, and prints the statistics for each stage along with a comparison against the baseline. It then runs timing tests for each of the six stage parsers. Again, the timing results of each test are given along with a comparison of each stage against the baseline.
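The selection step described in the list above (find the rules that are parsed most often, then convert those to UDTs) can be sketched as a simple sort over per-rule hit counts. This is only an illustration: the rule_stat type and the rank_rules() function are hypothetical names invented here, not part of the APG library or of main.c, and the real statistics output may be organized differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record (not an APG type): one rule name and the number
 * of times the parser visited that rule during a statistics run. */
typedef struct {
    const char *name;
    unsigned long hits;
} rule_stat;

/* qsort comparator: orders records by descending hit count. */
int cmp_hits_desc(const void *a, const void *b) {
    const rule_stat *ra = a;
    const rule_stat *rb = b;
    if (ra->hits < rb->hits) return 1;
    if (ra->hits > rb->hits) return -1;
    return 0;
}

/* Sort in place so the most frequently parsed rules come first.
 * The leading entries are the candidates for conversion to UDTs. */
void rank_rules(rule_stat *stats, size_t count) {
    assert(stats != NULL || count == 0);
    if (count > 1) {
        qsort(stats, count, sizeof stats[0], cmp_hits_desc);
    }
}
```

After calling rank_rules(), the first few entries of the array name the rules worth hand-optimizing in the next stage.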
The timing results will vary, of course, between computers and even between runs on the same computer. In my tests, I saw a 10-fold increase in parsing speed in the debug version of the code. In the optimized, release version the gain was somewhat less.
Be patient. The timing tests can take up to several minutes to complete.
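For readers who want to reproduce a baseline comparison by hand, the following is a minimal sketch of how a per-stage speed-up figure could be computed. It assumes each stage parser can be wrapped in a no-argument callback. The parse_fn type and the time_parser() and speedup_vs_baseline() functions are hypothetical helpers written for this illustration and are not the timing code used in main.c.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical callback type standing in for
 * "parse all of the test messages once with one stage parser". */
typedef void (*parse_fn)(void);

/* CPU seconds spent calling fn `iterations` times, measured with clock(). */
double time_parser(parse_fn fn, int iterations) {
    assert(fn != NULL && iterations > 0);
    clock_t start = clock();
    for (int i = 0; i < iterations; i++) {
        fn();
    }
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Speed-up of a stage relative to the baseline:
 * baseline time divided by stage time, so 10.0 means "10 times faster". */
double speedup_vs_baseline(double baseline_seconds, double stage_seconds) {
    assert(baseline_seconds > 0.0 && stage_seconds > 0.0);
    return baseline_seconds / stage_seconds;
}
```

Because clock() has limited resolution, the iteration count should be large enough that each measurement runs for a noticeable amount of time. This is also why the timing tests take a while.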
To regenerate the grammars for the six stages:
apg /in:Grammar1.bnf /C:Grammar1 /dwarnings /dtypes
apg /in:Grammar2.bnf /C:Grammar2 /dwarnings /dtypes
apg /in:Grammar3.bnf /C:Grammar3 /dwarnings /dtypes
apg /in:Grammar4.bnf /C:Grammar4 /dwarnings /dtypes
apg /in:Grammar5.bnf /C:Grammar5 /dwarnings /dtypes
apg /in:SIP.bnf /C:SIPGrammar /dwarnings /dtypes
Definition in file main.c.