Version 7.0
Copyright © 2021 Lowell D. Thomas
APG
… an ABNF Parser Generator
apgex.c File Reference

Source code for the apgex phrase-matching engine.

#include "./apgex.h"
#include "../library/parserp.h"
#include "../library/tracep.h"

Data Structures

struct  phrase_r
 For internal object use only. Defines a phrase as an offset into vpVecRelPhases.
 
struct  rule_r
 For internal object use only. Relative offsets to phrase information for rules.
 
struct  udt_r
 For internal object use only. Relative offsets to phrase information for UDTs.
 
struct  result_r
 For internal object use only. The phrase matching result in relative phrases.
 
struct  apgex
 For internal object use only. The phrase matching object context.
 

Macros

Internal Use Only Macros
#define BUF_SIZE   (PATH_MAX + 256)
 
#define DOLLAR   36
 
#define AMP   38
 
#define ACCENT   96
 
#define APOS   39
 
#define LANGLE   60
 
#define RANGLE   62
 
#define UNDER   95
 

Functions

void * vpApgexCtor (exception *spEx)
 The phrase-matching engine object constructor.
 
void vApgexBkrCheck (exception *spEx)
 Back referencing check.
 
void vApgexDtor (void *vpCtx)
 The phrase-matching engine object destructor.
 
void vApgexPattern (void *vpCtx, const char *cpPattern, const char *cpFlags)
 Prepare a phrase-matching parser for the given pattern.
 
void vApgexPatternFile (void *vpCtx, const char *cpFileName, const char *cpFlags)
 Reads the SABNF grammar defining the pattern from a file.
 
void vApgexPatternParser (void *vpCtx, void *vpParser, const char *cpFlags)
 Define the SABNF pattern with a user-created parser.
 
apgex_result sApgexExec (void *vpCtx, apg_phrase *spSource)
 Attempt a pattern match on the source array of APG alphabet characters.
 
apg_phrase sApgexReplace (void *vpCtx, apg_phrase *spSource, apg_phrase *spReplacement)
 Replace the matched phrase with a specified phrase.
 
apg_phrase sApgexReplaceFunc (void *vpCtx, apg_phrase *spSource, pfn_replace pfnFunc, void *vpUser)
 Replace the matched phrase with a user-generated phrase.
 
apg_phrase * spApgexSplit (void *vpCtx, apg_phrase *spSource, aint uiLimit, aint *uipCount)
 Split a phrase into an array of sub-phrases.
 
abool bApgexTest (void *vpCtx, apg_phrase *spSource)
 Report only success or failure on a pattern match.
 
void vApgexEnableRules (void *vpCtx, const char *cpNames, abool bEnable)
 Enable or disable specified rule and/or UDT names for phrase capture.
 
void vApgexSetLastIndex (void *vpCtx, aint uiLastIndex)
 Sets the index of the character in the source where the pattern-match search is to begin.
 
apgex_properties sApgexProperties (void *vpCtx)
 Get a copy of the object's properties.
 
void * vpApgexGetAst (void *vpCtx)
 Get a pointer to the AST object's context.
 
void * vpApgexGetTrace (void *vpCtx)
 Get a pointer to the trace object's context.
 
void * vpApgexGetParser (void *vpCtx)
 Get a pointer to the parser object's context.
 
void vApgexDisplayProperties (void *vpCtx, apgex_properties *spProperties, const char *cpFileName)
 Display the object's properties.
 
void vApgexDisplayPhrase (void *vpCtx, apgex_phrase *spPhrase, const char *cpFileName)
 Display a single phrase.
 
void vApgexDisplayResult (void *vpCtx, apgex_result *spResult, const char *cpFileName)
 Display the complete results from a pattern match.
 
void vApgexDisplayPatternErrors (void *vpCtx, const char *cpFileName)
 
void vApgexDefineUDT (void *vpCtx, const char *cpName, parser_callback pfnUdt)
 Define the callback function for a User-Defined Terminal (UDT).
 

Tracing Macros

These macros handle the tracing calls. They are implemented as macros so that, when tracing is not used, all tracing code is removed: the macros expand to nothing. To enable tracing, the macro APG_TRACE must be defined when the application is compiled.

#define TRACE_APGEX_HEADER(t)   vTraceApgexHeader((t))
 
#define TRACE_APGEX_FOOTER(t)   vTraceApgexFooter((t))
 
#define TRACE_APGEX_SEPARATOR(x)   vTraceApgexSeparator((x)->vpTrace, (x)->uiLastIndex)
 
#define TRACE_APGEX_CHECK(x)
 
#define TRACE_APGEX_OUTPUT(x)   vTraceApgexOutput((x))
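
The compile-time switch described above can be illustrated with a small, self-contained sketch. It does not use the APG sources: the macro DEMO_TRACE and the functions demo_trace_header() and demo_match() are hypothetical stand-ins for APG_TRACE and the real trace routines.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a trace routine; it counts how often it runs. */
int demo_trace_calls = 0;

void demo_trace_header(const char *label) {
    assert(label != NULL);
    demo_trace_calls++;
    printf("trace: %s\n", label);
}

/* Same shape as the TRACE_APGEX_* macros: a function call when tracing is
 * compiled in, and an empty expansion otherwise. */
#ifdef DEMO_TRACE
#define DEMO_TRACE_HEADER(t) demo_trace_header((t))
#else
#define DEMO_TRACE_HEADER(t)
#endif

/* Returns the number of trace calls made so far. */
int demo_match(void) {
    DEMO_TRACE_HEADER("begin match");
    /* ... the matching work would go here ... */
    return demo_trace_calls;
}
```

Compiled without -DDEMO_TRACE, the macro line in demo_match() reduces to an empty statement, so no trace call is made and no trace code is executed. Compiled with -DDEMO_TRACE, each call to demo_match() invokes demo_trace_header() once.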
 

Detailed Description

Source code for the apgex phrase-matching engine.

Must be compiled with the macro APG_AST defined. e.g. gcc -DAPG_AST ...

If the trace flag, "t" or "th", is set (see vApgexPattern()), then the macro APG_TRACE must also be defined.

Definition in file apgex.c.

Macro Definition Documentation

◆ ACCENT

#define ACCENT   96

Definition at line 47 of file apgex.c.

◆ AMP

#define AMP   38

Definition at line 46 of file apgex.c.

◆ APOS

#define APOS   39

Definition at line 48 of file apgex.c.

◆ BUF_SIZE

#define BUF_SIZE   (PATH_MAX + 256)

Definition at line 44 of file apgex.c.

◆ DOLLAR

#define DOLLAR   36

Definition at line 45 of file apgex.c.

◆ LANGLE

#define LANGLE   60

Definition at line 49 of file apgex.c.

◆ RANGLE

#define RANGLE   62

Definition at line 50 of file apgex.c.

◆ TRACE_APGEX_CHECK

#define TRACE_APGEX_CHECK (   x)

Definition at line 209 of file apgex.c.

◆ TRACE_APGEX_FOOTER

#define TRACE_APGEX_FOOTER (   t)    vTraceApgexFooter((t))

Definition at line 207 of file apgex.c.

◆ TRACE_APGEX_HEADER

#define TRACE_APGEX_HEADER (   t)    vTraceApgexHeader((t))

Definition at line 206 of file apgex.c.

◆ TRACE_APGEX_OUTPUT

#define TRACE_APGEX_OUTPUT (   x)    vTraceApgexOutput((x))

Definition at line 210 of file apgex.c.

◆ TRACE_APGEX_SEPARATOR

#define TRACE_APGEX_SEPARATOR (   x)    vTraceApgexSeparator((x)->vpTrace, (x)->uiLastIndex)

Definition at line 208 of file apgex.c.

◆ UNDER

#define UNDER   95

Definition at line 51 of file apgex.c.

Function Documentation

◆ bApgexTest()

abool bApgexTest ( void *  vpCtx,
apg_phrase *  spSource 
)

Report only success or failure on a pattern match.

Similar to sApgexExec() in default mode, except that only success or failure is reported. The matched phrase is not returned.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
spSource   Pointer to the source or input string as an APG phrase.
Returns
True if a phrase match was found in the source, false otherwise.

Definition at line 752 of file apgex.c.

◆ sApgexExec()

apgex_result sApgexExec ( void *  vpCtx,
apg_phrase *  spSource 
)

Attempt a pattern match on the source array of APG alphabet characters.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
spSource   Pointer to the source or input string as an APG phrase.
Returns
An apgex_result structure. Note that the return is a structure and not a pointer to a structure. The spResult element in the result structure will be NULL if no match was found.

Definition at line 479 of file apgex.c.

◆ sApgexProperties()

apgex_properties sApgexProperties ( void *  vpCtx)

Get a copy of the object's properties.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
Returns
An apgex_properties structure (not a pointer to a structure). See apgex_properties for details of the properties.

Definition at line 888 of file apgex.c.

◆ sApgexReplace()

apg_phrase sApgexReplace ( void *  vpCtx,
apg_phrase *  spSource,
apg_phrase *  spReplacement 
)

Replace the matched phrase with a specified phrase.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
spSource   Pointer to the source as an apg_phrase.
spReplacement   Pointer to the replacement phrase. In default mode, only the first matched phrase is replaced. In global or sticky mode, all possible matched phrases in those respective modes are replaced. The search begins at uiLastIndex, which is always set to zero on return. The replacement phrase may contain special character sequences for dynamic replacement:
  • no special characters
    Each matched phrase is simply replaced with the specified replacement string.
  • $$
    Escape sequence to insert a dollar sign, $, in the replacement string.
  • $_
    Replace $_ with the full, original source string.
  • $&
    Replace $& with the current matched phrase.
  • $`
    Replace $` with the left context of the current matched phrase.
  • $'
    Replace $' with the right context of the current matched phrase.
  • $<rulename>
    Replace $<rulename> with the matched phrase for the rule or UDT name "rulename".
Returns
The source phrase with replacements, if any, as an apg_phrase structure. Note that the return is a structure and not a pointer to a structure.
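
The special replacement sequences listed for spReplacement can be illustrated with a standalone sketch. This is not the apgex implementation: it works on ordinary C strings rather than apg_phrase, the function demo_expand() and its helper are hypothetical, and the $<rulename> form is left out because it needs captured rule phrases. The character-code macros use the same values as the internal macros documented on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DOLLAR 36 /* $ */
#define AMP    38 /* & */
#define ACCENT 96 /* ` */
#define APOS   39 /* ' */
#define UNDER  95 /* _ */

#define DEMO_OVERFLOW ((size_t)-1)

/* Appends n bytes of s to out; returns the new length or DEMO_OVERFLOW. */
static size_t demo_append(char *out, size_t cap, size_t len, const char *s, size_t n) {
    if (len == DEMO_OVERFLOW || n >= cap - len) return DEMO_OVERFLOW;
    memcpy(out + len, s, n);
    return len + n;
}

/* Expands tmpl for a match occupying src[begin, begin + mlen).
 * Writes the expanded, NUL-terminated replacement text to out.
 * Returns 0 on success, -1 if out is too small. */
int demo_expand(const char *src, size_t begin, size_t mlen,
                const char *tmpl, char *out, size_t cap) {
    size_t srclen, len = 0, i;
    assert(src != NULL && tmpl != NULL && out != NULL && cap > 0);
    srclen = strlen(src);
    assert(begin + mlen <= srclen);
    for (i = 0; tmpl[i] != '\0' && len != DEMO_OVERFLOW; i++) {
        if (tmpl[i] == DOLLAR && tmpl[i + 1] != '\0') {
            const char *piece = NULL;
            size_t n = 0;
            switch (tmpl[i + 1]) {
            case DOLLAR: piece = "$"; n = 1; break;                 /* $$ */
            case UNDER:  piece = src; n = srclen; break;            /* $_ */
            case AMP:    piece = src + begin; n = mlen; break;      /* $& */
            case ACCENT: piece = src; n = begin; break;             /* $` */
            case APOS:   piece = src + begin + mlen;                /* $' */
                         n = srclen - begin - mlen; break;
            default: break;
            }
            if (piece != NULL) {
                len = demo_append(out, cap, len, piece, n);
                i++; /* skip the character after the dollar sign */
                continue;
            }
        }
        /* Ordinary character: copied unchanged. */
        len = demo_append(out, cap, len, &tmpl[i], 1);
    }
    if (len == DEMO_OVERFLOW) return -1;
    out[len] = '\0';
    return 0;
}
```

For the source "abcXYZdef" with the match "XYZ" at offset 3, the template "[$`|$&|$'|$$|$_]" expands to "[abc|XYZ|def|$|abcXYZdef]". The sketch produces only the replacement text; splicing it back into the source in place of the match is a separate step.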

Definition at line 528 of file apgex.c.

◆ sApgexReplaceFunc()

apg_phrase sApgexReplaceFunc ( void *  vpCtx,
apg_phrase *  spSource,
pfn_replace  pfnFunc,
void *  vpUser 
)

Replace the matched phrase with a user-generated phrase.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
spSource   Pointer to the source as an apg_phrase.
pfnFunc   Pointer to the replacement function. See pfn_replace for the function prototype. The returned phrase from this function will be used as the replacement for the matched phrase. In default mode, only the first matched phrase is replaced. In global or sticky mode, all possible matched phrases in those respective modes are replaced. The search begins at uiLastIndex, which is always set to zero on return.
vpUser   Pointer to user-supplied data. This pointer will be passed to the above function, pfnFunc.
Returns
The source phrase with replacements, if any, as an apg_phrase structure. Note that the return is a structure and not a pointer to a structure.

Definition at line 581 of file apgex.c.

◆ spApgexSplit()

apg_phrase* spApgexSplit ( void *  vpCtx,
apg_phrase *  spSource,
aint  uiLimit,
aint *  uipCount 
)

Split a phrase into an array of sub-phrases.

This function is modeled after the JavaScript function str.split([separator[, limit]]) when using a regular expression. The source phrase is searched for pattern matches.

  • If a single match is found, its left context and right context become the two members of the array of sub-phrases returned.
  • If multiple matches are found, the array consists of the sub-phrases that remain after the matched characters are removed.
  • If no match is found, the array has a single member, which is the original source phrase.
  • If the entire source phrase is matched, a single, empty sub-phrase is returned (acpPhrase = NULL, uiLength = 0).
  • If the pattern is an empty string (pattern = ""\n), each character in the source phrase is returned as a separate sub-phrase.

Also,

  • The flags "gy" are ignored.
  • The flags "thp" are honored.
  • All rules and UDTs are disabled, even if previously enabled with vApgexEnableRules().
  • uiLastIndex is set to 0, even if previously set with vApgexSetLastIndex().
Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
spSource   A pointer to the source to search for pattern matches. May not be NULL. If it is empty (uiLength = 0) the return will be a single empty phrase.
uiLimit   Places a limit on the number of pattern matches to find. If uiLimit = 0, all matches will be found. That is, uiLimit = 0 is shorthand for uiLimit = APG_MAX_AINT.
uipCount   Pointer to a location that receives the number of sub-phrases in the returned array.
Returns
Returns a pointer to the first sub-phrase in the array. The number of array members is returned in uipCount.

Definition at line 648 of file apgex.c.

◆ vApgexBkrCheck()

void vApgexBkrCheck ( exception *  spEx)

Back referencing check.

The back referencing macro APG_BKR must be defined when compiling the apgex object. If not, a macro is defined in apg.h which calls this function, whose only purpose is to throw an exception reminding the user to define the macro.

Parameters
spEx   Pointer to a valid exception structure. If not valid the application will silently exit with a BAD_CONTEXT exit code.

Definition at line 279 of file apgex.c.

◆ vApgexDefineUDT()

void vApgexDefineUDT ( void *  vpCtx,
const char *  cpName,
parser_callback  pfnUdt 
)

Define the callback function for a User-Defined Terminal (UDT).

If there are any UDTs in the SABNF pattern grammar, each one of them must have a user-written callback function to do its pattern matching work. This function will define a callback function to the apgex object for a single UDT. This function must be called for each UDT appearing in the SABNF pattern grammar.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
cpName   Name of the UDT.
pfnUdt   The callback function for this UDT.

Definition at line 1222 of file apgex.c.

◆ vApgexDisplayPatternErrors()

void vApgexDisplayPatternErrors ( void *  vpCtx,
const char *  cpFileName 
)

Definition at line 1180 of file apgex.c.

◆ vApgexDisplayPhrase()

void vApgexDisplayPhrase ( void *  vpCtx,
apgex_phrase *  spPhrase,
const char *  cpFileName 
)

Display a single phrase.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
spPhrase   Pointer to the apgex_phrase to display. This would be one of the members of an apgex_result or apgex_rule.
cpFileName   The name of the file to write the display to. If NULL, stdout is used. If non-NULL, any directories in the pathname must exist.

Definition at line 1098 of file apgex.c.

◆ vApgexDisplayProperties()

void vApgexDisplayProperties ( void *  vpCtx,
apgex_properties *  spProperties,
const char *  cpFileName 
)

Display the object's properties.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
spProperties   Pointer to the properties to display. This would be a pointer to the return value of sApgexProperties().
cpFileName   The name of the file to write the display to. If NULL, stdout is used. If non-NULL, any directories in the pathname must exist.

Definition at line 1028 of file apgex.c.

◆ vApgexDisplayResult()

void vApgexDisplayResult ( void *  vpCtx,
apgex_result *  spResult,
const char *  cpFileName 
)

Display the complete results from a pattern match.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
spResult   Pointer to the pattern matched result, the return from sApgexExec().
cpFileName   The name of the file to write the display to. If NULL, stdout is used.

Definition at line 1129 of file apgex.c.

◆ vApgexDtor()

void vApgexDtor ( void *  vpCtx)

The phrase-matching engine object destructor.

Frees all memory allocated to the apgex object and for good measure zeros it out as insurance against reuse by a stale pointer.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). Silently ignores a NULL pointer, but exits the application with a BAD_CONTEXT exit code if the pointer is non-NULL and not a valid apgex object context pointer.

Definition at line 295 of file apgex.c.

◆ vApgexEnableRules()

void vApgexEnableRules ( void *  vpCtx,
const char *  cpNames,
abool  bEnable 
)

Enable or disable specified rule and/or UDT names for phrase capture.

By default, all rules and UDTs are disabled. The result, which is equal to the start rule, is always captured, independent of these selections. However, the start rule, like all other rules, is captured independently and only if it is enabled here. Note that UDTs, if any, must always be defined with vApgexDefineUDT() prior to any matching function call. Disabling them simply means that matched phrases are not saved. They still must be defined for the parse to be performed.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
cpNames   The name or names of the rules/UDTs to enable or disable. All names are case insensitive. May not be NULL or empty.
  • "--all" Enable/disable all rules and UDTs.
  • "name[,name,...]" A comma-delimited list of one or more names to enable/disable. Rule and UDT names may be mixed and in any order. All names, including "--all", are case-insensitive. Only a single comma is allowed between names, with no spaces or other delimiters.
bEnable   If true, the named rules/UDTs are enabled, meaning that their phrases will be captured. If false, the rules/UDTs are disabled, meaning that their respective phrases will not be captured.
Returns
An exception is thrown if any name is not a valid rule or UDT name.
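
The cpNames format can be illustrated with a standalone sketch of the name-list handling. It is not the apgex implementation: demo_enable() and demo_ieq() are hypothetical, the known names are supplied by the caller rather than taken from a parser, and a return value of -1 stands in for the thrown exception.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of a[0..n) with the NUL-terminated string b. */
static int demo_ieq(const char *a, size_t n, const char *b) {
    size_t i;
    if (strlen(b) != n) return 0;
    for (i = 0; i < n; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i])) return 0;
    }
    return 1;
}

/* Sets enabled[i] = enable for every name listed in names.
 * known   - the rule/UDT names, count entries
 * names   - "--all" or a comma-delimited list with no spaces
 * Returns 0 on success, -1 for a NULL/empty list or an unknown name.
 * Names preceding an unknown name have already been applied. */
int demo_enable(const char *const *known, int *enabled, size_t count,
                const char *names, int enable) {
    const char *p, *end;
    size_t i, n;
    assert(known != NULL && enabled != NULL);
    if (names == NULL || names[0] == '\0') return -1;
    if (demo_ieq(names, strlen(names), "--all")) {
        for (i = 0; i < count; i++) enabled[i] = enable;
        return 0;
    }
    p = names;
    for (;;) {
        end = strchr(p, ',');
        n = (end != NULL) ? (size_t)(end - p) : strlen(p);
        /* Linear search for the current name among the known names. */
        for (i = 0; i < count && !demo_ieq(p, n, known[i]); i++) { }
        if (i == count) return -1; /* unknown or empty name */
        enabled[i] = enable;
        if (end == NULL) return 0;
        p = end + 1;
    }
}
```

With the known names "word", "number" and "u_space", the list "Number,WORD" enables the first two entries, "--ALL" addresses all three, and "word, number" fails on the second name because of the space after the comma.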

Definition at line 786 of file apgex.c.

◆ vApgexPattern()

void vApgexPattern ( void *  vpCtx,
const char *  cpPattern,
const char *  cpFlags 
)

Prepare a phrase-matching parser for the given pattern.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
cpPattern   The complete SABNF grammar to define the strings to be matched. Must be a complete grammar including line end characters (\n, \r\n or \r) after each line including the last.
cpFlags   A string of flags that control the pattern-matching behavior. The flag characters may appear in any order and may appear multiple times. If "g" and "y" are both present, the first appearing will be honored.
  • NULL or empty string - default mode
    The phrase matching starts at uiLastIndex and searches forward until a match is found or the end of the source string is reached. uiLastIndex is then always reset to zero.
  • g - global mode
    The phrase matching starts at uiLastIndex and searches forward until a match is found or the end of the source string is reached. If a match is found, uiLastIndex is then set to the next character after the matched phrase. Multiple calls to sApgexExec() in global mode will find all matched phrases in the source string. uiLastIndex is set to zero when the end of the source string is reached. If no match is found, uiLastIndex is set to zero, regardless of its original value.
  • y - sticky mode
    Sticky mode is similar to global mode except that there is no searching for a matched phrase. It is either found at uiLastIndex or the match fails. In detail, the phrase matching starts at uiLastIndex. If a phrase match is found, uiLastIndex is set to the next character beyond the end of the matched phrase or zero if the end of the source string has been reached. If no match is found no further searching of the source string is done and uiLastIndex is reset to zero. Note that multiple calls to sApgexExec() will match multiple consecutive phrases, but only if there are no unmatched characters in between.
  • p - PPPT mode
    The parser will use Partially-Predictive Parsing Tables (PPPTs). PPPTs speed up parsing at the expense of added memory for the tables, sometimes a lot of memory and occasionally a prohibitive amount. Since speed is often not a major concern in pattern matching, PPPTs are not used by default. Before using the "p" flag, make sure you understand the memory consequences and the fact that some rules will not appear in the matched rules list. If the "p" flag is not set, compiling the application with the macro APG_NO_PPPT defined will reduce the code footprint and save some PPPT checking calls.
  • t - trace mode
    A trace of the pattern-matching parser will be generated in ASCII format. By default, the trace is displayed on the standard output, stdout. To change the output file, get a pointer to the trace object with vpApgexGetTrace() and use vTraceSetOutput(). For additional trace configuration see also vTraceConfig() and vTraceConfigDisplay(). To use the "t" flag, the application must be compiled with the macro APG_TRACE defined.
  • h - trace HTML mode
    A trace of the pattern-matching parser will be generated in HTML format. This flag must be preceded by the "t" flag or an exception is thrown.

By default, uiLastIndex begins at 0. However, it can be set to any valid value prior to the phrase matching attempt with a call to vApgexSetLastIndex().
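
The cpFlags rules can be summarized in a standalone sketch of a flag-string parser. It is not the apgex implementation: the types and the function demo_parse_flags() are hypothetical, "preceded by" is read as "appearing earlier in the flag string", and rejecting unknown flag characters is an assumption. A return value of -1 stands in for the thrown exception.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { FLAG_DEFAULT, FLAG_GLOBAL, FLAG_STICKY } flag_mode;

typedef struct {
    flag_mode mode;  /* default, global ("g") or sticky ("y") */
    int pppt;        /* "p" */
    int trace;       /* "t" */
    int trace_html;  /* "h", only valid after "t" */
} flag_set;

/* Parses a flag string into out. Returns 0 on success, -1 on error. */
int demo_parse_flags(const char *flags, flag_set *out) {
    const char *p;
    assert(out != NULL);
    out->mode = FLAG_DEFAULT;
    out->pppt = 0;
    out->trace = 0;
    out->trace_html = 0;
    if (flags == NULL) return 0; /* NULL or empty string: default mode */
    for (p = flags; *p != '\0'; p++) {
        switch (*p) {
        case 'g': /* the first of "g"/"y" wins; later ones are ignored */
            if (out->mode == FLAG_DEFAULT) out->mode = FLAG_GLOBAL;
            break;
        case 'y':
            if (out->mode == FLAG_DEFAULT) out->mode = FLAG_STICKY;
            break;
        case 'p':
            out->pppt = 1;
            break;
        case 't':
            out->trace = 1;
            break;
        case 'h': /* "h" without an earlier "t" is an error */
            if (!out->trace) return -1;
            out->trace_html = 1;
            break;
        default: /* assumption: unknown flag characters are rejected */
            return -1;
        }
    }
    return 0;
}
```

For example, "ygth" selects sticky mode with an HTML trace, "gyp" selects global mode with PPPTs, and "ht" is rejected because "h" appears before "t".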

Definition at line 378 of file apgex.c.

◆ vApgexPatternFile()

void vApgexPatternFile ( void *  vpCtx,
const char *  cpFileName,
const char *  cpFlags 
)

Reads the SABNF grammar defining the pattern from a file.

Same as vApgexPattern() except the pattern grammar is read from a file.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
cpFileName   Name of the file with the SABNF grammar.
cpFlags   A string of flags. See vApgexPattern().

Definition at line 404 of file apgex.c.

◆ vApgexPatternParser()

void vApgexPatternParser ( void *  vpCtx,
void *  vpParser,
const char *  cpFlags 
)

Define the SABNF pattern with a user-created parser.

The SABNF pattern is implicitly defined by a user-supplied parser. This parser is independent of the apgex context and its destructor is never called, even by vApgexDtor(). The supplied parser may or may not have been created with the same memory object, vpMem. It is the user's responsibility to have a properly defined catch block to handle any exceptions thrown from the supplied parser. In this case, the apgex properties will have an empty string for the unknown pattern.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
vpParser   Pointer to a valid SABNF parser context.
cpFlags   A string of flags. See vApgexPattern().

Definition at line 444 of file apgex.c.

◆ vApgexSetLastIndex()

void vApgexSetLastIndex ( void *  vpCtx,
aint  uiLastIndex 
)

Sets the index of the character in the source where the pattern-match search is to begin.

uiLastIndex governs the starting point of the search in the next call to any of the pattern-matching functions. It is initialized to 0 by default. Its value on consecutive calls to a pattern-matching function is normally governed by the mode (default, global, sticky) rules. This function can be used prior to any pattern-matching call to override the default behavior.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
uiLastIndex   The source character index where the pattern match is to begin. Must be less than the source length or an exception will be thrown.
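
How the mode rules move uiLastIndex between calls can be illustrated with a standalone simulation. It is not the apgex implementation: a literal substring search stands in for the SABNF pattern match, and sim_ctx, sim_mode and sim_exec() are hypothetical names. The update rules follow the default, global and sticky descriptions given for vApgexPattern().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { SIM_DEFAULT, SIM_GLOBAL, SIM_STICKY } sim_mode;

typedef struct {
    size_t last_index; /* plays the role of uiLastIndex */
    sim_mode mode;
} sim_ctx;

/* Looks for needle in src, starting at ctx->last_index.
 * On a match, stores its offset in *begin and returns 1; otherwise returns 0.
 * ctx->last_index is updated according to the mode. */
int sim_exec(sim_ctx *ctx, const char *src, const char *needle, size_t *begin) {
    size_t slen, nlen, i;
    int found = 0;
    assert(ctx != NULL && src != NULL && needle != NULL && begin != NULL);
    slen = strlen(src);
    nlen = strlen(needle);
    assert(nlen > 0);
    for (i = ctx->last_index; i + nlen <= slen; i++) {
        if (memcmp(src + i, needle, nlen) == 0) {
            found = 1;
            break;
        }
        /* Sticky mode never searches forward: one attempt only. */
        if (ctx->mode == SIM_STICKY) break;
    }
    if (!found || ctx->mode == SIM_DEFAULT) {
        /* No match, or default mode: always reset to zero. */
        ctx->last_index = 0;
    } else {
        /* Global/sticky match: next character after the matched phrase,
         * or zero when the end of the source has been reached. */
        size_t next = i + nlen;
        ctx->last_index = (next >= slen) ? 0 : next;
    }
    if (found) *begin = i;
    return found;
}
```

With the source "ab-ab" and the needle "ab", two calls in global mode find the matches at offsets 0 and 3 and leave last_index at 2 and then 0. In sticky mode the second call fails, because the character at index 2 does not begin a match and no forward search is made. In default mode a search started at index 3 finds the match at offset 3 and resets last_index to 0.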

Definition at line 874 of file apgex.c.

◆ vpApgexCtor()

void* vpApgexCtor ( exception *  spEx)

The phrase-matching engine object constructor.

Parameters
spEx   Pointer to a valid exception structure initialized with XCTOR(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
Returns
Returns a pointer to the apgex object context.

Definition at line 239 of file apgex.c.

◆ vpApgexGetAst()

void* vpApgexGetAst ( void *  vpCtx)

Get a pointer to the AST object's context.

This function can be called after any of the phrase-matching functions.

The AST object will reflect the results of the last successful phrase match. If the last phrase match was unsuccessful the pointer is valid but the AST will have no records. Be sure to understand the meaning of the flags used and the result of the phrase-matching function called.

Following a successful phrase match, the AST object will have records for all rules and UDTs in the pattern, but all callback functions will be NULL. Use vAstSetRuleCallback() and vAstSetUdtCallback() to set the translation functions specific to the application prior to translation with vAstTranslate().

See sApgexProperties() for alternate access to this pointer.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
Returns
A pointer to the AST object's context. An exception is thrown if called with no pattern defined.

Definition at line 957 of file apgex.c.

◆ vpApgexGetParser()

void* vpApgexGetParser ( void *  vpCtx)

Get a pointer to the parser object's context.

This function can be called after any of the pattern-defining functions.

See sApgexProperties() for alternate access to this pointer.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
Returns
A pointer to the parser object's context. An exception is thrown if called with no pattern defined.

Definition at line 1009 of file apgex.c.

◆ vpApgexGetTrace()

void* vpApgexGetTrace ( void *  vpCtx)

Get a pointer to the trace object's context.

This function can be called after any of the pattern-defining functions.

The trace context pointer can be used to configure the trace prior to any of the phrase-matching functions.

See sApgexProperties() for alternate access to this pointer.

Parameters
vpCtx   A pointer to a valid apgex object context returned from vpApgexCtor(). If not valid the application will silently exit with a BAD_CONTEXT exit code.
Returns
A pointer to the trace object's context if the "t" flag was specified, NULL otherwise. An exception is thrown if called with no pattern defined.

Definition at line 984 of file apgex.c.

APG Version 7.0 is licensed under the 2-Clause BSD License,
an Open Source Initiative Approved License.