Ysura

Python BNF Parser (String to Tree to String) on STIL (Standard Test Interface Language)

I am trying to write a STIL parser in Python. I would like to parse a STIL file, modify some parts of it, and write the result to a new STIL file. For example:

I tried Lark, a Python parsing library that builds a parser from an EBNF-style grammar.

I managed to write a grammar that works on my example text files. It is a little crude since I am new to BNF parsers, and some terminals are likely missing.

What approach should I take for the parsing? I could write a Transformer that turns the Tree into a dict, but I am not sure how to get from there back to text.
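
To make the "back to text" half concrete, here is a minimal sketch of a recursive serializer. It does not use Lark: `Node` is a hypothetical stand-in that only mimics the `data`/`children` shape of a Lark `Tree`, and the two node kinds (`block`, `statement`) are assumptions made for illustration, not rules from the grammar below. Lark also ships a `Reconstructor` class in `lark.reconstruct` that may cover this use case; the sketch only shows the general idea of walking the tree and emitting text.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """Hypothetical stand-in for a parse-tree node (rule name + children)."""
    data: str
    children: List[Union["Node", str]] = field(default_factory=list)


def to_text(node, indent=0):
    """Serialize a Node back to STIL-like text, one statement per line."""
    pad = "    " * indent
    if node.data == "statement":
        # Leaf statement: children are plain strings, joined and closed with ';'
        return pad + " ".join(node.children) + ";\n"
    # Block: the first child is the block keyword, the rest are nested nodes
    header, *body = node.children
    inner = "".join(to_text(child, indent + 1) for child in body)
    return f"{pad}{header} {{\n{inner}{pad}}}\n"


signals = Node("block", [
    "Signals",
    Node("statement", ["clk", "In"]),
    Node("statement", ["dout", "Out"]),
])

# Modify the tree in place, then write it back out as text
signals.children[2].children[1] = "InOut"
print(to_text(signals), end="")
```

The serializer decides the layout itself, so comments and the original spacing of the input file are not preserved with this approach.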

NB: I used the partial BNF from the IEEE 1450 working group as a basis: https://grouper.ieee.org/groups/1450/index.html

Here is the grammar used:

stil_grammar = r"""
    start: stil_session

# 1.0 STIL Organization
    stil_session: stil header? session
    session: block+
    block: user_keywords
        | user_functions
        | signals
        | signal_groups
        | pattern_exec
        | pattern_burst
        | timing
        | spec
        | selector
        | scan_structs
        | pattern
        | procedures
        | macrodefs
        | include | annotation | udb #@TODO | (null)

# 2.0 STIL Statement
    stil: "STIL" STIL_VERSION_NUMBER ";"
    STIL_VERSION_NUMBER: INTEGER "." INTEGER

# 3.0 Header Block
    TITLE: "Title"
    DATE: "Date"
    SOURCE: "Source"
    HISTORY: "History"
    
    header: "Header" "{" header_list "}"
    header_list: header_item+
    header_item: TITLE STRING ";"
                | DATE STRING ";"
                | SOURCE STRING ";"
                | HISTORY "{" history_list "}"
                | include | annotation | udb #@TODO: | (null)
    history_list: annotation+

# 4.0 Include Statement
    include: "Include" FILE_NAME ("IfNeed" BLOCKTYPE)? ";" stil header?
    BLOCKTYPE: "Include"
            | "Header"
            | "UserKeywords"
            | "UserFunctions"
            | "Signals"
            | "SignalGroups"
            | "PatternExec"
            | "PatternBurst"
            | "Timing"
            | "Spec"
            | "Selector"
            | "ScanStructures"
            | "Pattern"
            | "Procedures"
            | "MacroDefs"
            | "Ann"
    FILE_NAME: IDENTIFIER

# 5.0 UserKeywords Statement
    user_keywords: "UserKeywords" USER_DEFINED_KEYWORDS ";"
    USER_DEFINED_KEYWORDS: IDENTIFIER+
    udb: IDENTIFIER "{" UDB_TEXT "}"
        | IDENTIFIER UDB_2_TEXT ";"
    UDB_TEXT: STRING #@TODO: any sequence of characters, with the restriction that any '{' be matched with '}'
    UDB_2_TEXT: /[^{};]+/

# 6.0 UserFunctions Statement
    user_functions: "UserFunctions" USER_DEFINED_FUNCTION ";"
    USER_DEFINED_FUNCTION: IDENTIFIER+

# 7.0 Ann Statement
    annotation: "Ann" "{*" ANN_TEXT "*}"
    ANN_TEXT: /[^*]+/

# 8.0 Signals Block
    SCANIN: "ScanIn"
    SCANOUT: "ScanOut"
    BASE: "Base"
    ALIGNMENT: "Alignment"
    DATABITCOUNT: "DataBitCount"

    signals: "Signals" "{" signals_list "}"
    signals_list: signals_item+
    signals_item: SIGNAL_NAME_ARRAY_OPT SIGNAL_TYPE ";"
                | SIGNAL_NAME_ARRAY_OPT SIGNAL_TYPE "{" sig_statements? "}"
                | include | annotation | udb #@TODO: | (null)
    SIGNAL_NAME_ARRAY_OPT:  SIGNAL_NAME | IDENTIFIER "[" INTEGER ".." INTEGER "]"
    SIGNAL_NAME: IDENTIFIER | IDENTIFIER "[" INTEGER "]"
    SIGNAL_TYPE: "In" | "Out" | "InOut" | "Supply" | "Pseudo"
    sig_statements: sig_statement+
    sig_statement: TERMINATIONS
                | DEFAULT_STATE_STMT
                | SCANIN [ INTEGER ] ";"
                | SCANOUT [ INTEGER ] ";"
                | BASE BASE_TYPE ";"
                | ALIGNMENT ORIENT_TYPE ";"
                | DATABITCOUNT INTEGER ";"
                | annotation | include | udb #@TODO | (null)
    TERMINATIONS: "Termination" TERMINATION_STATE ";"
    TERMINATION_STATE: "TerminateHigh" | "TerminateLow" | "TerminateOff" | "TerminateUnknown"
    DEFAULT_STATE_STMT: "DefaultState" DEFAULT_STATE ";"
    DEFAULT_STATE: "U" | "ForceUp"
                | "D" | "ForceDown"
                | "Z" | "ForceOff"
    BASE_TYPE: "Hex" WFCS
            | "Dec" WFCS
    ORIENT_TYPE: "LSB" | "MSB"

# 9.0 SignalGroups Block
    signal_groups: "SignalGroups" DOMAIN_NAME? "{" groups_list "}"
    DOMAIN_NAME: IDENTIFIER
    groups_list: groups_item+
    groups_item: GROUP_NAME "=" sigref_expr ";"
                | GROUP_NAME "=" sigref_expr "{" sig_statements "}"
                | annotation | include | udb #@TODO | (null)
    GROUP_NAME: IDENTIFIER
    sigref_expr: SIGNAL_OR_GROUP_NAME
                | "'" grp_name_exp_list "'"
    grp_name_exp_list: "("? SIGNAL_OR_GROUP_NAME (PLUS_OR_MINUS SIGNAL_OR_GROUP_NAME)* ")"? #@TODO: Check this change in rule?
    SIGNAL_OR_GROUP_NAME: SIGNAL_NAME_ARRAY_OPT | GROUP_NAME
    PLUS_OR_MINUS: "+" | "-"

# 10.0 PatternExec Block
    pattern_exec: "PatternExec" pat_exec_name? "{" pat_exec_list_items? "}"
    pat_exec_name: IDENTIFIER
    pat_exec_list_items: pat_exec_item+
    pat_exec_item: "Timing" TIMING_NAME ";"
                | "PatternBurst" PAT_BURST_NAME ";"
                | "Category" CATEGORY_NAME ";"
                | "Selector" SELECTOR_NAME ";"
                | annotation | include | udb #@TODO | (null)
    CATEGORY_NAME: IDENTIFIER
    SELECTOR_NAME: IDENTIFIER # already defined
    TIMING_NAME: IDENTIFIER

# 11.0 Pattern Burst Block
    pattern_burst: "PatternBurst" PAT_BURST_NAME "{" pat_burst_stmnts? "}"
    PAT_BURST_NAME: IDENTIFIER
    pat_burst_stmnts: pat_burst_stmnt+
    pat_burst_stmnt: "SignalGroups" GROUPS_DOMAIN ";"
                    | "MacroDefs" SCAN_MACROS_DOMAIN ";"
                    | "Procedures" PROCEDURES_DOMAIN ";"
                    | "ScanStructures" SCAN_NAME ";"
                    | "Start" PAT_LABEL ";"
                    | "Stop" PAT_LABEL ";"
                    | "Termination" "{" termination_statements? "}"
                    | "PatList" "{" pat_list_items "}"
                    | annotation | include | udb #@TODO | (null)
    pat_list_items: pat_list_item+
    pat_list_item: PAT_BURST_NAME ";"
                | PAT_BURST_NAME "{" pat_list_stmts? "}"
    pat_list_stmts: pat_list_stmt+
    pat_list_stmt:  "SignalGroups" GROUPS_DOMAIN ";"
                    | "MacroDefs" SCAN_MACROS_DOMAIN ";"
                    | "ScanStructures" SCAN_NAME ";"
                    | "Start" PAT_LABEL ";"
                    | "Stop" PAT_LABEL ";"
                    | "Procedures" PROCEDURES_DOMAIN ";"
                    | "Termination" "{" termination_statements? "}"
                    | annotation | include | udb #@TODO | (null)
    GROUPS_DOMAIN: IDENTIFIER
    SCAN_MACROS_DOMAIN: IDENTIFIER
    SCAN_NAME: IDENTIFIER #@TODO: Not in block? Check definition
    PROCEDURES_DOMAIN: IDENTIFIER
    PAT_LABEL: IDENTIFIER
    termination_statements: termination_statement+
    termination_statement: sigref_expr TERMINATION_STATE ";"

# 12.0 Timing Block and WaveformTable Block
    PERIOD: "Period"
    WAVEFORMS: "Waveforms"
    INHERITWAVEFORMTABLE: "InheritWaveformTable"
    SUBWAVEFORMS: "SubWaveforms"
    INHERITWAVEFORM: "InheritWaveform"
    DURATION: "Duration"

    timing: "Timing" TIMING_LABEL? "{" timing_list? "}"
    timing_list: timing_item+
    timing_item: "WaveformTable" WFT "{" wft_list "}" 
                | "SignalGroups" DOMAIN_NAME ";"
                | annotation | include | udb #@TODO | (null)

    WFT: IDENTIFIER
    TIMING_LABEL: IDENTIFIER
    CELL: IDENTIFIER
    wft_list: wft_item+
    wft_item: PERIOD "'" time_expr "'" ";"
            | WAVEFORMS "{" waveforms_list "}"
            | INHERITWAVEFORMTABLE (TIMING_LABEL ".")? WFT ";"
            | SUBWAVEFORMS "{" subwaveforms_list "}"
            | annotation | include | udb #@TODO | (null)

    waveforms_list: waveforms_item+
    waveforms_item: sigref_expr LABEL? "{" waveform_items "}"
    waveform_items: waveform_item+
    waveform_item: INHERITWAVEFORM ((TIMING_LABEL ".")? WFT ".")? CELL ";"
                | WFC "{" wfc_def_list "}"
                | WFCS "{" wfcs_def_list "}"
                | annotation | include | udb #@TODO | (null)
    subwaveforms_list: subwaveforms_item+
    subwaveforms_item: SWF_LABEL ":" DURATION "'" time_expr "'" "{" sub_def_list "}"
                | annotation | include | udb #@TODO | (null)
    SWF_LABEL: IDENTIFIER
    wfc_def_list: wfc_definition+
    wfcs_def_list: wfcs_definition+
    sub_def_list: sub_definition+
    wfc_definition: (LABEL ":")? "'" time_expr "'" EVENT ";"
                | (LABEL ":")? "'" time_expr "'" ";"
                | (LABEL ":")? EVENT ";"
                | (LABEL ":")? ("'" time_expr "'")? (REPEAT INTEGER)? SWF_LABEL ";"
                | (LABEL ":")? ("'" time_expr "'")? (REPEAT INTEGER)? SWF_LABEL "[" INTEGER "]" ";"
                | (LABEL ":")? ("'" time_expr "'")? (REPEAT INTEGER)? SWF_LABEL "[" "#" "]" ";"
                | INHERITWAVEFORM (((TIMING_LABEL ".")? WFT ".")? CELL ".")? WFC ";"
                | annotation | include | udb #@TODO | (null)

    wfcs_definition : (LABEL ":")? "'" time_expr "'" EVENTS ";"
                    | (LABEL ":")? "'" time_expr "'" EVENTS "[" INTEGER "]" ";"
                    | (LABEL ":")? "'" time_expr "'" ";"
                    | (LABEL ":")? EVENTS ";"
                    | (LABEL ":")? EVENTS "[" INTEGER "]" ";"
                    | (LABEL ":")? ("'" time_expr "'")? (REPEAT INTEGER)? SWF_LABEL ";"
                    | (LABEL ":")? ("'" time_expr "'")? (REPEAT INTEGER)? SWF_LABEL "[" INTEGER "]" ";"
                    | (LABEL ":")? ("'" time_expr "'")? (REPEAT INTEGER)? SWF_LABEL "[" "#" "]" ";"
                    | INHERITWAVEFORM (((TIMING_LABEL ".")? WFT ".")? CELL ".")? WFC ";"
                    | annotation | include | udb #@TODO | (null)
    sub_definition : "'" time_expr "'" EVENTS ";"
                    | "'" time_expr "'" EVENTS "[" INTEGER "]" ";"
                    | "'" time_expr "'" ";"
                    | EVENTS ";"
                    | EVENTS "[" INTEGER "]" ";"
                    | LABEL ":"
                    | annotation | include | udb #@TODO | (null)
    WFC: LETTER | DIGIT | "#" | "%"
    WFCS : WFC+
    time_expr : time_expr "+" time_expr
            | time_expr "-" time_expr
            | time_expr "*" time_expr
            | time_expr "/" time_expr
            | "+" time_expr
            | "-" time_expr
            | "@" time_expr
            | FUNCTION "(" function_args? ")"
            | time_expr "==" time_expr
            | time_expr "<=" time_expr
            | time_expr ">=" time_expr
            | time_expr "<" time_expr
            | time_expr ">" time_expr
            | time_expr "!=" time_expr
            | time_expr "?" time_expr ":" time_expr
            | "(" time_expr ")"
            | DECIMAL
            | DECIMAL ENGINEERING_UNITS
            | REF_VARNAME
    ENGINEERING_UNITS: ENGINEERING_PREFIX? ENGINEERING_UNIT
    ENGINEERING_PREFIX: "E" | "P" | "T" | "G" | "M" | "k" | "m" | "u" | "n" | "p" | "f" | "a"
    ENGINEERING_UNIT: "A" | "Cel" | "F" | "H" | "Hz" | "m" | "Ohm" | "s" | "W" | "V"
    REF_VARNAME: IDENTIFIER
    EVENTS: EVENT ("/" EVENT)*
    EVENT: "D" | "ForceDown"
        | "U" | "ForceUp"
        | "Z" | "ForceOff"
        | "P" | "ForcePrior"
        | "L" | "CompareLow"
        | "H" | "CompareHigh"
        | "x" | "X" | "CompareUnknown"
        | "T" | "CompareOff"
        | "V" | "CompareValid"
        | "l" | "CompareLowWindow"
        | "h" | "CompareHighWindow"
        | "t" | "CompareOffWindow"
        | "v" | "CompareValidWindow"
        | "N" | "ForceUnknown"
        | "A" | "LogicLow"
        | "B" | "LogicHigh"
        | "F" | "LogicZ"
        | "?" | "Unknown"
        | "G" | "ExpectHigh"
        | "R" | "ExpectLow"
        | "Q" | "ExpectOff"
        | "M" | "Marker"
    FUNCTION: "min"
            | "max"
            | IDENTIFIER #(note: allowed IDENTIFIERs are declared in user_functions stmt)
    function_args: time_expr | function_args "," time_expr

# 13.0 Spec and Selector Block
    spec: "Spec" SPEC_NAME? "{" spec_list? "}"
    SPEC_NAME: IDENTIFIER
    spec_list: spec_item+
    spec_item: "Category" CAT_NAME "{" var_spec_info? "}"
            | "Variable" VAR_NAME "{" [ cat_spec_info ] "}"
            | include | annotation | udb  #@TODO | (null)
    CAT_NAME: IDENTIFIER
    VAR_NAME: IDENTIFIER
    var_spec_info: var_spec_info_item+
    cat_spec_info: cat_spec_info_item+
    var_spec_info_item: VAR_NAME "=" "'" time_expr "'" ";"
                    | VAR_NAME "{" ("Min" "'" time_expr "'" ";")? ("Typ" "'" time_expr "'" ";")? ("Max" "'" time_expr "'" ";")? "}"
                    | include | annotation | udb  #@TODO | (null)
    cat_spec_info_item: CAT_NAME "'" time_expr "'" ";"
                    | CAT_NAME "{" ("Min" "'" time_expr "'" ";")? ("Typ" "'" time_expr "'" ";")? ("Max" "'" time_expr "'" ";")? "}"
                    | include | annotation | udb  #@TODO | (null)
    selector: "Selector" SPEC_SELECTOR_NAME "{" selector_list? "}"
    SPEC_SELECTOR_NAME: IDENTIFIER
    selector_item: VAR_NAME SELECTOR_TYPE ";"
    selector_list: selector_item+
    SELECTOR_TYPE: "Min" | "Typ" | "Max" | "Meas"

# 14.0 ScanStructures Block
    scan_structs: "ScanStructures" SCAN_NAME? "{" scanchains? "}"
    scanchains: scanchain+
    scanchain: "ScanChain" CHAINNAME "{" scan_struct_list? "}"
            | include | annotation | udb  #@TODO | (null)
    CHAINNAME: IDENTIFIER
    scan_struct_list: scan_struct_item+
    scan_struct_item: "ScanLength" INTEGER ";"
                    | "ScanOutLength" INTEGER ";"
                    | "ScanCells" CELLNAME_LIST ";"
                    | SCANIN SIGNAL_NAME ";"
                    | SCANOUT SIGNAL_NAME ";"
                    | "ScanMasterClock" SIGNAL_NAME ";"
                    | "ScanSlaveClock" SIGNAL_NAME ";"
                    | "ScanInversion" BIT ";"
                    | include | annotation | udb # | (null)
    CELLNAME_LIST: CELLNAME+
    CELLNAME: IDENTIFIER | "!" IDENTIFIER
    BIT: "0" | "1"

# 15.0 Pattern Block
    LOOP: "Loop"


    pattern: "Pattern" PATTERN_NAME "{" pattern_statements? "}"
    PATTERN_NAME: IDENTIFIER
    chain_name: IDENTIFIER #@TODO: Not in the BNF
    pattern_statements: pattern_stmt+
    pattern_stmt: LABEL? pat_stmt
    pat_stmt: WAVEFORM_TABLE_STMT WFT ";"
            | LOOP INTEGER "{" pattern_statements? "}"
            | "MatchLoop" INTEGER "{"pattern_statements "BreakPoint" "{"pattern_statements"}" "}"
            | "MatchLoop" "Infinite" "{"pattern_statements "BreakPoint" "{"pattern_statements"}""}"
            | vector_stmt
            | condition_stmt
            | "Call" PROCEDURE_NAME ";"
            | "Call" PROCEDURE_NAME "{" vec_data "}"
            | "Macro" MACRO_NAME ";"
            | "Macro" MACRO_NAME "{" vec_data "}"
            | "GoTo" PAT_LABEL ";"
            | "Stop" ";"
            | "ScanChain" chain_name ";"
            | "BreakPoint" ";"
            | "BreakPoint" "{" pattern_statements "}"
            | "IddqTestPoint" ";"
            | "TimeUnit" "'" time_def "'" ";"
            | include | annotation | udb #@TODO | (null)
    WAVEFORM_TABLE_STMT: "W" | "WaveformTable"
    LABEL: IDENTIFIER ":"
    non_cyclized_data: "@" TIME_VALUE event_pair ";"
                    | "@" TIME_VALUE "{" event_pair_list? "}"
    event_pair: sigref_expr "=" EVENT
                | include | annotation | udb #@TODO | (null)
    event_pair_list: event_pair
                    | event_pair_list ";" event_pair
    vector_stmt: "V" "{" vec_data "}"
                | "Vector" "{" vec_data "}"
    condition_stmt: "C" "{" vec_data "}"
                | "Condition" "{" vec_data "}"
    TIME_VALUE: INTEGER
    time_def: DECIMAL ENGINEERING_UNITS?
    vec_data: vec_data_block*
    vec_data_block: sigref_expr "=" vec_data_string ";"
                     | sigref_expr "{" vec_data_strings "}"
                     | non_cyclized_data
                     | include | annotation | udb #@TODO | (null)
    vec_data_strings: vec_data_string+ ";"
                    | include | annotation | udb #@TODO | (null)
    vec_data_string: wfc_data_string #(Note: string type is runtime dependent based on the sig_refs Base definition)
                    | hex_data_string 
                    | dec_data_string 
    wfc_mode: "\\w " wfc_data_string
    hex_mode: "\\h " hex_data_string
            | "\\h" WFCS hex_data_string
    dec_mode: "\\d " dec_data_string
            | "\\d" WFCS dec_data_string
    wfc_data_string: wfc_data+
    wfc_data: WFCS
            | REPEAT INTEGER WFCS
            | hex_mode
            | dec_mode
    hex_data_string: hex_data+
    hex_data: HEXCHARS
            | REPEAT INTEGER HEXCHARS
            | wfc_mode
            | dec_mode
    dec_data_string: dec_data_string dec_data
                    | dec_data
    dec_data: INTEGER
            | REPEAT INTEGER INTEGER
            | wfc_mode
            | hex_mode

# 16.0 Procedures Block
    procedures: "Procedures" PROCEDURE_DOMAIN_NAME? "{" procedure_definitions? "}"
    PROCEDURE_DOMAIN_NAME: IDENTIFIER
    procedure_definitions: procedure+
    PROCEDURE_NAME: IDENTIFIER
    procedure: PROCEDURE_NAME "{" procedure_statements? "}"
            | include | annotation | udb #@TODO | (null)
    procedure_statements: procedure_or_macro_item+
    procedure_or_macro_item: "Shift" "{" pattern_statements "}"
                            | pattern_stmt

# 17.0 Macrodefs Block
    macrodefs: "MacroDefs" MACRO_DOMAIN_NAME? "{" macro_definitions? "}"
    MACRO_DOMAIN_NAME: IDENTIFIER
    macro_definitions: macro
                    | macro_definitions macro
    macro: MACRO_NAME "{" macro_statements? "}"
                    | include | annotation | udb #@TODO: | (null)
    MACRO_NAME: IDENTIFIER
    macro_statements: procedure_or_macro_item+

# 18.0 Other MisCELLaneous Statements
    IDENTIFIER: IDENTIFIER_SEGMENT  ("." IDENTIFIER_SEGMENT)*
    IDENTIFIER_SEGMENT: SIMPLE_IDENTIFIER | ESCAPED_STRING
    SIMPLE_IDENTIFIER: LETTER_OR_UNDERLINE SIMPLE_CHARACTERS?
    SIMPLE_CHARACTERS: SIMPLE_CHARACTER+
    LETTER_OR_UNDERLINE: LETTER | UNDERLINE
    SIMPLE_CHARACTER: LETTER | DIGIT | UNDERLINE
    LETTER: UPPER_CASE_LETTER | LOWER_CASE_LETTER
    UPPER_CASE_LETTER: /[A-Z]/
    LOWER_CASE_LETTER: /[a-z]/
    UNDERLINE: "_"
    ESCAPED_IDENTIFIER: ESCAPED_STRING
    ESCAPED_CHARACTERS: ESCAPED_CHARACTER+
    ESCAPED_CHARACTER:  SIMPLE_CHARACTER | SPECIAL_CHARACTER | WHITESPACE_CHARACTER
    SPECIAL_CHARACTER: "!"|"@"|"#"|"$"|"%"|"^"|"&"|"*"|"("|")"|"-"|"+"|"="|"|"|"`"|"~"|"{"|"["|"}"|"]"|":"|";"|","|"'"|"."|">"|"/"|"?"|"\\"
    WHITESPACE_CHARACTER: " " | "\\t" | "\\n"
    STRING: ESCAPED_IDENTIFIER
    HEXDIGIT: DIGIT | "a" | "A" | "b" | "B" | "c" | "C" | "d" | "D" | "e" | "E" | "f" | "F"
    DIGIT: "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
    HEXDIGITS: HEXDIGIT+
    INTEGER: DIGIT+
    SIGNED_INTEGER: INTEGER | "-" INTEGER
    DECIMAL: SIGNED_INTEGER
            | SIGNED_INTEGER "." INTEGER
            | SIGNED_INTEGER "e" SIGNED_INTEGER
            | SIGNED_INTEGER "." INTEGER "e" SIGNED_INTEGER

# Others
    HEXCHARS: HEXCHAR+
    HEXCHAR: /[0-9a-fA-F]/

    REPEAT: "\\r"


    %import common.ESCAPED_STRING

    %import common.C_COMMENT
    %ignore C_COMMENT

    %import common.CPP_COMMENT
    %ignore CPP_COMMENT

    %import common.WS
    %ignore WS

    %import common.NEWLINE
    %ignore NEWLINE
"""

Can someone help me with this? Thank you.
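
Because the grammar above ignores whitespace and comments (`%ignore WS`, `%ignore C_COMMENT`, `%ignore CPP_COMMENT`), text regenerated from the tree alone would lose them. An alternative worth considering is to leave the original text untouched and only splice replacements in at known character offsets. Lark tokens carry `start_pos` and `end_pos` attributes that could supply those offsets; in the sketch below a regular expression is used as a stand-in so the example runs without Lark. The helper name `apply_edits` and the sample STIL text are assumptions made for illustration.

```python
import re


def apply_edits(source, edits):
    """Apply (start, end, replacement) edits given as character offsets."""
    # Work from the end of the text so that earlier offsets stay valid
    for start, end, replacement in sorted(edits, reverse=True):
        source = source[:start] + replacement + source[end:]
    return source


stil_text = (
    "STIL 1.0;\n"
    "Signals {\n"
    "    clk In;   // main clock\n"
    "    dout Out;\n"
    "}\n"
)

# Stand-in for offsets that would come from the parser:
# locate the signal type that follows 'dout'
match = re.search(r"dout\s+(Out)", stil_text)
edits = [(match.start(1), match.end(1), "InOut")]

new_text = apply_edits(stil_text, edits)
print(new_text, end="")
```

Everything outside the edited spans, including the comment and the indentation, comes through unchanged. The sketch assumes the edits do not overlap.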
