haster8558

Reputation: 463

Python simple brackets parser

I basically have a file with this structure:

root \
{
  field1 {
    subfield_a {
      "value1"
    }
    subfield_b {
      "value2"
    }
    subfield_c {
      "value1"
      "value2"
      "value3"
    }
    subfield_d {
    }
  }
  field2 {
    subfield_a {
      "value1"
    }
    subfield_b {
      "value1"
    }
    subfield_c {
      "value1"
      "value2"
      "value3"
      "value4"
      "value5"
    }
    subfield_d {
    }
  }
}

I want to parse this file with Python to get a nested list containing all the values of a specific subfield (for example, subfield_c), with one inner list per occurrence. E.g.:

tmp = magic_parse_function("subfield_c", file)
print(tmp[0])  # ["value1", "value2", "value3"]
print(tmp[1])  # ["value1", "value2", "value3", "value4", "value5"]

I'm pretty sure I have to use the pyparsing module, but I don't know where to start in setting up the (regex?) expression. Can someone give me some pointers?

Upvotes: 0

Views: 1127

Answers (1)

PaulMcG

Reputation: 63719

You can let pyparsing take care of matching and iterating over the input. Just define what you want it to match, and pass it the body of the file as a string:

def magic_parse_function(fld_name, source):
    from pyparsing import Keyword, nestedExpr

    # define parser
    parser = Keyword(fld_name).suppress() + nestedExpr('{','}')("content")

    # search input string for matching keyword and following braced content
    matches = parser.searchString(source)

    # remove quotation marks
    return [[qs.strip('"') for qs in r[0].asList()] for r in matches]

# read content of file into a string 'file_body' and pass it to the function
tmp = magic_parse_function("subfield_c",file_body)

print(tmp[0])
print(tmp[1])

prints:

['value1', 'value2', 'value3']
['value1', 'value2', 'value3', 'value4', 'value5']
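
If installing pyparsing is not an option, something similar can be sketched with only the standard `re` module. This is not the approach above, just a rough alternative, and it rests on two assumptions: the subfield you search for contains no nested braces, and the values never contain a double quote or a brace. The name `find_subfield_values` and the shortened `sample` input are made up for illustration.

```python
import re

def find_subfield_values(fld_name, source):
    """Return one list of unquoted values per occurrence of fld_name."""
    # Match the field name as a whole word, followed by a flat brace block.
    # Assumption: the block itself contains no nested '{' or '}'.
    block = re.compile(r'\b' + re.escape(fld_name) + r'\b\s*\{([^{}]*)\}')
    # Within each block body, collect the text between double quotes.
    return [re.findall(r'"([^"]*)"', body) for body in block.findall(source)]

# Shortened, hypothetical input in the same shape as the file in the question
sample = '''
root {
  field1 {
    subfield_c {
      "value1"
      "value2"
    }
    subfield_d {
    }
  }
  field2 {
    subfield_c {
      "value3"
    }
  }
}
'''

print(find_subfield_values("subfield_c", sample))
# [['value1', 'value2'], ['value3']]
```

An empty block such as subfield_d yields an empty inner list. For anything with nesting inside the searched field, the nestedExpr version above is the safer choice, since a regular expression like this one cannot track brace depth.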

Upvotes: 1
