Python parse complex command output

Question

I need to parse the output of a command in Python. The command returns something like this:

A:
        2 bs found
        3 cs found
B:
        1 a found
        3 bs found
C:
        1 c found
        D:
                2 es found
                3 fs found

I need to be able to do the following with the output:

access a.bs found, b.a found, c.d.es found, and so on.

How do I do this in Python? What data structure is best suited for this?

The goal of this exercise is to run the command every 10 seconds and identify what has changed between runs.

kampu · Accepted Answer

This should have a 'parsing' tag as it's a general parsing problem.

The normal solution in this kind of situation is to track, as you read in lines, a) the current indentation and b) the list of structures that are currently being parsed. b is used as a stack: it begins as a list containing a single empty dict, i.e. curparsing = [{}], and the last item is always the dictionary being filled in.

Loop over all input lines. For example:

with open('inputfilename','r') as f:
    for line in f:
        # code implementing the below rules.

If a line is blank (if not line.strip():), ignore it and go on to the next one (continue).
If the indentation level has decreased, remove the top item from the currently-parsing list (i.e. curparsing.pop()). If the indentation has dropped by several levels, remove one item per level.
Strip off any leading indentation with line = line.lstrip().
If ':' is in the line, then we've found a sub-dictionary. Read the key (the part to the left of ':'), increase the indent level, create a new dictionary, and insert it under that key into the dictionary at the current top of the list. Then append the newly created dictionary to the list.
If line[0].isdigit(), then we've found a report of the form '[count] [character]s found'. We can use a regular expression to find the count and the character: m = re.match(r'(\d+) ([a-z])', line); count, character = m.groups(); count = int(count). We then store this in the dictionary at the current top of the list: curparsing[-1][character] = count
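Put together, the rules above could look something like the sketch below. The names parse_report and indent_width are made up for illustration, and it varies the bookkeeping slightly: each stack entry stores the indentation of its header line next to its dictionary, so "pop once per decreased level" becomes a simple loop. It assumes indentation is consistent (tabs are expanded to 8 columns) and that every count line looks like the sample.

```python
import re


def indent_width(line):
    """Count leading whitespace columns, expanding tabs to 8 columns."""
    expanded = line.expandtabs(8)
    return len(expanded) - len(expanded.lstrip())


def parse_report(lines):
    """Parse indented 'Key:' / 'N xs found' lines into nested dicts."""
    root = {}
    # Each entry is (indentation of the header line, dict being filled).
    # The root uses -1 so that no real line can ever pop it.
    stack = [(-1, root)]
    for raw in lines:
        if not raw.strip():
            continue  # blank line
        indent = indent_width(raw)
        line = raw.strip()
        # Leave every section whose header is indented at least this far.
        while stack[-1][0] >= indent:
            stack.pop()
        current = stack[-1][1]
        if ':' in line:
            # Section header: create a sub-dictionary and descend into it.
            key = line.split(':', 1)[0].strip()
            child = {}
            current[key] = child
            stack.append((indent, child))
        else:
            # Count line such as '2 bs found': keep the count and the letter.
            m = re.match(r'(\d+) ([a-z])', line)
            if m:
                count, character = m.groups()
                current[character] = int(count)
    return root


sample = [
    "A:",
    "        2 bs found",
    "        3 cs found",
    "B:",
    "        1 a found",
    "        3 bs found",
    "C:",
    "        1 c found",
    "        D:",
    "                2 es found",
    "                3 fs found",
]
parsed = parse_report(sample)
print(parsed["C"]["D"]["e"])  # 2
```

Nested dicts give access in the form parsed["A"]["b"] rather than a.bs; lines that match neither pattern are silently skipped here, which may or may not be what you want.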

That's pretty much it. You just loop over lines and apply these rules to each line, and at the end, curparsing[0] contains the parsed document.
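Since the stated goal is to spot what changed between runs, two parsed documents can then be compared recursively. The following is only a rough sketch: diff_nested is a hypothetical helper, it assumes both inputs are nested dicts with string keys as produced by the rules above, and it leaves out actually running the command every 10 seconds.

```python
def diff_nested(old, new, path=()):
    """Return a list of (dotted_path, old_value, new_value) for each difference."""
    changes = []
    for key in sorted(set(old) | set(new)):
        here = path + (key,)
        a, b = old.get(key), new.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            # Both sides have a section here: compare their contents.
            changes.extend(diff_nested(a, b, here))
        elif a != b:
            # Changed, added (a is None) or removed (b is None).
            changes.append(('.'.join(here), a, b))
    return changes


before = {"A": {"b": 2, "c": 3}, "C": {"c": 1, "D": {"e": 2}}}
after = {"A": {"b": 2, "c": 4}, "C": {"c": 1, "D": {"e": 2, "f": 3}}}
for change in diff_nested(before, after):
    print(change)
# ('A.c', 3, 4)
# ('C.D.f', None, 3)
```

A whole section that appears or disappears is reported as a single entry whose old or new value is None.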
