Birdman
Birdman

Reputation: 1524

Parsing a list of tokens in Python

GOAL: I have a list of tokens. Whenever there is a section of tokens surrounded by brackets, such as {t1,t2,etc.}, I need to make that into a new sublist. An example is provided below of my desired result.


parse:['(facto)', 'dup', 'length', '/n', 'exch', 'def', '/fact', '{', '0', 'dict', 'begin', '/n', 'exch', 'def', 'n', '2', 'lt', '{', '1', '}', '{', 'n', '1', 'sub', 'fact', 'n', 'mul', '}', 'ifelse', 'end', '}', 'def', 'n', 'fact', 'stack'])

returns: ['(facto)', 'dup', 'length', '/n', 'exch', 'def', '/fact', [0, 'dict', 'begin', '/n', 'exch', 'def', 'n', 2, 'lt', [1], ['n', 1, 'sub', 'fact', 'n', 'mul'], 'ifelse', 'end'], 'def', 'n', 'fact', 'stack']


Here is my code so far:

def parse1(L):
    newL = []
    for x in L:
        if x == '}':
            return newL
        elif x == '{': 
            newL.append(parse1(L[1:]))
        else:
            newL.append(x)
    return newL

It works to the extent that whenever a { occurs, I reclusively pass the rest of the list into the function again, with a base case being when } occurs. This is working okay, but once it exits the recursion and creates a sublist, the element "x" it's iterating through has not gone past this section. So for example, if our list is: ['{', '1', '}'], the result should simply be [[1]]. However, what is happening is it's returning [[1],'1'], because once it creates the sublist (which seems to be working fine) the next element "x" the loop is going through is actually an element of that sublist, it's the element right after the '{', and according to my code, it's then appended to the list.

I feel like this is a really easy fix, I have spent very long trying to figure it out though. I understand the issues, as I have explained, but cannot for the life of me figure out how to fix it. Any help would be greatly appreciated!

Upvotes: 0

Views: 1447

Answers (1)

Scott Hunter
Scott Hunter

Reputation: 49803

This is modeled on your attempt at a solution (and also handles the integers):

# This assumes brackets are properly balanced
def parse1(L):
    newL = []
    i = 0
    while i<len(L):
        x = L[i]
        if x == '}':
            # Return the parsed list & the unparsed portion of the original list
            print(newL, L[i:])
            return newL, L[i+1:]
        elif x == '{': 
            # Split rest of L into parsed & unparsed portions
            parsed, unparsed = parse1(L[i+1:])
            # Insert parsed portion into current list
            newL.append(parsed)
            # Reset i & L for unparsed portion
            i, L = 0, unparsed
        else:
            # Convert x to an integer if possible
            try:
                x = int(x)
            except:
                pass
            newL.append(x)
            i += 1
    return newL

Upvotes: 0

Related Questions