WinnyDaPoo

Reputation: 145

Parsing with Python regex and storing in a list

Good day everyone. I have a question about using regex in Python. Suppose I have a string of code consisting of several lines, each a formula that assigns a variable or a numeric value.

For example:

code1 = '''
g = -9
h = i + j
year = 2000
month = 0xA
date = 0b1101
sec = 1.8E3
d_1 = year + month
d_2 = date * sec
err = 0o0.1
'''

How can I parse this to get a list of sets of strings?

lst = [{Variable}, {numeric value}, {operand}, {None of the above}]

so it would be:

lst = [{g, h, i, j, year, month, date, sec, d_1, d_2, err}, {-9, 2000, 0xA, 0b1101, 1.8E3}, {=, +, *}, {0o0.1}]

What I did was use split() to separate each token and check whether it is an int, a str, or none of the above, and that works fine. But I would like to know how to do this with regex.

Upvotes: 0

Views: 54

Answers (1)

Mace

Reputation: 1410

Your desired output does not seem very logical, but your question is about the parsing. You can indeed use split for this, in several steps. This code parses the lines and creates a list of dictionaries, which you can use to build the result you want.

code1 = '''
g = -9
h = i + j
year = 2000
month = 0xA
date = 0b1101
sec = 1.8E3
d_1 = year + month
d_2 = date * sec
err = 0o0.1
'''

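lines = code1.split('\n')

results = []

# Step 1: split each assignment into its name and its right-hand side.
for line in lines:
    if '=' in line:
        # maxsplit=1 keeps any further '=' in the right-hand side intact
        key, right_part = line.split('=', 1)
        results.append({'key': key.strip(), 'right_part': right_part.strip()})

# Step 2: split the right-hand side into values and an optional operand.
for result in results:
    value_parts = result['right_part'].split(' ')
    if len(value_parts) == 3:
        # form "a <op> b": two values with an operand in between
        result['values'] = [value_parts[0], value_parts[2]]
        result['operand'] = value_parts[1]
    else:
        # a single value, no operand
        result['values'] = value_parts[0]
        result['operand'] = None

for result in results:
    print(result)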

Result

{'key': 'g', 'right_part': '-9', 'values': '-9', 'operand': None}
{'key': 'h', 'right_part': 'i + j', 'values': ['i', 'j'], 'operand': '+'}
{'key': 'year', 'right_part': '2000', 'values': '2000', 'operand': None}
{'key': 'month', 'right_part': '0xA', 'values': '0xA', 'operand': None}
{'key': 'date', 'right_part': '0b1101', 'values': '0b1101', 'operand': None}
{'key': 'sec', 'right_part': '1.8E3', 'values': '1.8E3', 'operand': None}
{'key': 'd_1', 'right_part': 'year + month', 'values': ['year', 'month'], 'operand': '+'}
{'key': 'd_2', 'right_part': 'date * sec', 'values': ['date', 'sec'], 'operand': '*'}
{'key': 'err', 'right_part': '0o0.1', 'values': '0o0.1', 'operand': None}
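
Since the question asks specifically about regex, here is a minimal sketch of one way to sort the tokens into the four sets with the re module. It assumes, as in code1, that every token is separated by whitespace (so '-9' is a number, while a '-' on its own would be an operator). The names NUMBER, VARIABLE, OPERATOR and classify are my own, picked for illustration, and the number pattern only covers the literal forms that appear in the example.

```python
import re

code1 = '''
g = -9
h = i + j
year = 2000
month = 0xA
date = 0b1101
sec = 1.8E3
d_1 = year + month
d_2 = date * sec
err = 0o0.1
'''

# Numeric literal: optional sign, then hex, binary, octal, or decimal
# (integer or float, with an optional exponent).
NUMBER = re.compile(r'''
    [+-]?
    (?:
        0[xX][0-9a-fA-F]+
      | 0[bB][01]+
      | 0[oO][0-7]+
      | (?:\d+\.?\d*|\.\d+)
        (?:[eE][+-]?\d+)?
    )
''', re.VERBOSE)

# Identifier: a letter or underscore, then letters, digits or underscores.
VARIABLE = re.compile(r'[A-Za-z_]\w*')

# A single assignment or arithmetic operator.
OPERATOR = re.compile(r'[-+*/=]')


def classify(code):
    """Return [variables, numbers, operators, other] as sets of strings."""
    variables, numbers, operators, other = set(), set(), set(), set()
    # \S+ picks out every run of non-whitespace characters as one token.
    for token in re.findall(r'\S+', code):
        # fullmatch requires the whole token to fit the pattern, so a
        # malformed literal such as '0o0.1' fails every test.
        if NUMBER.fullmatch(token):
            numbers.add(token)
        elif VARIABLE.fullmatch(token):
            variables.add(token)
        elif OPERATOR.fullmatch(token):
            operators.add(token)
        else:
            other.add(token)
    return [variables, numbers, operators, other]


lst = classify(code1)
for group in lst:
    print(sorted(group))
```

For code1 the last set contains only '0o0.1' and the operator set contains '=', '+' and '*'. The values stay strings; if you need real numbers, int(token, 0) handles the prefixed integer forms and float(token) the rest.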

Upvotes: 1
