user528025

Reputation: 770

How can I parse a formatted file into variables using Python?

I have a pre-formatted text file with some variables in it, like this:

header one
   name = "this is my name"
   last_name = "this is my last name"
   addr = "somewhere"
   addr_no = 35
header
header two
   first_var = 1.002E-3
   second_var = -2.002E-8
header 

As you can see, each scope starts with the string header followed by the name of the scope (one, two, etc.) and ends with a bare header line.

I can't figure out how to programmatically parse those options using Python so that they would be accessible to my script in this manner:

one.name = "this is my name"
one.last_name = "this is my last name"
two.first_var = 1.002E-3

Can anyone point me to a tutorial or a library or to a specific part of the docs that would help me achieve my goal?

Upvotes: 2

Views: 2260

Answers (3)

perreal

Reputation: 97918

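def get_section(f):
    """Read one section: lines up to and including the closing 'header'."""
    section = []
    for line in f:
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. at the end of the file
        section.append(line)
        if line == 'header':
            break
    return section

sections = dict()
with open('input') as f:
    while True:
        section = get_section(f)
        if not section: break
        section_dict = dict()
        # The first line is "header <name>"; keep the name.
        section_dict['sname'] = section[0].split()[1]
        # Skip the opening line and the closing 'header' line.
        for param in section[1:-1]:
            k, v = [x.strip() for x in param.split('=', 1)]
            section_dict[k] = v  # values stay strings, quotes included
        sections[section_dict['sname']] = section_dict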

print sections['one']['name']

You can also access these sections as attributes:

class Section:
    def __init__(self, d):
        self.__dict__ = d

one = Section(sections['one'])
print one.name

Upvotes: 1

Martijn Pieters

Reputation: 1121266

I'd parse that with a generator, yielding sections as you parse the file. ast.literal_eval() takes care of interpreting the value as a Python literal:

import ast

def load_sections(filename):
    with open(filename, 'r') as infile:
        for line in infile:
            if not line.startswith('header'):
                continue  # skip to the next line until we find a header

            sectionname = line.split(None, 1)[-1].strip()
            section = {}
            for line in infile:
                if line.startswith('header'):
                    break  # end of section
                line = line.strip()               
                key, value = line.split(' = ', 1)
                section[key] = ast.literal_eval(value)

            yield sectionname, section

Loop over the above function to receive (name, section_dict) tuples:

for name, section in load_sections(somefilename):
    print name, section

For your sample input data, that results in:

>>> for name, section in load_sections('/tmp/example'):
...     print name, section
... 
one {'last_name': 'this is my last name', 'name': 'this is my name', 'addr_no': 35, 'addr': 'somewhere'}
two {'first_var': 0.001002, 'second_var': -2.002e-08}
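
To get the attribute-style access the question asks for (one.name), the parsed dictionaries could be wrapped in types.SimpleNamespace. This is only a minimal sketch, assuming Python 3.3 or later. The names parse_sections, SAMPLE and config are hypothetical. parse_sections follows the same idea as the generator above but reads from any iterable of lines, so the example runs without a file:

```python
import ast
from types import SimpleNamespace

# Sample data copied from the question (assumed to be representative).
SAMPLE = '''\
header one
   name = "this is my name"
   last_name = "this is my last name"
   addr = "somewhere"
   addr_no = 35
header
header two
   first_var = 1.002E-3
   second_var = -2.002E-8
header
'''

def parse_sections(lines):
    """Yield (name, dict) pairs; hypothetical helper, not part of any library."""
    lines = iter(lines)  # share one iterator between both loops
    for line in lines:
        parts = line.split()
        # A section opens with "header <name>"; anything else is skipped.
        if len(parts) != 2 or parts[0] != 'header':
            continue
        name, section = parts[1], {}
        for line in lines:
            line = line.strip()
            if line == 'header':
                break  # a bare "header" closes the section
            if not line:
                continue  # tolerate blank lines inside a section
            key, value = line.split('=', 1)
            section[key.strip()] = ast.literal_eval(value.strip())
        yield name, section

# Wrap each section, and the collection of sections, in a namespace.
config = SimpleNamespace(**{
    name: SimpleNamespace(**section)
    for name, section in parse_sections(SAMPLE.splitlines())
})

print(config.one.name)       # this is my name
print(config.two.first_var)  # 0.001002
```

This only works when section names and keys are valid Python identifiers; for anything else, plain dictionary access is the safer choice.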

Upvotes: 4

crouleau

Reputation: 156

Martijn Pieters' answer is correct for your preformatted file, but if you can change the file's format in the first place, you will avoid a lot of potential bugs. If I were you, I would look into getting the file written as JSON (or XML), because then Python's json (or XML) libraries can do the work for you: http://docs.python.org/2/library/json.html. Unless you're working with really bad legacy code or a system that you don't have access to, you should be able to go into the code that produces the file and make it emit a better format.
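
As a rough illustration of what that could look like, here is a sketch that assumes the producing code can be changed to emit JSON. The JSON layout and the name TEXT are hypothetical, one possible translation of the sample data from the question:

```python
import json

# Hypothetical JSON version of the sample data from the question.
TEXT = '''
{
  "one": {
    "name": "this is my name",
    "last_name": "this is my last name",
    "addr": "somewhere",
    "addr_no": 35
  },
  "two": {
    "first_var": 1.002E-3,
    "second_var": -2.002E-8
  }
}
'''

# json.loads turns the text into nested dicts with typed values.
sections = json.loads(TEXT)

print(sections["one"]["name"])
print(sections["two"]["first_var"])
```

For a file on disk, json.load(open_file_object) does the same job, and strings, integers and floats come back with the right Python types without any hand-written parsing.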

Upvotes: 2
