Rob

Reputation: 8101

Extracting data from a text file to use in a Python script?

Basically, I have a file like this:

Url/Host:   www.example.com
Login:     user
Password:   password
Data_I_Dont_Need:    something_else

How can I use RegEx to separate the details to place them into variables?

Sorry if this is a terrible question; I can just never grasp RegEx. As a follow-up, could you provide the RegEx and also explain what each part of it is for?

Upvotes: 0

Views: 1440

Answers (5)

Alex Martelli

Reputation: 881507

You should put the entries in a dictionary, not in so many separate variables -- clearly, the keys you're using need NOT be acceptable as variable names (that slash in 'Url/Host' would be a killer!-), but they'll be just fine as string keys into a dictionary.

import re

there = re.compile(r'''(?x)      # verbose flag: allows comments & whitespace
                       ^         # anchor to the start
                       ([^:]+)   # group with 1+ non-colons, the key
                       :\s*      # colon, then arbitrary whitespace
                       (.*)      # group everything that follows
                       $         # anchor to the end
                    ''')

and then

configdict = {}
with open('thefile.txt') as f:
    for aline in f:
        mo = there.match(aline)
        if not mo:
            print("Skipping invalid line %r" % aline)
            continue
        k, v = mo.groups()
        configdict[k] = v

The ability to make RE patterns "verbose" (by starting them with (?x) or passing re.VERBOSE as the second argument to re.compile) is very useful: it lets you clarify your REs with comments and nicely aligned whitespace. I think it's sadly underused ;-).
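As a rough illustration of how such a pattern behaves on the sample lines from the question, here is a minimal sketch. The names `KEY_VALUE`, `parse_lines`, and `sample` are made up for this example, and the lines are held in a list rather than read from a file.

```python
import re

# Same idea as the verbose pattern above; KEY_VALUE is a hypothetical name.
KEY_VALUE = re.compile(r"""
    ^          # start of the line
    ([^:]+)    # group 1: the key, one or more non-colon characters
    :\s*       # the colon, then any whitespace
    (.*)       # group 2: the value, everything up to the end of the line
    $          # end of the line
""", re.VERBOSE)


def parse_lines(lines):
    """Return a dict of key/value pairs, ignoring lines that do not match."""
    result = {}
    for line in lines:
        mo = KEY_VALUE.match(line)
        if mo:
            key, value = mo.groups()
            result[key] = value
    return result


sample = [
    "Url/Host:   www.example.com\n",
    "Login:     user\n",
    "Password:   password\n",
    "Data_I_Dont_Need:    something_else\n",
]
config = parse_lines(sample)

# Pull out only the entries that are needed.
host = config["Url/Host"]
login = config["Login"]
password = config["Password"]
print(host, login, password)
```

Keeping everything in one dictionary and reading out only the wanted keys means the unneeded line costs nothing and does not need special handling.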

Upvotes: 1

jfs

Reputation: 414089

The configparser module (ConfigParser in Python 2) supports ':' as a delimiter.

import configparser
from io import StringIO

class Parser(configparser.RawConfigParser):
    def _read(self, fp, fpname):
        # Prepend a dummy section header, since the file has none.
        data = StringIO("[data]\n" + fp.read())
        return super()._read(data, fpname)

p = Parser()
p.read("file.txt")
print(dict(p.items("data")))

Output:

{'url/host': 'www.example.com', 'login': 'user', 'password': 'password'}

Though a regex or manual parsing might be more appropriate in your case.
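For what it's worth, on Python 3 the same trick can probably be done without overriding the private `_read` method, by prepending a dummy section header and calling `read_string`. This is only a sketch: the section name `data` and the inline `text` sample are placeholders for this example.

```python
import configparser

text = (
    "Url/Host:   www.example.com\n"
    "Login:     user\n"
    "Password:   password\n"
)

parser = configparser.RawConfigParser()
# configparser needs a section header, so prepend a dummy one.
parser.read_string("[data]\n" + text)

settings = dict(parser.items("data"))
print(settings["login"])  # option names are lower-cased by default
```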

Upvotes: 0

snim2

Reputation: 4079

For a file as simple as this you don't really need regular expressions. String functions are probably easier to understand. This code:

def parse(data):
    parsed = {}
    for line in data.split('\n'):
        if not line.strip():
            continue  # Blank line
        # Split at the first colon only, so values may contain colons.
        key, value = line.split(':', 1)
        parsed[key.strip()] = value.strip()
    return parsed

if __name__ == '__main__':
    test = """Url/Host:   www.example.com
    Login:     user
    Password:   password
"""
    print(parse(test))

Will do the job, and results in:

{'Url/Host': 'www.example.com', 'Login': 'user', 'Password': 'password'}
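One thing to watch with string splitting is a value that itself contains a colon, such as a URL with a scheme or port. A possible variation, sketched here with `str.partition` (which splits only at the first colon), also skips lines that have no colon at all. The function name `parse_pairs` and the sample text are invented for this example.

```python
def parse_pairs(data):
    """Split each 'key: value' line at the first colon only."""
    parsed = {}
    for line in data.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or no colon: nothing to store
        parsed[key.strip()] = value.strip()
    return parsed


sample = """Url/Host:   http://www.example.com:8080
Login:     user

not a key/value line
Password:   password
"""
result = parse_pairs(sample)
print(result["Url/Host"])
```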

Upvotes: 1

mikerobi

Reputation: 20878

EDIT: Better Solution

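import re

for line in lines:  # lines: any iterable of text lines, such as an open file
    if m := re.search(r'(.*?):\s*(.*)', line):  # skip lines without a colon
        key, val = m.groups()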
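Since the question asks what each part of the pattern does, here is a small sketch that spells it out. `(.*?)` is non-greedy, so the key stops at the first colon; `:\s*` consumes the colon and any whitespace after it; `(.*)` takes the rest of the line. The sample string and the greedy comparison are additions for illustration only.

```python
import re

line = "Url/Host:   http://www.example.com"

# Non-greedy key: stops at the first colon.
lazy = re.search(r"(.*?):\s*(.*)", line)
print(lazy.groups())

# Greedy key, for comparison: runs on to the last colon.
greedy = re.search(r"(.*):\s*(.*)", line)
print(greedy.groups())
```

With the greedy version the key swallows part of the URL, which is why the `?` after `.*` matters when values can contain colons.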

Upvotes: 0

Htechno

Reputation: 6117

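Well, if you don't know regex, simply change your file to look like this:

[main]
Host = www.example.com
Login = user
Password = password

Then read it with Python's ConfigParser module (it requires a section header, so an arbitrary one such as [main] is added at the top): http://docs.python.org/library/configparser.html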
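A minimal sketch of that approach on Python 3, where the module is named `configparser`. The `[main]` section name is a placeholder (the parser rejects files with no section header), and the text is fed in with `read_string` here only so the example does not depend on a real file.

```python
import configparser

text = """[main]
Host = www.example.com
Login = user
Password = password
"""

parser = configparser.ConfigParser()
parser.read_string(text)

host = parser.get("main", "Host")  # option lookup is case-insensitive
login = parser["main"]["Login"]
password = parser["main"]["Password"]
print(host, login, password)
```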

Upvotes: 0
