Reputation: 4921
I'm trying to create a regular expression that extracts part of the following text:
amd64 build of software 1:0.98.10-0.2svn20090909 in archive
What I want to extract is:
software 1:0.98.10-0.2svn20090909
How can I do this? This is what I have so far:
import re

p = re.compile(r'([a-zA-Z0-9\-\+\.]+)\ ([0-9\:\.\-]+)')
iterator = p.finditer("amd64 build of software 1:0.98.10-0.2svn20090909 in archive")
for match in iterator:
    print(match.group())
with result:
software 1:0.98.10-0.2
(svn20090909 is missing)
Thanks a lot.
Upvotes: 0
Views: 223
Reputation: 42082
If your lines are consistent, that is, if each entry is on one line and the word you want always comes right before the version part (the 1:0.98 ... part), you don't need a regexp. Try this:
>>> s = 'amd64 build of software 1:0.98.10-0.2svn20090909 in archive'
>>> match = [s.split()[3], s.split()[4]]
>>> print(match)
['software', '1:0.98.10-0.2svn20090909']
>>> # alternatively
>>> match = s.split()[3:5] # for same result
What this does: it first splits the line s at whitespace (using the string method split()) and selects the fourth and fifth elements of the resulting list; both are stored in the variable match.
Again, this only works if you have one entry per line and the 'software' part always comes immediately before the 1:0.98.10-0.2svn20090909 part, in the same position on every line.
I often avoid regexps when I can make do with splitting strings into lists. If the parsing becomes a nightmare, I use pyparsing.
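If you want the split approach to fail loudly on lines that don't fit, a small guard helps. This is only a sketch: the function name name_and_version is made up for illustration, and the check assumes every interesting line looks like "<arch> build of <name> <version> ...", which I can only guess from the single example.

```python
def name_and_version(line):
    """Return [name, version] for a line shaped like the example, else None."""
    words = line.split()
    # Assumed layout (from the one sample line): <arch> build of <name> <version> ...
    if len(words) < 5 or words[1:3] != ["build", "of"]:
        return None
    return words[3:5]


good = name_and_version("amd64 build of software 1:0.98.10-0.2svn20090909 in archive")
bad = name_and_version("unrelated line")
print(good)  # ['software', '1:0.98.10-0.2svn20090909']
print(bad)   # None
```

Returning None instead of raising an IndexError makes it easy to skip lines that don't match while looping over a file.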
Upvotes: 3
Reputation: 281495
This will work:
import re

p = re.compile(r'([a-zA-Z0-9\-\+\.]+)\ ([0-9][0-9a-zA-Z\:\.\-]+)')
iterator = p.finditer("amd64 build of dvdrip software 1:0.98.10-0.2svn20090909 in archive")
for match in iterator:
    print(match.group())
# Prints: software 1:0.98.10-0.2svn20090909
That works by allowing the second captured group to contain letters while still insisting that it starts with a digit.
Without seeing all the other strings it needs to match, I can't be sure whether that's good enough.
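To see where that assumption holds and where it breaks, here is a rough sketch that wraps an equivalent pattern in a helper and tries it on two lines. The helper name extract and the group names are my own choices, and the second sample line is hypothetical, not something from your data.

```python
import re

# Equivalent character classes to the pattern above, with named groups added.
# The group names "name" and "version" are illustrative, not required.
PATTERN = re.compile(r'(?P<name>[a-zA-Z0-9+.-]+) (?P<version>[0-9][0-9a-zA-Z:.-]+)')


def extract(line):
    """Return (name, version) for the first match in line, or None."""
    m = PATTERN.search(line)
    return (m.group('name'), m.group('version')) if m else None


# The line from the answer: the version starts with a digit, so it matches.
hit = extract("amd64 build of dvdrip software 1:0.98.10-0.2svn20090909 in archive")
# Hypothetical line whose version starts with a letter: no match at all.
miss = extract("amd64 build of software svn20090909 in archive")

print(hit)   # ('software', '1:0.98.10-0.2svn20090909')
print(miss)  # None
```

If your real data contains versions that begin with a letter, the leading [0-9] would need to be relaxed, at the cost of matching ordinary word pairs such as "amd64 build".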
Upvotes: 3
Reputation: 5861
Don't use a capturing group if you want everything in one piece.
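A quick sketch of the difference, assuming a pattern whose second part must start with a digit. Note that group() with no arguments returns the whole match either way; the capturing groups only add access to the individual pieces.

```python
import re

line = "amd64 build of software 1:0.98.10-0.2svn20090909 in archive"

# No capturing groups: the whole match is the only thing available.
plain = re.search(r'[a-zA-Z0-9+.-]+ [0-9][0-9a-zA-Z:.-]+', line)
whole = plain.group()  # same as plain.group(0)

# With capturing groups, group() still returns the whole match,
# while groups() returns the pieces as a tuple.
grouped = re.search(r'([a-zA-Z0-9+.-]+) ([0-9][0-9a-zA-Z:.-]+)', line)
pieces = grouped.groups()

print(whole)            # software 1:0.98.10-0.2svn20090909
print(grouped.group())  # software 1:0.98.10-0.2svn20090909
print(pieces)           # ('software', '1:0.98.10-0.2svn20090909')
```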
Upvotes: 0