Reputation: 3079
From this string:
l="\tsome string in line 1\n\tcmd: DIR @1332243996 (2012.03.20 12:46:36) state op:29 cfg:0\n\tline 3 some other string"
I want to extract "DIR", so I wrote this regex:
j = re.search(r'cmd: \w+', l)
But when I do:
print j.group()
I get:
cmd: DIR
What should I do to get only "DIR", without the "cmd: " prefix? For example:
print j.group()
DIR
Thanks for all answers.
Upvotes: 0
Views: 95
Reputation: 22820
RE-RE-FIXED
Here's your regex: cmd:\s([\w//\\]+)\s@[0-9]+\s
Hint: the value you want is in group 1. The pattern matches cmd: somedir @12312312
as well as cmd: another/dir @123123, provided the number is followed by whitespace.
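To check what this pattern actually captures, here is a minimal sketch in Python 3 (hence print(...) as a function). The pattern is copied verbatim from this answer; the second and third test strings are made up for illustration and are not from the question.

```python
import re

# Pattern copied verbatim from the answer above.
pattern = re.compile(r'cmd:\s([\w//\\]+)\s@[0-9]+\s')

# The string from the question.
l = "\tsome string in line 1\n\tcmd: DIR @1332243996 (2012.03.20 12:46:36) state op:29 cfg:0\n\tline 3 some other string"

# Group 1 holds the part between "cmd: " and " @<digits>".
first = pattern.search(l).group(1)

# Illustrative input with a slash in the name and a trailing space.
second = pattern.search("cmd: another/dir @123123 ").group(1)

# The trailing \s needs a whitespace character after the digits,
# so a string that ends right after the number does not match.
third = pattern.search("cmd: somedir @12312312")

print(first)
print(second)
print(third)
```

Note that the pattern only matches when the command is followed by " @" and a number, which is stricter than what the question asks for.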
Upvotes: -1
Reputation: 49846
You need to place a group (that is, parentheses) around the part that you want to capture:
j = re.search(r'cmd: (\w+)', l)
k = re.search(r'cmd:\s*(\w+)', l)
print j.group(1)
You might prefer to use the k version, which handles a variable amount of whitespace between "cmd:" and what follows.
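The difference between group() and group(1), and between the j and k patterns, can be seen in this short sketch. It is written for Python 3, and the input with several spaces after "cmd:" is an invented example, not taken from the question.

```python
import re

# The string from the question.
l = "\tsome string in line 1\n\tcmd: DIR @1332243996 (2012.03.20 12:46:36) state op:29 cfg:0\n\tline 3 some other string"

j = re.search(r'cmd: (\w+)', l)

# group() or group(0) is the whole match; group(1) is the first parenthesized part.
whole = j.group(0)
captured = j.group(1)

# Invented input with extra spaces: only the \s* version matches it.
spaced_input = "cmd:    DIR @1332243996"
strict = re.search(r'cmd: (\w+)', spaced_input)
k = re.search(r'cmd:\s*(\w+)', spaced_input)
spaced = k.group(1)

print(whole)
print(captured)
print(strict)
print(spaced)
```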
Upvotes: 4
Reputation: 25582
You need to capture the DIR group in your regex:
j = re.search(r'cmd: (\w+)', l)
Then reference it when retrieving:
print j.group(1)
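One thing to watch for: re.search returns None when nothing matches, so calling .group(1) on the result would raise an AttributeError. A minimal Python 3 sketch of a guard follows; the helper name extract_cmd is hypothetical and only for illustration.

```python
import re

def extract_cmd(text):
    """Return the word following 'cmd: ', or None if the text has no such part.

    extract_cmd is a hypothetical helper name, not part of the question.
    """
    m = re.search(r'cmd: (\w+)', text)
    # Only touch the match object when a match was actually found.
    return m.group(1) if m else None

# The string from the question.
l = "\tsome string in line 1\n\tcmd: DIR @1332243996 (2012.03.20 12:46:36) state op:29 cfg:0\n\tline 3 some other string"

print(extract_cmd(l))
print(extract_cmd("no command here"))
```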
Upvotes: 5
Reputation: 92996
Use a positive lookbehind assertion:
j = re.search(r'(?<=cmd: )\w+', l)
A group starting with ?<= is a positive lookbehind assertion. Its content is not part of the match, but it must appear immediately before the pattern you want to match.
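A short Python 3 sketch of the lookbehind in use. It also shows one limitation worth knowing: Python's re module only accepts fixed-width lookbehinds, so a variable-width variant such as (?<=cmd:\s*) is rejected when the pattern is compiled.

```python
import re

# The string from the question.
l = "\tsome string in line 1\n\tcmd: DIR @1332243996 (2012.03.20 12:46:36) state op:29 cfg:0\n\tline 3 some other string"

# "cmd: " must precede the match but is not included in it,
# so group() already returns just the word.
j = re.search(r'(?<=cmd: )\w+', l)
result = j.group()
print(result)

# A lookbehind containing \s* has no fixed width, which re does not allow.
try:
    re.compile(r'(?<=cmd:\s*)\w+')
    variable_width_ok = True
except re.error:
    variable_width_ok = False
print(variable_width_ok)
```

If the amount of whitespace after "cmd:" can vary, a capturing group with \s* is the simpler choice.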
Upvotes: 4