JosiP

Reputation: 3079

Simple regex case in Python

Given this string:

l="\tsome string in line 1\n\tcmd: DIR @1332243996 (2012.03.20 12:46:36) state op:29 cfg:0\n\tline 3 some other string"

I want to extract "DIR", so I wrote this regex:

j = re.search(r'cmd: \w+', l)

But when I do:

print j.group()

I get:

cmd: DIR

What should I do to get only "DIR", without the "cmd: " prefix? For example:

print j.group()
DIR

Thanks for all answers.

Upvotes: 0

Views: 95

Answers (4)

Dr.Kameleon

Reputation: 22820

RE-RE-FIXED

Here's your regex: cmd:\s([\w//\\]+)\s@[0-9]+\s

Hint: it matches cmd: somedir @12312312 as well as cmd: another/dir @123123, and the part you want ends up in group 1.
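
A quick way to check that pattern, as a rough sketch: the first input is the string from the question, while `sample` is a made-up line added only for illustration. Note that the pattern ends in \s, so some whitespace has to follow the digits for it to match.

```python
import re

pattern = re.compile(r'cmd:\s([\w//\\]+)\s@[0-9]+\s')

# String from the question
l = "\tsome string in line 1\n\tcmd: DIR @1332243996 (2012.03.20 12:46:36) state op:29 cfg:0\n\tline 3 some other string"

# Hypothetical line with a slash in the command name (note the trailing space)
sample = "cmd: another/dir @123123 "

first = pattern.search(l).group(1)
second = pattern.search(sample).group(1)

print(first)
print(second)
```

Group 1 holds only the name after "cmd:", so the prefix and the @timestamp are left out.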

Upvotes: -1

Marcin

Reputation: 49846

You need to place a group (that is, parentheses) around the part that you want to capture:

j = re.search(r'cmd: (\w+)', l)
k = re.search(r'cmd:\s*(\w+)', l)
print j.group(1)

You might prefer to use the k version, which handles a variable amount of whitespace between "cmd:" and what follows.
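
To see the difference between the two patterns, here is a small sketch. The input strings `spaced` and `tight` are hypothetical variations on the line from the question, not taken from it.

```python
import re

# Hypothetical inputs: several spaces, and no space at all, after "cmd:"
spaced = "cmd:    DIR @1332243996"
tight = "cmd:DIR @1332243996"

j = re.search(r'cmd: (\w+)', spaced)         # expects exactly one space
k = re.search(r'cmd:\s*(\w+)', spaced)       # accepts any amount of whitespace
k_tight = re.search(r'cmd:\s*(\w+)', tight)  # \s* also accepts zero whitespace

print(j)                 # no match, so re.search returns None
print(k.group(1))
print(k_tight.group(1))
```

Since re.search returns None when nothing matches, it is worth checking the result before calling group(1) on real input.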

Upvotes: 4

Jon Gauthier

Reputation: 25582

You need to capture the DIR part in a group in your regex:

j = re.search(r'cmd: (\w+)', l)

Then reference it when retrieving:

print j.group(1)

Upvotes: 5

stema

Reputation: 92996

Make it a positive lookbehind assertion:

j = re.search(r'(?<=cmd: )\w+', l)

See it here on Regexr

A group starting with ?<= is a positive lookbehind assertion. It is not part of the match itself; it only ensures that the given content appears immediately before the pattern you want to match. That way j.group() returns just "DIR".
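
A minimal sketch of the lookbehind in action, using the string from the question. One caveat with Python's re module: a lookbehind must have a fixed width, so a variable-whitespace variant such as (?<=cmd:\s*) is rejected at compile time. The `variable_width_ok` flag below is just an illustrative name.

```python
import re

# String from the question
l = "\tsome string in line 1\n\tcmd: DIR @1332243996 (2012.03.20 12:46:36) state op:29 cfg:0\n\tline 3 some other string"

# The lookbehind checks for "cmd: " but does not include it in the match
j = re.search(r'(?<=cmd: )\w+', l)
print(j.group())

# Python's re only allows fixed-width lookbehinds, so this one fails to compile
try:
    re.compile(r'(?<=cmd:\s*)\w+')
    variable_width_ok = True
except re.error:
    variable_width_ok = False
print(variable_width_ok)
```

If the amount of whitespace after "cmd:" can vary, a capturing group is the simpler choice in Python.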

Upvotes: 4
