Reputation: 93
I have a file that maps IP addresses to names in the format
<<%#$192.168.8.40$#% %#@Name_of_person@#% >>
I read this file and now want to extract the list using Python's regular expressions:
list=re.findall("<<%#$(\S+)$#%\s%#@(\w+\s*\w*)@#%\s>>",ace)
print list
But the result is always an empty list.
Can anyone tell me where the mistake in the regular expression is?
Edit: ace is the variable holding the contents read from the file.
Upvotes: 1
Views: 94
Reputation: 1
Your regex pattern is wrong because the special characters in it are not escaped. You can use r"<\%#\$(\S+)\$#\%\s\%#@(\w+\s*\w*)@#\%\s>>" in place of "<<%#$(\S+)$#%\s%#@(\w+\s*\w*)@#%\s>>" in the findall call.
Good luck!
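As an alternative to escaping by hand, here is a minimal sketch that lets re.escape do it. It assumes the delimiters are fixed literal strings; the variable names (open_ip, close_ip, open_name, close_name) are made up for illustration and are not from the question.

```python
import re

text = '<<%#$192.168.8.40$#% %#@Name_of_person@#% >>'

# re.escape backslash-escapes regex metacharacters such as "$",
# so each delimiter is matched literally.
open_ip, close_ip = re.escape('<<%#$'), re.escape('$#%')
open_name, close_name = re.escape('%#@'), re.escape('@#%')

# Same capture groups as in the question, with escaped delimiters around them.
pattern = (open_ip + r'(\S+)' + close_ip + r'\s'
           + open_name + r'(\w+\s*\w*)' + close_name + r'\s>>')

matches = re.findall(pattern, text)
print(matches)  # [('192.168.8.40', 'Name_of_person')]
```

This keeps the delimiters readable in the source and avoids forgetting a backslash.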
Upvotes: 0
Reputation: 142136
Something like:
import re
text = '<<%#$192.168.8.40$#% %#@Name_of_person@#% >>'
ip, name = [el[1] for el in re.findall(r'%#(.)(.+?)\1#%', text)]
If you can get away with just splitting on '@' and '$', then:
from operator import itemgetter
ip, name = itemgetter(1, 3)(re.split(r'[@\$]', text))
You could also just use built-in string functions:
tmp = text.split('$')
ip, name = tmp[1], tmp[2].split('@')[1]
Upvotes: 0
Reputation: 1570
$ is a special character in regular expressions, meaning "end of line" (or "end of string", depending on the flavour). Your regex has other characters following the $, and as such only matches strings that have those characters after the end, which is impossible.
You will need to escape the $, like so: \$
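A small sketch of the difference, using a shortened pattern that covers only the IP part of the sample line (the variable names unescaped and escaped are just for illustration):

```python
import re

text = '<<%#$192.168.8.40$#% %#@Name_of_person@#% >>'

# Unescaped: "$" is an end-of-string anchor, so the characters
# required after it can never match.
unescaped = re.findall(r"%#$(\S+)$#%", text)

# Escaped: "\$" matches a literal dollar sign.
escaped = re.findall(r"%#\$(\S+)\$#%", text)

print(unescaped)  # []
print(escaped)    # ['192.168.8.40']
```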
I would suggest the following regular expression (formatted as a raw string since you are using Python):
r"<<%#\$([^$]+)\$#%\s%#@([^@]+)@#%\s>>"
That is: <<%#$, then one or more non-$ characters, $#%, a whitespace character, %#@, one or more non-@ characters, @#%, a whitespace character, and >>.
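Here is a short sketch of that pattern applied to file contents with more than one record. The variable name ace comes from the question; the second record is invented for illustration, and building a dict from the result is only one possible way to use it.

```python
import re

# Hypothetical file contents; the second record is made up for illustration.
ace = (
    "<<%#$192.168.8.40$#% %#@Name_of_person@#% >>\n"
    "<<%#$10.0.0.7$#% %#@Another person@#% >>\n"
)

# Inside a character class, "$" is a literal and needs no escaping.
pattern = re.compile(r"<<%#\$([^$]+)\$#%\s%#@([^@]+)@#%\s>>")

# findall returns one (ip, name) tuple per record.
pairs = pattern.findall(ace)
ip_to_name = dict(pairs)

print(pairs)
# [('192.168.8.40', 'Name_of_person'), ('10.0.0.7', 'Another person')]
```

Because [^@]+ accepts anything except @, names containing spaces are captured as well.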
Upvotes: 4