Reputation: 3267
I have this string and a re.findall call:
import re
txt = """
dx d_2,222.22 ,,
dy h..{3,333.33} ,,
dz b#(1,111.11) ,, dx-ay relative 4,444.44 ,,
"""
# Each match is an (axis, value) tuple, one item per capture group.
for n in re.findall(r'([-\w]+){1}\W+([^,{2}]+)\s+,,\W+', txt):
    axis, value = n
    print("a: %s" % axis)
    print("v: %s" % value)
In the second (value) group I am trying to match anything except a double comma, but it seems to exclude only a single ",". I can get it working in this example with a simple (.*?), but for certain reasons it has to be everything except ",,". Thank you.
EDIT: To see what I want to accomplish, just use r'([-\w]+){1}\W+(.*?)\s+,,\W+' instead. It will give you this output:
a: dx
v: d_2,222.22
a: dy
v: h..{3,333.33}
a: dz
v: b#(1,111.11)
a: dx-ay
v: relative 4,444.44
EDIT #2: Please note that an answer which does not include the double-comma exception is not what is needed. Is there a solution? There should be. So the pattern is:
- any whitespace
- a word, possibly containing "-"
- then " "
- and everything up to ",,", except ",," itself.
Upvotes: 2
Views: 987
Reputation: 3267
r'(?<=,,)\s+([-\w]+)\s(.*?)(?:,,)' is the expression that is needed here. It is much simpler than I thought.
(?<=,,) is a positive lookbehind assertion: it matches only at a position immediately preceded by a double comma, since the lookbehind backs up 2 characters and checks whether the contained pattern matches there.
(?:,,) at the end is the non-capturing version of regular parentheses, so everything in between is what gets matched.
\s or \s+ is there only because of this specific type of string.
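A minimal sketch for checking this pattern against the sample string from the question. Because the lookbehind requires ",," before every record, the very first record (dx), which has nothing in front of it, should not be expected to match, and each captured value keeps the space that precedes ",,". The names lookbehind and pairs are only illustrative.

```python
import re

txt = """
dx d_2,222.22 ,,
dy h..{3,333.33} ,,
dz b#(1,111.11) ,, dx-ay relative 4,444.44 ,,
"""

# Pattern from the answer above: a record must be preceded by ",,".
lookbehind = re.compile(r'(?<=,,)\s+([-\w]+)\s(.*?)(?:,,)')
pairs = lookbehind.findall(txt)

for axis, value in pairs:
    print("a: %s" % axis)
    # strip() drops the space captured just before ",,"
    print("v: %s" % value.strip())
```

If the first record is needed as well, the input would have to start with ",,", or the lookbehind would have to be relaxed.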
Upvotes: 1
Reputation: 46861
[^,{2}]
is a character class that matches any character except: ',', '{', '2', '}'
With a "character class", also called "character set", you can tell the regex engine to match only one out of several characters.
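As a quick illustration of that point, here is a small sketch that applies the character class on its own to one of the values from the question. The names excluded_set and pieces are made up for this example.

```python
import re

# Inside [...] the braces and the digit are literal characters, not a
# quantifier, so this class excludes ',', '{', '2' and '}'.
excluded_set = re.compile(r'[^,{2}]+')

pieces = excluded_set.findall('h..{3,333.33}')
print(pieces)  # ['h..', '3', '333.33']
```

The value is cut at every "{", "," and "}", which is why the original pattern cannot capture it as a whole.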
It should be ([^,]{2})+
(            group and capture to \1
  [^,]{2}    any character except: ',' (2 times)
)+           end of \1
Get the matched groups from indices 1 and 2:
([-\w]+)\s+(.*?)\s+,,
Here is an online demo.
Sample code:
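import re
# A plain raw string works in both Python 2 and 3 (the ur'' prefix is Python 2 only).
p = re.compile(r'([-\w]+)\s+(.*?)\s+,,')
test_str = u"..."
# Returns a list of (group 1, group 2) tuples.
re.findall(p, test_str)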
Note: use \s* instead of \s+ if spaces are optional.
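If the value group has to exclude ",," explicitly rather than rely on the lazy (.*?), one option is a negative lookahead in front of each consumed character. This is only a sketch, not something taken from the answers here, and the names no_double_comma and records are assumptions for the example.

```python
import re

txt = """
dx d_2,222.22 ,,
dy h..{3,333.33} ,,
dz b#(1,111.11) ,, dx-ay relative 4,444.44 ,,
"""

# (?:(?!,,).)+? consumes one character at a time, and only at positions
# where ",," does not start, so the value can never contain ",,".
no_double_comma = re.compile(r'([-\w]+)\s+((?:(?!,,).)+?)\s*,,')
records = no_double_comma.findall(txt)

for axis, value in records:
    print("a: %s" % axis)
    print("v: %s" % value)
```

On the sample string this should give the same four axis/value pairs as the (.*?) version, while single commas inside a value such as 2,222.22 are still allowed.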
Upvotes: 3