Reputation: 1373
I'm just getting to grips with how regex works in Python, but some of the syntax is throwing me a bit.
How would you translate the following regex into one that can be used by the re module in Python?
a(b|c)*a
It doesn't matter what the symbols are; I am asking more about how the brackets and operators work.
To be specific about my situation, I am trying to capture all text between two angle brackets. According to some resources I have read, the "." character matches any character except newline, and "s" matches any whitespace, including newline, so I thought the way to do it would be:
<[.|s]*>
But evidently I was wrong.
I am interested in a solution to my specific problem, but any general information on the operators in Python regex would also be appreciated.
EDIT:
After more experimenting, it seems to work when I use:
<.*>
when I have text like
<foo bar>
but not when I have
<foo
bar>
However, when I try
<[\n.]*>
nothing works. So I thought it might be the brackets causing it, and I tried:
<[.]*>
and that didn't even work the way <.*> does,
but surely the two are the same except for the brackets.
Does anyone have any ideas? I'd like to be able to capture all text like:
<foo
bar>
Upvotes: 0
Views: 691
Reputation: 66
The Python regular expression syntax is documented here:
https://docs.python.org/2/library/re.html
For your particular case, I'd try something like:
import re
# [^>]* matches any run of characters other than '>', newlines included
pat = re.compile(r'<([^>]*)>')
match = pat.search('Foo <bar> bam')
print(match.groups())
# should print ('bar',)
To understand the regular expression, we can break it down into its component parts:
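One way to spell out those parts, as a rough sketch rather than the only reading, is to write the same pattern with the re.VERBOSE flag so that each piece can carry its own comment. The variable names groups and multiline are just for illustration.

```python
import re

# The same pattern as above, written with re.VERBOSE so that whitespace
# in the pattern is ignored and each part can be commented.
pat = re.compile(r"""
    <         # a literal opening angle bracket
    (         # start of capture group 1
      [^>]*   # zero or more characters that are not a closing bracket
    )         # end of capture group 1
    >         # a literal closing angle bracket
""", re.VERBOSE)

groups = pat.search('Foo <bar> bam').groups()

# A negated class such as [^>] also matches newlines, so text that is
# split across lines is captured as well.
multiline = pat.search('<foo\nbar>').group(1)

print(groups)
print(repr(multiline))
```

Because the negated class stops at the first closing bracket, this form needs neither a flag nor a non-greedy quantifier.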
Upvotes: 3
Reputation: 91
a(b|c)*a is directly usable as a Python regular expression. <[.|s]*> is a confused mess. [...] is a character class: | has no business inside it, and a . inside it is just a literal dot. A bare s does not denote whitespace in Python regular expressions; \s does. Maybe you are confusing |s with \s here (but it would make more sense to just use \n here and/or use the respective flag, re.DOTALL, to have . also match a newline).
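The points above can be checked with a short sketch. It assumes the two-line sample text from the question; the variable names are only illustrative.

```python
import re

text = '<foo\nbar>'

# Inside [...] a dot is a literal '.', so [.]* only matches runs of dots.
literal_dot = re.search(r'<[.]*>', text)      # no match on the sample text
dots_only = re.search(r'<[.]*>', '<...>')     # matches, since only dots are inside

# By default '.' does not match a newline, so this fails on the sample text.
no_flag = re.search(r'<.*>', text)

# With re.DOTALL, '.' matches newlines too. The non-greedy *? stops at
# the first '>' instead of running on to the last one in the input.
with_flag = re.search(r'<(.*?)>', text, re.DOTALL)

# \s (with a backslash) is the whitespace class; [\s\S] matches any character.
any_char = re.search(r'<([\s\S]*?)>', text)

print(literal_dot, no_flag)
print(repr(with_flag.group(1)), repr(any_char.group(1)))
```

The same reasoning explains why <[\n.]*> fails in the question: that class only contains a newline and a literal dot, so it cannot match the letters in foo and bar.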
Upvotes: 0