Reputation: 53
I have a string like this:
----------
FT Weekend
----------
Why do we run marathons?
Are marathons and cycling races about more than exercise? What does the
literature of endurance tell us about our thirst for self-imposed hardship?
I want to delete the part from ---------- to the next ---------- (inclusive).
I have been using re.sub:
pattern = r"-+\n.+\n-+"
re.sub(pattern, '', thestring)
Upvotes: 2
Views: 215
Reputation: 626927
The problem with your regex (-+\n.+\n-+) is that . matches any character except a newline, and that .+ is greedy, so once . is allowed to match newlines it can span across multiple ------- delimiters.
You can use the following regex:
pattern = r"(?s)-+\n.+?\n-+"
The (?s) singleline option makes . match any character, including a newline.
The .+? pattern matches 1 or more characters, but as few as possible, so the match stops at the next ---- line.
See IDEONE demo
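To make the greedy versus lazy difference concrete, here is a minimal sketch. The sample text with a second header block is made up for illustration (it is not from the question); the patterns are the ones discussed above.

```python
import re

# Hypothetical sample with TWO delimited headers, to expose the greedy match.
text = (
    "----------\n"
    "FT Weekend\n"
    "----------\n"
    "Why do we run marathons?\n"
    "----------\n"
    "Second header\n"
    "----------\n"
    "More body text.\n"
)

# Greedy .+ with (?s): runs to the LAST dashed line and swallows the body in between.
greedy = re.sub(r"(?s)-+\n.+\n-+", "", text)

# Lazy .+? with (?s): stops at the NEXT dashed line, so each header is removed separately.
lazy = re.sub(r"(?s)-+\n.+?\n-+", "", text)

print(repr(greedy))  # '\nMore body text.\n'
print(repr(lazy))    # '\nWhy do we run marathons?\n\nMore body text.\n'
```

Note that the lazy version leaves the newline that followed each closing dashed line in place.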
For a more thorough cleanup that also removes the whitespace around the header, I'd recommend:
pattern = r"(?s)\s*-+\n.+?\n-+\s*"
See another demo
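A short sketch of what the extra \s* on each side buys you. The blank lines around the header in this sample are an assumption added for illustration:

```python
import re

# Hypothetical input: blank lines before and after the delimited header.
text = "\n\n----------\nFT Weekend\n----------\n\nWhy do we run marathons?\n"

pattern = r"(?s)\s*-+\n.+?\n-+\s*"
cleaned = re.sub(pattern, "", text)

# The surrounding whitespace is consumed together with the header.
print(repr(cleaned))  # 'Why do we run marathons?\n'
```

Because \s* also matches spaces and tabs, this strips any indentation on the first line after the header as well, which may or may not be what you want.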
Upvotes: 0
Reputation: 67968
pattern = r"-+\n.+?\n-+"
re.sub(pattern, '', thestring, flags=re.DOTALL)
Just use the DOTALL flag. The problem with your regex was that, by default, . does not match \n. So you need to add the DOTALL flag explicitly to make it match \n.
See demo.
https://regex101.com/r/hR7tH4/24
Or, if you don't want to add a flag:
pattern = r"-+\n[\s\S]+?\n-+"
re.sub(pattern, '', thestring)
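Here is a minimal sketch comparing the two variants. The two-line header is a made-up sample, chosen because it is the case where . failing to match \n actually matters:

```python
import re

# Hypothetical input whose header spans two lines.
s = "----------\nFT Weekend\nSpecial Edition\n----------\nWhy do we run marathons?\n"

# Without DOTALL, . cannot cross the newline inside the header, so nothing matches.
no_flag = re.sub(r"-+\n.+?\n-+", "", s)

# With DOTALL, . matches newlines too.
with_flag = re.sub(r"-+\n.+?\n-+", "", s, flags=re.DOTALL)

# [\s\S] matches any character regardless of flags.
with_class = re.sub(r"-+\n[\s\S]+?\n-+", "", s)

print(no_flag == s)             # True
print(repr(with_flag))          # '\nWhy do we run marathons?\n'
print(with_flag == with_class)  # True
```

Pass the flag by keyword (flags=re.DOTALL). The fourth positional parameter of re.sub is count, so a flag passed positionally would be interpreted as a replacement limit, not as a flag.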
Upvotes: 4
Reputation: 107297
Your regex doesn't match the expected part because .+ doesn't match newline characters. You can use the re.DOTALL flag (or its alias re.S) to force . to match newlines. Instead of that, you can also use a negated character class:
>>> print(re.sub(r"-+[^-]+-+", '', s))

Why do we run marathons?
Are marathons and cycling races about more than exercise? What does the
literature of endurance tell us about our thirst for self-imposed hardship?
>>>
Or, more precisely, to also remove the non-word characters that follow the closing dashes:
>>> print(re.sub(r"-+[^-]+-+[^\w]+", '', s))
Why do we run marathons?
Are marathons and cycling races about more than exercise? What does the
literature of endurance tell us about our thirst for self-imposed hardship?
>>>
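One caveat with the negated character class, sketched below: [^-]+ stops at the first hyphen, so it only works when the text between the dashed lines contains no hyphen. The "FT-Weekend" header is a made-up example to show this; the lazy DOTALL pattern from the other answers is included for comparison.

```python
import re

# Header from the question, and a hypothetical variant containing a hyphen.
plain = "----------\nFT Weekend\n----------\nWhy do we run marathons?\n"
hyphenated = "----------\nFT-Weekend\n----------\nWhy do we run marathons?\n"

negated = r"-+[^-]+-+"
lazy_dotall = r"-+\n.+?\n-+"

# Works as intended when the header text has no hyphen.
plain_result = re.sub(negated, "", plain)

# With a hyphen in the header, the match ends early at "FT-".
hyphen_result = re.sub(negated, "", hyphenated)

# The newline-anchored lazy pattern still removes the whole header.
lazy_result = re.sub(lazy_dotall, "", hyphenated, flags=re.DOTALL)

print(repr(plain_result))   # '\nWhy do we run marathons?\n'
print(repr(hyphen_result))  # 'Weekend\n----------\nWhy do we run marathons?\n'
print(repr(lazy_result))    # '\nWhy do we run marathons?\n'
```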
Upvotes: 2