Reputation: 52
I have a string like the one below, and I want to extract the highlighted part from it using a regex, or any other way if possible.
The National Weather Service in Milwaukee/Sullivan has issued a\n\n* Tornado Warning for...\nNorthwestern Columbia County in south central Wisconsin...\nSouthwestern Marquette County in south central Wisconsin...\n\n*
Until 945 PM CDT.\n\n* At 911 PM CDT, a severe thunderstorm capable of producing a tornado\nwas located 8 miles east of Wisconsin Dells, moving northeast at 45\nmph.\n\nHAZARD...Tornado.\n\nSOURCE...Radar indicated rotation.\n\nIMPACT...Flying debris will be dangerous to those caught without\nshelter. Mobile homes will be damaged or destroyed.\nDamage to roofs, windows, and vehicles will occur. Tree\ndamage is likely.\n\n* Locations impacted include...\nPackwaukee, Endeavor and Briggsville.
description = 'The National Weather Service in Milwaukee/Sullivan has issued a\n\n* Tornado Warning for...\nNorthwestern Columbia County in south central Wisconsin...\nSouthwestern Marquette County in south central Wisconsin...\n\n* Until 945 PM CDT.\n\n* At 911 PM CDT, a severe thunderstorm capable of producing a tornado\nwas located 8 miles east of Wisconsin Dells, moving northeast at 45\nmph.\n\nHAZARD...Tornado.\n\nSOURCE...Radar indicated rotation.\n\nIMPACT...Flying debris will be dangerous to those caught without\nshelter. Mobile homes will be damaged or destroyed.\nDamage to roofs, windows, and vehicles will occur. Tree\ndamage is likely.\n\n* Locations impacted include...\nPackwaukee, Endeavor and Briggsville.'
# Now I want to match the substring between (Tornado Warning for... *** ...\n\n*)
# I tried it like this
re.search('Tornado Warning for...(.*)\n\n*', description)
# I am getting a result like this
<re.Match object; span=(67, 90), match='Tornado Warning for...\n'>
# Expected result
<re.Match object; span=(any, any), match='Tornado Warning for...\nNorthwestern Columbia County in south central Wisconsin...\nSouthwestern Marquette County in south central Wisconsin...\n\n*'>
It does not match the full substring; it only matches Tornado Warning for...\n
I want to match
Tornado Warning for...\nNorthwestern Columbia County in south central Wisconsin...\nSouthwestern Marquette County in south central Wisconsin...\n\n*
where the substring starts with Tornado Warning for...
and ends with \n\n*
Thanks for your help, and sorry for my bad English.
Upvotes: 0
Views: 73
Reputation: 1275
By default, the dot (.) does not match a newline (\n). Use [\W\w] instead of the dot.
import re
description = 'The National Weather Service in Milwaukee/Sullivan has issued a\n\n* Tornado Warning for...\nNorthwestern Columbia County in south central Wisconsin...\nSouthwestern Marquette County in south central Wisconsin...\n\n* Until 945 PM CDT.\n\n* At 911 PM CDT, a severe thunderstorm capable of producing a tornado\nwas located 8 miles east of Wisconsin Dells, moving northeast at 45\nmph.\n\nHAZARD...Tornado.\n\nSOURCE...Radar indicated rotation.\n\nIMPACT...Flying debris will be dangerous to those caught without\nshelter. Mobile homes will be damaged or destroyed.\nDamage to roofs, windows, and vehicles will occur. Tree\ndamage is likely.\n\n* Locations impacted include...\nPackwaukee, Endeavor and Briggsville.'
print(re.search(r'Tornado Warning for\.\.\.([\W\w]*?)\n\n\*', description).group())
"""
Tornado Warning for...
Northwestern Columbia County in south central Wisconsin...
Southwestern Marquette County in south central Wisconsin...

*
"""
Upvotes: 0
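As an alternative to the [\W\w] workaround above, here is a minimal sketch that passes the re.DOTALL flag, which makes the dot match newlines too, so a lazy .*? can span several lines. The description used here is a shortened copy of the string from the question, trimmed only to keep the example small, and the names full_match and inner are illustrative.

```python
import re

# Shortened copy of the warning text from the question (trimmed for brevity).
description = (
    'The National Weather Service in Milwaukee/Sullivan has issued a\n\n'
    '* Tornado Warning for...\n'
    'Northwestern Columbia County in south central Wisconsin...\n'
    'Southwestern Marquette County in south central Wisconsin...\n\n'
    '* Until 945 PM CDT.\n\n'
    '* At 911 PM CDT, a severe thunderstorm capable of producing a tornado\n'
)

# With re.DOTALL, "." also matches "\n"; the lazy ".*?" stops at the
# first "\n\n*" that follows the header.
pattern = re.compile(r'Tornado Warning for\.\.\.(.*?)\n\n\*', re.DOTALL)
m = pattern.search(description)

full_match = m.group() if m else None   # header, body, and the "\n\n*" delimiter
inner = m.group(1) if m else None       # only the text between the two markers
print(full_match)
```

Here full_match includes the trailing \n\n* delimiter, while inner holds only the text between the two markers.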
Reputation: 163217
You could match
\bTornado Warning for\.\.\.(?:\n.*)*?\n\n
The pattern matches:
\bTornado Warning for\.\.\.
Match Tornado Warning for... preceded by a word boundary, with the dots escaped to match them literally
(?:\n.*)*?
Match a newline and the rest of the line, repeated as few times as possible
\n\n
Match 2 newlines
For example
import re
description = 'The National Weather Service in Milwaukee/Sullivan has issued a\n\n* Tornado Warning for...\nNorthwestern Columbia County in south central Wisconsin...\nSouthwestern Marquette County in south central Wisconsin...\n\n* Until 945 PM CDT.\n\n* At 911 PM CDT, a severe thunderstorm capable of producing a tornado\nwas located 8 miles east of Wisconsin Dells, moving northeast at 45\nmph.\n\nHAZARD...Tornado.\n\nSOURCE...Radar indicated rotation.\n\nIMPACT...Flying debris will be dangerous to those caught without\nshelter. Mobile homes will be damaged or destroyed.\nDamage to roofs, windows, and vehicles will occur. Tree\ndamage is likely.\n\n* Locations impacted include...\nPackwaukee, Endeavor and Briggsville.'
m = re.search(r'\bTornado Warning for\.\.\.(?:\n.*)*?\n\n', description)
if m:
    print(m.group())
Output
Tornado Warning for...
Northwestern Columbia County in south central Wisconsin...
Southwestern Marquette County in south central Wisconsin...
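If only the affected areas are needed rather than the whole match, the same line-by-line idea can be combined with a capturing group. This is a sketch under the assumption that the block always ends at the first blank line; the name areas and the shortened description are illustrative and not part of the original answer.

```python
import re

# Shortened copy of the warning text from the question (trimmed for brevity).
description = (
    'The National Weather Service in Milwaukee/Sullivan has issued a\n\n'
    '* Tornado Warning for...\n'
    'Northwestern Columbia County in south central Wisconsin...\n'
    'Southwestern Marquette County in south central Wisconsin...\n\n'
    '* Until 945 PM CDT.\n'
)

# "(?:.+\n)*?" consumes whole non-empty lines lazily, so the capture ends
# at the first blank line after the header.
m = re.search(r'\bTornado Warning for\.\.\.\n((?:.+\n)*?)\n', description)

# Split the captured block into one entry per line.
areas = m.group(1).splitlines() if m else []
print(areas)
```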
Upvotes: 1
Reputation: 74
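The regex could look like this:
# The character class allows letters, whitespace, dots, backslashes and asterisks.
# A raw string keeps the backslashes intact for the regex engine.
matched_string = re.findall(r"Tornado[a-zA-Z\s\.\\*]+\n\n\*", description)
print(matched_string)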
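Since the question also allows non-regex approaches, the same span can be sketched with plain string methods. This assumes the start and end markers are exactly the literal strings given in the question; the variable names and the shortened description are illustrative.

```python
# Shortened copy of the warning text from the question (trimmed for brevity).
description = (
    'The National Weather Service in Milwaukee/Sullivan has issued a\n\n'
    '* Tornado Warning for...\n'
    'Northwestern Columbia County in south central Wisconsin...\n'
    'Southwestern Marquette County in south central Wisconsin...\n\n'
    '* Until 945 PM CDT.\n'
)

start_marker = 'Tornado Warning for...'
end_marker = '\n\n*'

# Find the header first, then look for the end marker only after it,
# so the earlier "\n\n*" that precedes the header is ignored.
start = description.find(start_marker)
end = description.find(end_marker, start) if start != -1 else -1

# Slice from the header through the end marker, or None if either is missing.
section = description[start:end + len(end_marker)] if end != -1 else None
print(section)
```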
Upvotes: 1