albin antony

Reputation: 179

Extract strings between two strings in Python using a regular expression

"<>THIS is the place to stay at when visiting the historical area of Seattle.

Your right on the water front near the ferry's and great sea food hotel.

The breakfast was great. <>"

Above is my sample text. I want to print the string that falls between <> and <>. I want my output to be free of newline characters (\n), like this:

THIS is the place to stay at when visiting the historical area of Seattle. Your right on the water front near the ferry's and great sea food hotel.The breakfast was great.

I have tried the following piece of code:

import re
pattern = re.compile(r'\<>(.+?)\<>', re.DOTALL | re.MULTILINE)
text = """<>THIS is the place to stay at when visiting the historical area of Seattle.

Your right on the water front near the ferry's and great sea food hotel.

The breakfast was great.
<>"""
results = pattern.findall(text)
print(results)

But I am getting results like this:

["THIS is the place to stay at when visiting the historical area of Seattle.\n\nYour right on the water front near the ferry's and great sea food hotel.\n\nThe breakfast was great.\n"]

But I don't want any newline characters in my resulting string.

Upvotes: 1

Views: 629

Answers (2)

товіаѕ

Reputation: 3264

Just replace the characters you don't want, calling replace on the matched string itself rather than on the list.

e.g.

result_without_newline = results[0].replace('\n', '')
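As a self-contained sketch of this idea, the replacement can be applied directly to the captured group. The shortened sample text and the name `match` are assumptions made for illustration, not taken from the question:

```python
import re

# Shortened stand-in for the question's text (assumed for illustration).
text = "<>First line.\n\nSecond line.\n<>"

# re.DOTALL lets "." match newlines, so the group can span several lines.
match = re.search(r"<>(.+?)<>", text, re.DOTALL)

# Strip newlines from the captured string itself, if a match was found.
result_without_newline = match.group(1).replace("\n", "") if match else ""
print(result_without_newline)
```

Calling str() on the whole list first would probably not work here, because the list's repr shows each newline as a backslash followed by "n", which replace('\n', '') does not match.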

hope this helps :)

Upvotes: 3

Wiktor Stribiżew

Reputation: 627335

Use .replace("\n", "") on each match (in a list comprehension) to replace every newline with an empty string.

See the demo:

results = [x.replace("\n", "") for x in pattern.findall(text)]
# => ["THIS is the place to stay at when visiting the historical area of Seattle.Your right on the water front near the ferry's and great sea food hotel.The breakfast was great."]
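If a space between the joined sentences is preferred (an assumption, since the expected output in the question is not consistent on this point), one possible variation is to collapse each run of whitespace into a single space. The shortened sample text below is illustrative only:

```python
import re

# Shortened stand-in for the question's text (assumed for illustration).
text = "<>First sentence.\n\nSecond sentence.\n<>"
pattern = re.compile(r"<>(.+?)<>", re.DOTALL)

# split() with no arguments splits on any run of whitespace and drops
# leading/trailing whitespace; " ".join() rebuilds the string with
# single spaces between the pieces.
results = [" ".join(x.split()) for x in pattern.findall(text)]
print(results)
```

Compared with .replace("\n", ""), this also removes the trailing newline before the closing <> and normalizes tabs or repeated spaces, which may or may not be wanted.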

Upvotes: 4
