Reputation: 21261
I have some HTML and I want to remove the IMG tags from it.
I am not good at regex. I have this function, but it does not remove the IMG tags:
import re

def remove_img_tags(data):
    p = re.compile(r'<img.*?/>')
    return p.sub('', data)
What is the proper regex? I don't want to use any library.
Upvotes: 1
Views: 2653
Reputation: 8978
All you need is to match each img tag and replace it with an empty string.
clean_data = re.sub(r"<img.*?>", "", data, flags=re.IGNORECASE | re.DOTALL)
Pass the HTML content in data. The regex removes every img tag, attributes included, and the cleaned result ends up in the clean_data variable.
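For illustration, here is a minimal sketch that wraps the same pattern in the function from the question. The name IMG_TAG and the sample HTML are made up for this example. It also hints at why the original pattern fails: `<img.*?/>` requires a closing `/>`, so a tag written as `<img src="a.png">` either does not match at all or matches up to the next `/>` further along in the document.

```python
import re

# IMG_TAG is a hypothetical name; the pattern is the one from this answer.
# IGNORECASE also catches <IMG ...>; DOTALL lets "." match newlines,
# so tags that span several lines are removed too.
IMG_TAG = re.compile(r"<img.*?>", re.IGNORECASE | re.DOTALL)

def remove_img_tags(data):
    """Return data with every <img ...> tag removed."""
    return IMG_TAG.sub("", data)

# Made-up sample: one upper-case tag without "/>", one self-closing
# tag broken across two lines.
html = '<p>Hi <IMG src="a.png"> there <img\n src="b.png" /> bye</p>'
print(remove_img_tags(html))  # prints: <p>Hi  there  bye</p>
```

This is only a sketch, not a full HTML parser: a `>` inside an attribute value (for example `alt="a>b"`) would end the match early and leave part of the tag behind.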
Upvotes: 4
Reputation: 3699
Try this:
# Note: this removes only the first matched tag and exact copies of it.
match = re.search(r'<img.*?/>', data)
if match:
    data = data.replace(match.group(), '')
Upvotes: 1