Umair Ayub

Reputation: 21261

Remove IMG tag from HTML using Regex - Python 2.7

I have an HTML string and I want to remove the IMG tags from it.

I am not good at regex. I have this function, but it does not remove the IMG tags:

def remove_img_tags(data):
    p = re.compile(r'<img.*?/>')
    return p.sub('', data)

What is the proper regex? I don't want to use any library.

Upvotes: 1

Views: 2653

Answers (2)

Saleem

Reputation: 8978

All you need to do is match each img tag and replace it with an empty string.

clean_data = re.sub("(<img.*?>)", "", data, 0, re.IGNORECASE | re.DOTALL | re.MULTILINE)

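Pass the HTML content in data. The regex removes every img tag, including its attributes, and the cleaned result ends up in clean_data. Unlike the pattern in the question, this one does not require the tag to end in />, re.IGNORECASE also matches <IMG>, and re.DOTALL lets a match span a tag that is broken across lines.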
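As a rough sketch, the same substitution can be wrapped in the function from the question. The precompiled IMG_TAG name and the sample html string are only illustrative, and the pattern is still a plain regex: it would also match a tag name that merely starts with "img", and a ">" inside an attribute value would end the match early.

```python
import re

# Illustrative name; the pattern and flags are the ones from the answer above
# (re.MULTILINE is left out because the pattern has no ^ or $ anchors).
IMG_TAG = re.compile(r'<img.*?>', re.IGNORECASE | re.DOTALL)

def remove_img_tags(data):
    """Return data with every <img ...> tag removed."""
    return IMG_TAG.sub('', data)

# Made-up sample input: one upper-case tag, one tag split across two lines.
html = '<p>Hi <IMG src="a.png"> there <img\n src="b.png" /> bye</p>'
print(remove_img_tags(html))
```

The print call with a single argument behaves the same in Python 2.7 and Python 3.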

Upvotes: 4

Amin Etesamian

Reputation: 3699

Try this:

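# search() returns None when no tag is found, so check before calling group()
match = re.compile(r'<img.*?/>').search(data)
if match:
    data = data.replace(match.group(), '')  # replace() returns a new string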

Upvotes: 1
