Nitin Gautam

Reputation: 9

How to remove HTML tags from a list of texts in Python

I am working through a web series on Python and I'm really new to it. I was able to extract all the posts of a blog as a list of strings. These strings contain HTML tags that I wish to remove.

I followed this answer

Strip HTML from strings in Python

but I am getting an error

<ipython-input-42-d28731ec9a50> in strip_tags(html)
     14 def strip_tags(html):
     15     s = MLStripper()
---> 16     s.feed(html)
     17     return s.get_data()

C:\ProgramData\Anaconda3\lib\html\parser.py in feed(self, data)
    108         as you want (may include '\n').
    109         """
--> 110         self.rawdata = self.rawdata + data
    111         self.goahead(0)
    112 

TypeError: must be str, not list

HELP!!

Thanks :P

Upvotes: 0

Views: 153

Answers (1)

Ying Joy

Reputation: 21

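You can try a regex. The error says the function received the whole list, so apply the pattern to each string in the list rather than to the list itself:

<(.*?)> and </(.*?)>

The first pattern already matches closing tags as well, so it is enough on its own.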
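
A minimal sketch of how that could look, assuming the posts are plain Python strings held in a list. The names `TAG_RE`, `strip_tags_regex`, `posts`, and `clean_posts` are made up for illustration, and the sample data stands in for the real blog posts:

```python
import re

# Non-greedy pattern: matches each tag-like chunk such as <p>, </b>, <a href="...">.
# Note: "." does not match newlines by default, so a tag split across lines is
# left alone unless re.DOTALL is added.
TAG_RE = re.compile(r"<.*?>")


def strip_tags_regex(text):
    """Remove tag-like substrings from a single string."""
    return TAG_RE.sub("", text)


# Hypothetical sample data standing in for the extracted blog posts.
posts = ["<p>Hello <b>world</b></p>", "<div>Second post</div>"]

# Call the function once per string; passing the list itself would fail.
clean_posts = [strip_tags_regex(post) for post in posts]
print(clean_posts)  # ['Hello world', 'Second post']
```

The same per-string loop should also fix the `TypeError` with the `strip_tags` function from the linked answer, since `feed()` concatenates its argument to a `str` and was handed a list. A regex like this is only a rough filter: it does not decode entities such as `&amp;` and can be fooled by a `>` inside an attribute value, so a real HTML parser is likely the safer choice for messy input.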

Upvotes: 2
