rick

Reputation: 4131

How do you remove HTML tags using Universal Feed Parser?

The documentation lists the tags that are allowed/removed by default:

http://www.feedparser.org/docs/html-sanitization.html

But it doesn't say anything about how you can specify which additional tags you want removed.

Is there a way to do this using Universal Feed Parser or do you have to do further processing using your own regex and/or something like Beautiful Soup?

Upvotes: 3

Views: 1866

Answers (1)

Jochen Ritzel

Reputation: 107676

I took a quick look over the code and I don't think there is a direct way to override them. But you can overwrite feedparser._HTMLSanitizer.acceptable_elements, the list of tags that won't get removed, before calling feedparser.parse.
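A rough sketch of that idea, not tested against every feedparser release. The helper names `remove_allowed_tags` and `parse_without_tags` are made up for illustration, and `_HTMLSanitizer` is a private class, so its location and the container type of `acceptable_elements` may differ between versions.

```python
def remove_allowed_tags(sanitizer_cls, unwanted):
    """Rebind sanitizer_cls.acceptable_elements without the unwanted tags.

    Returns the previous value so the caller can restore it later.
    (Hypothetical helper, not part of feedparser.)
    """
    original = sanitizer_cls.acceptable_elements
    unwanted = {tag.lower() for tag in unwanted}
    kept = [tag for tag in original if tag not in unwanted]
    # Keep the same container type (list, set, ...) the class already uses.
    sanitizer_cls.acceptable_elements = type(original)(kept)
    return original


def parse_without_tags(source, unwanted):
    """Parse a feed while treating the given tags as not acceptable.

    (Hypothetical wrapper; assumes feedparser is installed.)
    """
    import feedparser  # third-party, imported lazily

    # Assumption: the class lives at feedparser._HTMLSanitizer, as in the
    # answer above. Newer releases may keep it in a different module.
    sanitizer_cls = feedparser._HTMLSanitizer
    original = remove_allowed_tags(sanitizer_cls, unwanted)
    try:
        return feedparser.parse(source)
    finally:
        # Undo the monkeypatch so other callers see the default list again.
        sanitizer_cls.acceptable_elements = original
```

Something like `parse_without_tags(feed_url, ["img", "table"])` would then parse with those two tags no longer whitelisted. Since this patches a class attribute, it affects every parse in the process until it is restored, which is why the wrapper puts the original value back in a `finally` block.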

Upvotes: 6
