Reputation: 887
I'm importing an RSS feed that contains a series of empty paragraphs: "<p> </p>".
I am using gsub, but it is not stripping these elements from the document:
document.gsub(/<p>\s*<\/p>/,"")
or gsub(/<p> <\/p>/,"")
Is there an alternative method or a mistake in the above?
The following does appear to work:
gsub(/<p>.<\/p>/,"")
Upvotes: 0
Views: 1658
Reputation: 958
If the paragraph elements in your RSS feed use id and class attributes, try this:
gsub(/\<p(\s((class)|(id))=[\'\"][A-Za-z0-9_\-\s]+[\'\"]\s*)*\>\s*\<\/p\>/,"")
Upvotes: 0
Reputation: 9167
Here is a regex that works, shown with an example:
>> document = "<p>\n\n\n \n</p>aaa<p> </p>bbb"
=> "<p>\n\n\n \n</p>aaa<p> </p>bbb"
>> document.gsub(/<p>\s*<\/p>/, '')
=> "aaabbb"
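If the same pattern still leaves paragraphs behind on the real feed, one possible cause (an assumption, since the feed itself is not shown) is that the "empty" paragraphs hold a non-breaking space, U+00A0, rather than a plain space. Ruby's \s covers only ASCII whitespace, while the POSIX bracket [[:space:]] is Unicode-aware, which would also explain why a single . matched. A minimal sketch with a made-up sample string:

```ruby
# Hypothetical sample: the first paragraph holds U+00A0 (non-breaking space),
# the others hold a plain space and a newline.
document = "<p>\u00A0</p>aaa<p> </p>bbb<p>\n</p>"

# \s matches only ASCII whitespace, so the U+00A0 paragraph is left in place.
ascii_only = document.gsub(/<p>\s*<\/p>/, "")

# [[:space:]] is Unicode-aware in Ruby and matches U+00A0 as well.
unicode_aware = document.gsub(/<p>[[:space:]]*<\/p>/, "")

p ascii_only
p unicode_aware
```

Unlike /<p>.<\/p>/, the [[:space:]] version removes only paragraphs that contain nothing but whitespace, so a paragraph holding a single visible character is kept.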
Upvotes: 5