user715697

Reputation: 887

Remove empty paragraphs

I'm importing an RSS feed which has a series of empty paragraphs "<p> </p>".

I am using gsub; however, it's not stripping the elements from the document:

document.gsub(/<p>\s*<\/p>/,"") or gsub(/<p> <\/p>/,"")

Is there an alternative method or a mistake in the above?

The following appears to work, though:

gsub(/<p>.<\/p>/,"")

Upvotes: 0

Views: 1658

Answers (2)

Bruno

Reputation: 958

If the paragraph elements in your RSS feed use id and class attributes, try this:

gsub(/<p(\s(class|id)=['"][A-Za-z0-9_\s-]+['"]\s*)*>\s*<\/p>/, "")
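The same idea can be loosened to accept any attributes on the opening tag rather than only class and id. The following is a minimal sketch, not a full HTML parser: the constant `EMPTY_P_WITH_ATTRS` and the helper `strip_empty_paragraphs` are hypothetical names, and the pattern assumes no attribute value contains a `>` character.

```ruby
# Hypothetical helper: removes <p> elements (with or without attributes)
# whose content is empty or only ASCII whitespace.
# Assumption: attribute values never contain ">".
EMPTY_P_WITH_ATTRS = /<p(?:\s[^>]*)?>\s*<\/p>/i

def strip_empty_paragraphs(html)
  # gsub returns a new string; the argument itself is left unchanged.
  html.gsub(EMPTY_P_WITH_ATTRS, "")
end

sample = %q{<p class="intro"> </p>text<p id='x'></p><p>kept</p>}
puts strip_empty_paragraphs(sample)  # prints: text<p>kept</p>
```

Because the optional group requires whitespace directly after `<p`, tags such as `<pre>` are not touched.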

Upvotes: 0

Hck

Reputation: 9167

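Correct the regex as in this example (inside a character class `$` is a literal dollar sign, so `[\s$]*` behaves like `\s*` on this input):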

>> document = "<p>\n\n\n   \n</p>aaa<p>  </p>bbb"                       
=> "<p>\n\n\n   \n</p>aaa<p>  </p>bbb"                                  
>> document.gsub(/<p>[\s$]*<\/p>/, '')                                  
=> "aaabbb"    
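One possible reason the `\s` pattern in the question fails while `.` matches is that the feed's paragraphs hold a non-breaking space (U+00A0) or the literal entity `&nbsp;` rather than an ordinary space. That is an assumption about the feed, not something the question confirms. In Ruby, `\s` covers only ASCII whitespace, whereas the POSIX class `[[:space:]]` also matches U+00A0 in a UTF-8 string. Note too that `gsub` returns a new string, so the result has to be assigned (or `gsub!` used). A minimal sketch, with `EMPTY_P` as a hypothetical constant name:

```ruby
# Assumption: the "empty" paragraphs contain U+00A0 or the &nbsp; entity.
document = "<p>\u00A0</p>aaa<p>&nbsp;</p>bbb<p> \n</p>ccc"

# \s is ASCII-only, so only the last paragraph is removed here.
ascii_only = document.gsub(/<p>\s*<\/p>/, "")

# [[:space:]] also matches U+00A0; &nbsp; is listed as an explicit alternative.
EMPTY_P = /<p>(?:[[:space:]]|&nbsp;)*<\/p>/
cleaned = document.gsub(EMPTY_P, "")

puts cleaned  # prints: aaabbbccc
```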

Upvotes: 5
