Reputation: 10348

How to remove a DIV containing other DIVs using an HTML/DOM parser or XPath

I have a string that contains a DIV tag I need to remove.
I can recognize the DIV to remove by its attributes (the specific style in this case), which are unique. This DIV contains a lot of HTML, including other DIVs.

<div style="padding-top: 10px; clear: both; width: 100%;">
    { a lot of other divs here }
</div>

How can I remove it from the string?

EDIT: (Any useful technique is welcome)

EDIT 2: I know about the drawbacks of using regular expressions. A regex-based solution is welcome too, because it is a one-step parsing process, the text is very small, and the HTML is well-constructed (indeed, it is XHTML).

EDIT 3: If possible, please show an example using an HTML/DOM parser, XPath, or whatever else. The problem here is not selecting data but removing it. Can this be done with an HTML/DOM parser or XPath?

Upvotes: 0

Answers (3)

ttback

Reputation: 2111

XPath is the easiest option, and it works with jQuery. See the reference: http://saxon.sourceforge.net/saxon6.5/expressions.html

Since it's based on a location path, you can specify how deep you want to go, much like you do with file paths.

You can try something like //{Tag above div}/div

This is different from //div: // matches from anywhere in the document, so //div alone selects every div in the doc. The tag you put right after // therefore has to be unique. You can even start from //html and walk down the DOM tree with / one level at a time, like writing out an address. There shouldn't be that many levels between html and your first div.
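Since the question asks about removing rather than selecting, here is a minimal sketch of that step in Python, using the standard library's xml.etree.ElementTree. It assumes the input really is well-formed XHTML with no XML namespace and no HTML-only entities such as &nbsp;. The sample markup, the ids in it, and the function name remove_div_by_style are made up for illustration. Instead of a positional path, it selects the div by its style attribute with an XPath predicate, then detaches it from its parent.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input: the wrapper, ids, and inner content are invented.
XHTML = """<div id="page">
  <div id="header">keep me</div>
  <div style="padding-top: 10px; clear: both; width: 100%;">
    <div>nested <div>deeper</div></div>
    <div>another</div>
  </div>
  <div id="footer">keep me too</div>
</div>"""

TARGET_STYLE = "padding-top: 10px; clear: both; width: 100%;"


def remove_div_by_style(markup: str, style: str) -> str:
    """Return markup without any <div> whose style attribute equals `style`.

    Assumes `style` contains no single quote and that the target div is
    not the root element. Text that directly follows a removed div (its
    "tail") is dropped together with it.
    """
    root = ET.fromstring(markup)
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    # Collect the matches first so the tree is not mutated while searching.
    matches = list(root.iterfind(f".//div[@style='{style}']"))
    for div in matches:
        # Removing the element also removes everything nested inside it.
        parent_of[div].remove(div)
    return ET.tostring(root, encoding="unicode")


print(remove_div_by_style(XHTML, TARGET_STYLE))
```

The predicate compares the whole attribute string, so it only matches if the style text is identical character for character, including spaces. If the real document declares the XHTML namespace, the tag in the path would need that namespace as well.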

Upvotes: 0