Reputation: 639
I have HTML data in a string, and I need to get only the paragraph values. Below is a sample of the HTML.
<html>
<head>
<title>
<script>
<div>
Some contents
</div>
<div>
<p> This is what i want </p>
<p> Select all data from p </p>
<p> Upto this is required </p>
</div>
<div>
Other html elements
</div>
So how can I get the data from the paragraphs using string manipulation?
Desired output:
<div>
<p> This is what i want </p>
<p> Select all data from p </p>
<p> Upto this is required </p>
</div>
Upvotes: 0
Views: 370
Reputation: 2261
If you use Html Agility Pack, as mentioned in the other posts, you can get all paragraph elements in the HTML by using:
HtmlDocument doc = new HtmlDocument();
// LoadHtml parses markup from a string; Load expects a file path or stream
doc.LoadHtml("your html string");
var pNodes = doc.DocumentNode.SelectNodes("//div[@id='id of the div']/p");
Since you are using .NET Framework 2.0, you will want an older version of Html Agility Pack, which can be found here: HTML Agility Pack
If you want just the text inside the paragraphs, you can use:
var pNodes = doc.DocumentNode.SelectNodes("//div[@id='id of the div']/p/text()");
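To put the pieces above together, here is a minimal sketch that wraps the lookup in a helper. The class name `ParagraphReader` and the method `GetParagraphTexts` are made up for illustration, and the code assumes the Html Agility Pack assembly is referenced and that `divId` contains no single quote. It avoids `var` and LINQ so it should compile with a C# 2.0 compiler.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // third-party library, assumed to be referenced

public static class ParagraphReader
{
    // Hypothetical helper: returns the trimmed text of every <p> that is a
    // direct child of the div with the given id.
    public static List<string> GetParagraphTexts(string html, string divId)
    {
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html); // parse markup held in a string

        List<string> result = new List<string>();

        // Assumes divId contains no single quote, since it is spliced into the XPath.
        HtmlNodeCollection pNodes =
            doc.DocumentNode.SelectNodes("//div[@id='" + divId + "']/p");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (pNodes == null)
        {
            return result;
        }

        foreach (HtmlNode p in pNodes)
        {
            result.Add(p.InnerText.Trim());
        }
        return result;
    }
}
```

If you need the markup rather than the text, as in the desired output of the question, reading `p.OuterHtml` instead of `p.InnerText` should give you each `<p>` element including its tags.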
Upvotes: 0
Reputation: 3136
XPath is the obvious answer (if the HTML is well-formed, has a root element, etc.). Failing that, use a third-party component such as Chilkat.
Upvotes: 0
Reputation: 435
I have used Html Agility Pack for something like this. You can then use LINQ to get what you want.
Upvotes: 0
Reputation: 68606
Give the div an ID, e.g.
<div id="test">
<p> This is what i want </p>
<p> Select all data from p </p>
<p> Upto this is required </p>
</div>
then use //div[@id='test']/p.
The solution broken down:
//div - All div elements
[@id='test'] - With an ID attribute whose value is test
/p - All p elements that are direct children of that div
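If the markup happens to be well-formed XML with a single root element, the same expression can be tried with the built-in System.Xml classes, with no third-party library. This is only a sketch under that assumption; the class `XPathDemo` and the method `SelectParagraphs` are names invented for the example.

```csharp
using System;
using System.Xml;

public static class XPathDemo
{
    // Hypothetical helper: returns the outer markup of every node matched by
    // the XPath expression. The input must be well-formed XML.
    public static string[] SelectParagraphs(string xml, string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml); // throws XmlException if the markup is not well-formed

        XmlNodeList nodes = doc.SelectNodes(xpath);
        string[] result = new string[nodes.Count];
        for (int i = 0; i < nodes.Count; i++)
        {
            result[i] = nodes[i].OuterXml; // the element including its tags
        }
        return result;
    }
}
```

Calling `XPathDemo.SelectParagraphs(xml, "//div[@id='test']/p")` on the sample div wrapped in a root element should return the three `p` elements and skip the other divs. Real-world HTML is often not well-formed, and in that case a tolerant parser such as Html Agility Pack is the safer choice.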
Upvotes: 1