Reputation: 930
I need to display the first N characters (say, 50 or 100) of an HTML string, and the result must still be well-formed HTML. If I apply a simple substring, I get a malformed HTML string, e.g.:
Sample string: "<html><body><a href="http://foo.com">foo</a></body></html>"
Truncated string: "<html><body><a href="http://foo.com">foo<"
This gives me malformed HTML :(
Any ideas on how to achieve this?
Upvotes: 4
Views: 1867
Reputation: 498952
You can try using the HTML Agility Pack - it will parse out the HTML for you, but you will need to figure out how to produce a truncated version yourself. It should make things a lot easier though.
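To illustrate the "parse it, then truncate it yourself" idea, here is a minimal sketch. It uses Python's standard-library `html.parser` rather than the HTML Agility Pack, so it shows the shape of the approach and not that library's API. The names `truncate_html` and `_Truncator` are made up for this example, and it assumes the limit counts visible text characters, not markup. Script and style content is not treated specially.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class _Truncator(HTMLParser):
    """Hypothetical helper: copies markup through until `limit` text chars are used."""

    def __init__(self, limit):
        super().__init__(convert_charrefs=True)
        self.remaining = limit
        self.out = []          # pieces of the output document
        self.open_tags = []    # stack of tags still waiting for a close
        self.done = limit <= 0

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        self.out.append(self.get_starttag_text())  # original tag text, attributes included
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: emit it, nothing to close later.
        if not self.done:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.done or tag not in self.open_tags:
            return
        # Close everything up to and including the matching open tag.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        if self.done:
            return
        piece = data[: self.remaining]
        self.out.append(escape(piece, quote=False))  # data arrives unescaped
        self.remaining -= len(piece)
        if self.remaining <= 0:
            self.done = True


def truncate_html(html, limit):
    """Keep at most `limit` characters of text and close any tags left open."""
    parser = _Truncator(limit)
    parser.feed(html)
    parser.close()
    closers = "".join(f"</{t}>" for t in reversed(parser.open_tags))
    return "".join(parser.out) + closers


sample = '<html><body><a href="http://foo.com">foo</a></body></html>'
print(truncate_html(sample, 2))
```

Because the open tags are tracked on a stack, the output always ends with the matching closers, whatever the cut point.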
Upvotes: 3
Reputation: 17132
Parse the HTML into a DOM tree. Start with the deepest/innermost elements and remove one, then check whether the serialized result now fits your desired length.
Rinse, lather, repeat.
This may truncate your string to the empty string, if your desired length is small enough.
For extra kicks, you could try removing attributes of the nodes as you go.
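One possible reading of this pruning idea, as a rough sketch: serialize the tree, and while it is too long, drop the deepest element (the last one, when several are equally deep). It assumes the input is well-formed enough for `xml.etree.ElementTree` to parse, which real-world HTML often is not, and the names `prune_to_length` and `_deepest` are invented for the example. Here the limit counts the full serialized length, markup included.

```python
import xml.etree.ElementTree as ET


def _deepest(node, parent=None, depth=0):
    """Return (depth, parent, node) for the last deepest element under `node`."""
    best = (depth, parent, node)
    for child in node:
        candidate = _deepest(child, node, depth + 1)
        if candidate[0] >= best[0]:  # ">=" prefers later siblings on ties
            best = candidate
    return best


def prune_to_length(markup, limit):
    """Remove innermost elements until the serialized markup fits in `limit` chars."""
    root = ET.fromstring(markup)  # assumption: input parses as XML
    while True:
        text = ET.tostring(root, encoding="unicode", method="html")
        if len(text) <= limit:
            return text
        _, parent, node = _deepest(root)
        if parent is None:
            # Only the root is left and it still does not fit.
            return ""
        parent.remove(node)  # also drops the node's text and tail text


sample = '<html><body><a href="http://foo.com">foo</a></body></html>'
print(prune_to_length(sample, 30))
```

With a small enough limit the loop runs out of elements to remove and returns the empty string, as noted above.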
Upvotes: 1
Reputation: 7395
I've seen some forum systems simply append a </b></u></i></s> after every single post. You could approach this in a similar fashion.
Of course, it's ugly, and it wouldn't fix that trailing <.
That is by far the simplest method. A better method would be to actually generate a tree and... kick nodes off until you meet the requirement.
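A slightly less blunt variant of the "append closing tags" trick, as a sketch: cut the raw string, throw away a dangling partial tag, then append closers only for the tags that are actually still open. The function name `cut_and_close` and the regex are illustrative assumptions. The regex does not handle `>` inside attribute values, comments, or script content, and the result can end up longer than the limit once the closers are added.

```python
import re

# Matches simple tags: optional leading "/", tag name, attributes, optional trailing "/".
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^<>]*?(/?)>")

VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def cut_and_close(html, limit):
    """Cut `html` at `limit` raw characters and append the missing closing tags."""
    cut = html[:limit]

    # Drop a partial tag at the end, such as the trailing "<" in the question.
    last_open = cut.rfind("<")
    if last_open > cut.rfind(">"):
        cut = cut[:last_open]

    stack = []  # names of tags opened but not yet closed
    for closing, name, self_closing in TAG_RE.findall(cut):
        name = name.lower()
        if self_closing or name in VOID_TAGS:
            continue
        if closing:
            if name in stack:
                # Pop back to the matching open tag.
                while stack.pop() != name:
                    pass
        else:
            stack.append(name)

    return cut + "".join(f"</{t}>" for t in reversed(stack))


sample = '<html><body><a href="http://foo.com">foo</a></body></html>'
print(cut_and_close(sample, 41))
```

Cutting the sample at 41 characters reproduces the broken string from the question before the repair step runs.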
Upvotes: 0