thalm

Reputation: 2930

How to parse a string with HTML tags that mark substrings as bold, italic, or underlined

I created a text rendering tool for a 2D graphics framework in C#.

Now I am trying to parse text that contains specific HTML tags, like:

"Hello <b>world</b>!" 

But the parsing code was getting ugly, and I thought there must be some library that does exactly that. In the end, it should output an array of data structures like:

string text;
bool IsBold;
bool IsItalic;
bool IsUnderlined;
...

or

string text;
FontStyle FontStyle;

Does anyone know of such a parser?

Thanks a lot!

Upvotes: 3

Views: 1093

Answers (3)

Richard J. Ross III

Reputation: 55583

I am not sure how well these would fit your case, but here are some HTML parsers:
html_parse
HTML Agility Pack

Upvotes: 0

skyfoot

Reputation: 20769

Tidy.net is a fantastic tool. It is a port of the original Tidy project, which is used in the HTML Tidy Firefox plugin. Run your markup through Tidy and it will return clean, compliant HTML.

Upvotes: 0

Oded

Reputation: 499212

The HTML Agility Pack is a good HTML parser (and also parses fragments).

You can query it using XPath syntax (its API is similar to XmlDocument), though I am not sure how good a fit it will be for your requirements.
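To make this more concrete, here is a minimal sketch of how the HTML Agility Pack could be used to flatten a fragment into styled text runs. It assumes the HtmlAgilityPack NuGet package is referenced. The `TextRun` type and the `StyledTextParser` class are hypothetical names modeled on the fields listed in the question, not part of the library. The sketch walks the node tree rather than using XPath, because the style flags have to be inherited from parent tags.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

// Hypothetical output type, modeled on the fields listed in the question.
public sealed class TextRun
{
    public string Text { get; set; } = "";
    public bool IsBold { get; set; }
    public bool IsItalic { get; set; }
    public bool IsUnderlined { get; set; }
}

// Hypothetical helper class; only the HtmlAgilityPack calls are library API.
public static class StyledTextParser
{
    public static List<TextRun> Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // accepts fragments, not only full documents

        var runs = new List<TextRun>();
        Walk(doc.DocumentNode, false, false, false, runs);
        return runs;
    }

    // Depth-first walk that carries the style flags of the enclosing tags
    // down to each text node.
    private static void Walk(HtmlNode node, bool bold, bool italic,
                             bool underlined, List<TextRun> runs)
    {
        foreach (HtmlNode child in node.ChildNodes)
        {
            if (child.NodeType == HtmlNodeType.Text)
            {
                // Decode entities such as &amp; into plain characters.
                string text = HtmlEntity.DeEntitize(child.InnerText);
                if (text.Length > 0)
                {
                    runs.Add(new TextRun
                    {
                        Text = text,
                        IsBold = bold,
                        IsItalic = italic,
                        IsUnderlined = underlined
                    });
                }
            }
            else if (child.NodeType == HtmlNodeType.Element)
            {
                Walk(child,
                     bold || IsTag(child, "b") || IsTag(child, "strong"),
                     italic || IsTag(child, "i") || IsTag(child, "em"),
                     underlined || IsTag(child, "u"),
                     runs);
            }
        }
    }

    private static bool IsTag(HtmlNode node, string name)
    {
        return string.Equals(node.Name, name, StringComparison.OrdinalIgnoreCase);
    }
}
```

Calling `StyledTextParser.Parse("Hello <b>world</b>!")` should give three runs: "Hello " (plain), "world" (bold), and "!" (plain). Nested tags such as `<b><i>text</i></b>` set both flags on the inner run. Tags other than the five checked here are ignored but their text is still kept, which may or may not be what a renderer wants.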

Upvotes: 3
