Arktype

Reputation: 105

Alternatives to Regular Expressions for HTML

I've seen over and over on Stack Overflow that regular expressions are NOT a good fit for XHTML. What I haven't seen, however, is an alternative.

Most text editors have a built-in regex search and replace that is super easy to use. Well, except for the fact that it doesn't work well with HTML. Is there some tool or language that is meant for parsing and replacing XHTML? It would be great if you could say: find all paragraph tags with the class "quote" that are within the DIV with the class "monkey", and then add an H2 tag with "Monkey Quote" inside.

Another example I'm struggling to find a solution for is finding all words within paragraph tags and wrapping a SPAN tag around each one (for word-by-word audio highlighting). That kind of stuff.

Is there a tool or language that is meant for this kind of thing?

Upvotes: 1

Views: 426

Answers (2)

Devon_C_Miller

Reputation: 16518

If you have a well-formed document, XSLT and XPath can do what you need.
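As a rough illustration of the XPath idea (not XSLT itself), here is a minimal sketch using Python's standard xml.etree.ElementTree, which supports a limited XPath subset. It assumes the input is well-formed XML with no XHTML namespace declared, that the class attributes are exactly "monkey" and "quote" (no multi-class values), and that "add an H2" means inserting the heading just before each matching paragraph. The function name add_monkey_headings is made up for this example.

```python
import xml.etree.ElementTree as ET


def add_monkey_headings(xhtml: str) -> str:
    """Insert <h2>Monkey Quote</h2> before each p.quote inside div.monkey.

    Assumes well-formed, namespace-free markup and exact class values.
    """
    root = ET.fromstring(xhtml)

    # ElementTree elements have no parent pointer, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    # Limited XPath: any <p class="quote"> that is a descendant of <div class="monkey">.
    for p in root.findall(".//div[@class='monkey']//p[@class='quote']"):
        parent = parent_of[p]
        h2 = ET.Element("h2")
        h2.text = "Monkey Quote"
        # Insert the heading at the paragraph's current position, pushing it down.
        parent.insert(list(parent).index(p), h2)

    return ET.tostring(root, encoding="unicode")


sample = (
    '<body><div class="monkey"><p class="quote">Ook</p><p>plain</p></div>'
    '<p class="quote">outside</p></body>'
)
print(add_monkey_headings(sample))
```

Only the paragraph inside the "monkey" DIV gets a heading; the one outside is left alone. For real-world XHTML with namespaces or multi-valued class attributes, the path expression would need adjusting.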

Upvotes: 3

sethcall

Reputation: 2897

From your last comment, I'm assuming you'd like something usable from the command line.

If so, this is answered pretty well here:

Grep and Sed Equivalent for XML Command Line Processing
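For the word-wrapping case in the question, a small script can fill the same role as a command-line filter. The following is only a sketch, assuming well-formed, namespace-free input; it wraps the words that sit directly in each paragraph's own text and leaves the contents of inline children (such as B or EM) untouched. The names wrap_words and wrap_document are made up for this example.

```python
import re
import xml.etree.ElementTree as ET

WHITESPACE = re.compile(r"(\s+)")


def _spans(text):
    """Split a text run into (leading_whitespace, [<span>word</span>, ...])."""
    lead, spans = "", []
    for piece in WHITESPACE.split(text or ""):
        if not piece:
            continue
        if piece.isspace():
            if spans:
                # Whitespace after a word becomes that span's tail.
                spans[-1].tail = (spans[-1].tail or "") + piece
            else:
                lead += piece
        else:
            span = ET.Element("span")
            span.text = piece
            spans.append(span)
    return lead, spans


def wrap_words(p):
    """Wrap each word in p's direct text (and in child tails) in a <span>."""
    children = list(p)
    lead, spans = _spans(p.text)
    p.text = lead or None
    rebuilt = list(spans)
    for child in children:
        lead, spans = _spans(child.tail)
        child.tail = lead or None
        rebuilt.append(child)
        rebuilt.extend(spans)
    # Replace the old child list with the rebuilt one.
    for child in children:
        p.remove(child)
    p.extend(rebuilt)


def wrap_document(xhtml: str) -> str:
    root = ET.fromstring(xhtml)
    for p in list(root.iter("p")):
        wrap_words(p)
    return ET.tostring(root, encoding="unicode")


print(wrap_document("<div><p>Hello <b>big</b> world</p></div>"))
```

To use it as a filter, wrap_document could be fed the contents of sys.stdin and its result written to sys.stdout.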

Upvotes: 3
