Reputation: 2617
I'm currently struggling with a text parser that wraps Java reserved words (keywords) in their own HTML tags.
So I want
class HelloWorld
To appear as a string
<span class= "class">class</span> HelloWorld
I managed to get that working. However, class is itself a reserved word and it also appears inside the generated tag, so I want to be able to distinguish, using regex, between
class
and
"class" or class=
Here is my current code.
word = word.replaceAll("\\b"+javaWord+"\\b",addTag(javaWord,javaWord));
I'm really struggling, so I'd appreciate any help.
Upvotes: 2
Views: 360
Reputation: 75222
Instead of
"\\b"+javaWord+"\\b"
try
"(?<![\\w\"])"+javaWord+"(?![\\w\"=])"
But @sgusc makes a good point: this technique can't be extended to deal with keywords in longer string literals, or in comments either.
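For illustration, here is a minimal sketch of how that pattern could be plugged into the replaceAll call from the question. The negative lookbehind rejects a match preceded by a word character or a double quote, and the negative lookahead rejects one followed by a word character, a double quote, or =. The class name KeywordTagger, the method tagKeyword, and the body of addTag are assumptions made up for this example; the question does not show the real addTag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeywordTagger {
    // Hypothetical stand-in for the addTag helper mentioned in the question.
    static String addTag(String cssClass, String text) {
        return "<span class=\"" + cssClass + "\">" + text + "</span>";
    }

    // Wraps javaWord in a span unless it touches a word character or a
    // double quote on either side, or is directly followed by '='.
    static String tagKeyword(String line, String javaWord) {
        Pattern p = Pattern.compile(
                "(?<![\\w\"])" + Pattern.quote(javaWord) + "(?![\\w\"=])");
        // quoteReplacement keeps any '$' or '\' in the tag from being
        // interpreted as a group reference or escape.
        String replacement = Matcher.quoteReplacement(addTag(javaWord, javaWord));
        return p.matcher(line).replaceAll(replacement);
    }

    public static void main(String[] args) {
        System.out.println(tagKeyword("class HelloWorld", "class"));
        System.out.println(tagKeyword("String s = \"class\";", "class"));
        System.out.println(tagKeyword("String s = \"my class\";", "class"));
    }
}
```

The first call tags the keyword and the second leaves the quoted word alone. The third still tags class inside the longer string literal, which is exactly the limitation pointed out above.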
Upvotes: 2
Reputation: 48196
You're better off writing your own state machine that iterates over the input. Each time you see whitespace (or just any non-alphabetic character), you flush the buffer, tagging it or not depending on the word you just passed.
That way, when you pass a " you ignore everything until the next (unescaped) ". The same goes for < and > (or just treat the quoted text as one word with <span class="string"> around it ;)
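A rough sketch of that idea, under a few assumptions: the keyword set is a small illustrative subset, the names KeywordHighlighter, highlight, and flush are invented for this example, and comments and char literals are not handled. It buffers identifier characters, flushes the buffer on any other character, and copies string literals through untouched inside a single span.

```java
import java.util.Set;

public class KeywordHighlighter {
    // Illustrative subset only; a real highlighter would list every keyword.
    static final Set<String> KEYWORDS =
            Set.of("class", "public", "static", "void", "int", "return");

    // Emits the buffered word, wrapped in a span if it is a keyword.
    static void flush(StringBuilder word, StringBuilder out) {
        if (word.length() == 0) return;
        String w = word.toString();
        if (KEYWORDS.contains(w)) {
            out.append("<span class=\"").append(w).append("\">")
               .append(w).append("</span>");
        } else {
            out.append(w);
        }
        word.setLength(0);
    }

    // Minimal HTML escaping so source characters cannot break the markup.
    static void appendEscaped(StringBuilder out, char c) {
        switch (c) {
            case '<': out.append("&lt;"); break;
            case '>': out.append("&gt;"); break;
            case '&': out.append("&amp;"); break;
            default:  out.append(c);
        }
    }

    static String highlight(String src) {
        StringBuilder out = new StringBuilder();
        StringBuilder word = new StringBuilder();
        boolean inString = false;
        for (int i = 0; i < src.length(); i++) {
            char c = src.charAt(i);
            if (inString) {
                appendEscaped(out, c);
                if (c == '\\' && i + 1 < src.length()) {
                    // Copy the escaped character so \" does not end the literal.
                    appendEscaped(out, src.charAt(++i));
                } else if (c == '"') {
                    out.append("</span>");
                    inString = false;
                }
            } else if (Character.isJavaIdentifierPart(c)) {
                word.append(c);
            } else {
                flush(word, out);
                if (c == '"') {
                    out.append("<span class=\"string\">");
                    inString = true;
                }
                appendEscaped(out, c);
            }
        }
        flush(word, out);
        if (inString) out.append("</span>"); // unterminated literal: close the span anyway
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(highlight("class HelloWorld"));
        System.out.println(highlight("String s = \"my class\";"));
    }
}
```

Because the scanner knows whether it is inside a literal, class in "my class" is left alone, which the regex approach cannot guarantee. Adding states for comments and char literals would follow the same pattern.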
Upvotes: 0