garyamorris

Reputation: 2617

Java regex: find a word with no = on the end

I'm currently struggling with a text parser that wraps Java reserved words in their own HTML tags.

So I want

class HelloWorld

to appear as the string

<span class="class">class</span> HelloWorld

I managed to get that working. However, class is a reserved word, so I want to be able to distinguish, using regex, between

class

and

"class" or class=

Here is my current code.

word = word.replaceAll("\\b"+javaWord+"\\b",addTag(javaWord,javaWord));

I'm really struggling, so I'd appreciate any help.

Upvotes: 2

Views: 360

Answers (2)

Alan Moore

Reputation: 75222

Instead of "\\b"+javaWord+"\\b", try

"(?<![\\w\"])"+javaWord+"(?![\\w\"=])"

But @sgusc makes a good point: this technique can't be extended to deal with keywords in longer string literals, or in comments either.
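To make the lookaround version concrete, here is a minimal sketch of how it could be dropped into the replaceAll call from the question. The addTag helper below is a hypothetical stand-in, since the asker's real one isn't shown, and the Pattern.quote / Matcher.quoteReplacement calls are my own additions to keep special characters in the word or the tag from being interpreted by the regex engine.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeywordTagger {
    // Hypothetical stand-in for the asker's addTag helper (not shown in the question).
    static String addTag(String cssClass, String text) {
        return "<span class=\"" + cssClass + "\">" + text + "</span>";
    }

    static String tagKeyword(String line, String javaWord) {
        // Not preceded by a word character or a quote,
        // and not followed by a word character, a quote, or '='.
        String regex = "(?<![\\w\"])" + Pattern.quote(javaWord) + "(?![\\w\"=])";
        // quoteReplacement protects any '$' or '\' in the generated tag.
        return line.replaceAll(regex, Matcher.quoteReplacement(addTag(javaWord, javaWord)));
    }

    public static void main(String[] args) {
        // prints: <span class="class">class</span> HelloWorld
        System.out.println(tagKeyword("class HelloWorld", "class"));
        // prints the input unchanged: String s = "class";
        System.out.println(tagKeyword("String s = \"class\";", "class"));
        // prints the input unchanged: <div class="x">
        System.out.println(tagKeyword("<div class=\"x\">", "class"));
    }
}
```

Note that this only looks at the characters immediately next to the word, so class = (with a space before the =) or a keyword in the middle of a longer string literal such as "a class b" would still get tagged.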

Upvotes: 2

ratchet freak

Reputation: 48196

You're better off creating your own state machine that iterates over the input. Each time you see whitespace (or just any non-alphabetic character), you flush the buffer, and what you emit depends on the word you just passed.

That way, when you pass a " you ignore everything until the next (unescaped) " (same with < and >), or just treat it as one word with <span class="string"> around it ;)
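A rough sketch of that idea, in case it helps. It assumes a tiny hard-coded keyword set (KEYWORDS is illustrative, not the full Java list) and copies string literals through untouched rather than wrapping them. The class and method names are made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class KeywordHighlighter {
    // Illustrative subset only (assumption); extend with the full keyword list as needed.
    private static final Set<String> KEYWORDS =
            new HashSet<>(Arrays.asList("class", "public", "static", "void", "int", "return"));

    static String highlight(String source) {
        StringBuilder out = new StringBuilder();
        StringBuilder word = new StringBuilder(); // buffer for the word being read
        boolean inString = false;                 // true while inside a "..." literal
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (inString) {
                out.append(c);
                if (c == '\\' && i + 1 < source.length()) {
                    out.append(source.charAt(++i)); // copy the escaped char, e.g. \"
                } else if (c == '"') {
                    inString = false;               // unescaped quote closes the literal
                }
            } else if (Character.isJavaIdentifierPart(c)) {
                word.append(c);
            } else {
                flush(word, out);                   // word boundary reached
                out.append(c);
                if (c == '"') {
                    inString = true;
                }
            }
        }
        flush(word, out);                           // flush a trailing word, if any
        return out.toString();
    }

    // Emits the buffered word, wrapped in a span if it is a keyword, then clears the buffer.
    private static void flush(StringBuilder word, StringBuilder out) {
        String w = word.toString();
        if (KEYWORDS.contains(w)) {
            out.append("<span class=\"").append(w).append("\">").append(w).append("</span>");
        } else {
            out.append(w);
        }
        word.setLength(0);
    }

    public static void main(String[] args) {
        // prints: <span class="class">class</span> HelloWorld
        System.out.println(highlight("class HelloWorld"));
        // prints the input unchanged: String s = "a class b";
        System.out.println(highlight("String s = \"a class b\";"));
    }
}
```

This deliberately leaves out comments, character literals such as '"', HTML escaping, and the < ... > skipping mentioned above. Each of those could be handled by adding another state in the same way as inString.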

Upvotes: 0
