user1911868

Reputation: 53

Regex not matching empty string

Pattern srcAttrPattern = Pattern.compile("(?i)(?<=src=\")[^\"]*(?<!\")");
Matcher srcMatcher = srcAttrPattern.matcher("src=\"\"");
System.out.println(srcMatcher.find());

This prints false. How should I interpret the pattern above? What modification is needed so that it matches an empty src="" as well as a src with a non-empty value? The pattern is meant to match the src attribute of <img> tags in HTML content.

Upvotes: 3

Views: 161

Answers (1)

Wiktor Stribiżew

Reputation: 626738

Note that to parse HTML, you are better off using a dedicated parser (e.g. Jsoup).

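As for the current issue of matching a src="" string: the lookbehind (?<=src=") only succeeds right after the opening quote. At that position [^"]* matches zero characters, and the final negative lookbehind (?<!") then requires the character before the current location to be something other than a quote. That character is the opening quote itself, so the match fails. Since the negated character class [^"]* (zero or more characters other than ") can never end in a quote, the lookbehind adds nothing for non-empty values, and you simply do not need it.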

Remove (?<!") and the pattern "(?i)(?<=src=\")[^\"]*" will match the empty string in src="" as well as any non-empty value.
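
To check this, here is a minimal sketch that runs the pattern from the question and the shortened pattern against the same inputs. The class name SrcAttrDemo and the helper firstSrc are made up for illustration and are not part of the original code.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SrcAttrDemo {
    // Pattern from the question: the trailing (?<!") rejects an empty value.
    static final Pattern ORIGINAL = Pattern.compile("(?i)(?<=src=\")[^\"]*(?<!\")");
    // Same pattern with the trailing negative lookbehind removed.
    static final Pattern FIXED = Pattern.compile("(?i)(?<=src=\")[^\"]*");

    // Hypothetical helper: returns the first src value found,
    // or Optional.empty() when the pattern does not match at all.
    static Optional<String> firstSrc(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String empty = "src=\"\"";
        String filled = "<img SRC=\"a.png\">";

        System.out.println(firstSrc(ORIGINAL, empty).isPresent()); // false
        System.out.println(firstSrc(FIXED, empty).isPresent());    // true (matched value is "")
        System.out.println(firstSrc(FIXED, filled).get());         // a.png
    }
}
```

Both patterns return the same value for a non-empty src; they differ only for src="".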


Upvotes: 2
