vertex

Reputation: 31

Extract encoded strings based on percentage symbol with Regex and Java

I am trying to detect/match encoded chars starting with %.

My Regex is ([%][2-9|A-F][0-9A-F]{1,2})+

On regexr.com it works and matches what I need.

I used these strings for tests: caf%C3%A9+100%+noir%C20 and test%C3%A9+%C3%A0+100%

In my Java code it is returning only the first group.

String pattern = "([%][2-9|A-F][0-9A-F]{1,2})+";
Matcher matcher = Pattern.compile(pattern ).matcher(input);
if (matcher.find()) {
  for (int i = 0; i < matcher.groupCount(); i++) {
    System.out.println(matcher.group(i));
  }
}

For caf%C3%A9+100%+noir%C20 the output is only %C3%A9, not %C3%A9 and %C20.

For test%C3%A9+%C3%A0+100% the output is only %C3%A9, not %C3%A9 and %C3%A0.

Upvotes: 1

Views: 223

Answers (2)

bkis

Reputation: 2587

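The regex you are using is more complicated than it needs to be: [%] can be written as plain %, and the | inside [2-9|A-F] is a literal pipe character rather than an alternation, so it should be dropped. The way you print the matches is the actual problem: find() locates only the next match, and groupCount() returns the number of capturing groups in the pattern (1 here), not the number of matches, so your loop prints group(0) for the first match and stops. Call find() in a loop instead: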

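import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "caf%C3%A9+100%+noir%C20";
String pattern = "(?:%[2-9A-F][0-9A-F]{1,2})+";
Matcher matcher = Pattern.compile(pattern).matcher(input);

// Each call to find() advances to the next match; group() returns its text.
while (matcher.find()) {
    System.out.println(matcher.group());
}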

This prints:

%C3%A9
%C20
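
To check both test strings from the question in one run, the same loop can be wrapped in a small helper that collects the matches into a list. This is only a sketch: the class name PercentRuns and the method findAll are made up for illustration, and the pattern is the one shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PercentRuns {
    // Same pattern as above: one or more consecutive %-escapes.
    private static final Pattern ESCAPES =
            Pattern.compile("(?:%[2-9A-F][0-9A-F]{1,2})+");

    // Hypothetical helper: returns every run of escapes found in the input.
    static List<String> findAll(String input) {
        List<String> runs = new ArrayList<>();
        Matcher matcher = ESCAPES.matcher(input);
        while (matcher.find()) {        // each call advances to the next match
            runs.add(matcher.group());  // group() is the whole current match
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(findAll("caf%C3%A9+100%+noir%C20")); // [%C3%A9, %C20]
        System.out.println(findAll("test%C3%A9+%C3%A0+100%"));  // [%C3%A9, %C3%A0]
    }
}
```

A lone % followed by something other than the expected hex characters (as in 100%+ or a trailing 100%) is skipped, which matches what the question asks for.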

Upvotes: 2

vertex

Reputation: 31

Based on @41686d6564's comment, the solution is to use a while loop and group(0):

String pattern = "([%][2-9A-F][0-9A-F]{1,2})+"; 
Matcher matcher = Pattern.compile(pattern).matcher(input);
while (matcher.find()) {
  System.out.println(matcher.group(0));
}
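
For anyone wondering why the original for loop printed a single value, here is a small sketch of what the matcher reports after the first find(). The class name GroupCountDemo and the method describeFirstMatch are hypothetical, used only for this illustration. It shows that groupCount() is the number of capturing groups in the pattern (not the number of matches), and that a repeated capturing group keeps only its last iteration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupCountDemo {
    // Hypothetical helper: summarises what the matcher holds after one find().
    static String describeFirstMatch(String input) {
        Matcher m = Pattern.compile("([%][2-9A-F][0-9A-F]{1,2})+").matcher(input);
        if (!m.find()) {
            return "no match";
        }
        // group(0) is the whole match; group(1) is the last repetition of (...)
        return "groupCount=" + m.groupCount()
                + " group(0)=" + m.group(0)
                + " group(1)=" + m.group(1);
    }

    public static void main(String[] args) {
        // prints: groupCount=1 group(0)=%C3%A9 group(1)=%A9
        System.out.println(describeFirstMatch("caf%C3%A9+100%+noir%C20"));
    }
}
```

Since groupCount() is 1, a loop running while i < groupCount() only ever reaches i = 0, which is why only %C3%A9 was printed.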

Upvotes: 2
