Reputation: 21
I am trying to extract a specific value from a string, but I could not come up with a regex that matches it exactly. The string can change dynamically and takes one of two forms:
https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit&token=EC-1J942953KU425764F
https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit&token=EC-1J942953KU425764F&paymentid=PAY-12345K4776H687987R
I need a pattern that returns the token value.
I have tried this regex: (?<=token\=).*
It returns the token for the first string but not for the second.
The output should look like this:
EC-1J942953KU425764F
Upvotes: 0
Views: 1492
Reputation: 16520
How about using the regex pattern
[&?]token=([^&\r\n]*)
Then just extract capture group 1
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Match "token=" after '?' or '&', then capture everything up to the next '&' or line break
String regex = "[&?]token=([^&\r\n]*)";
String input =
"https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit&token=EC-1J942953KU425764F\n" +
"https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit&token=EC-1J942953KU425764F&paymentid=PAY-12345K4776H6879";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
// Each call to find() advances to the next URL's token
while (matcher.find())
{
System.out.printf("Token is %s%n", matcher.group(1));
}
Upvotes: 0
Reputation: 4266
If the format is always one of these two, and you don't specifically want to use a regex, then something like this may suffice:
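// Start just after "token=" so that only the value is printed, not the parameter name
int start = str.indexOf("token=") + "token=".length();
int val = str.indexOf("paymentid");
// val - 1 drops the '&' that precedes "paymentid"
System.out.println(str.substring(start, (val != -1) ? val - 1 : str.length()));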
Or, of course, you can replace val with str.indexOf("paymentid") and do it in one line.
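A slightly more general variant of the same idea is sketched below. Instead of looking for "paymentid", it stops at the next '&' after the token, so it does not depend on which parameter follows. The class name TokenByIndex and the method extractToken are made up for this illustration, and the sketch assumes that "token=" does not also appear as the tail of another parameter name (for example "mytoken=").

```java
class TokenByIndex {
    // Hypothetical helper: returns the value of the "token" parameter,
    // or null if the URL does not contain "token=".
    static String extractToken(String url) {
        String key = "token=";
        int start = url.indexOf(key);
        if (start == -1) {
            return null;
        }
        start += key.length();
        // The value ends at the next '&', or at the end of the string.
        int end = url.indexOf('&', start);
        return url.substring(start, end == -1 ? url.length() : end);
    }

    public static void main(String[] args) {
        String url = "https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit"
                + "&token=EC-1J942953KU425764F&paymentid=PAY-12345K4776H687987R";
        System.out.println(extractToken(url)); // prints EC-1J942953KU425764F
    }
}
```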
Upvotes: 0
Reputation: 91
Instead, you can use UriComponentsBuilder from spring-web:
String url = "https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit&token=EC-1J942953KU425764F&paymentid=PAY-12345K4776H687987R";
MultiValueMap<String, String> queryParams =
UriComponentsBuilder.fromUriString(url).build().getQueryParams();
String token = queryParams.getFirst("token");
Or you can use URIBuilder from Apache HttpClient:
List<NameValuePair> queryParams = new URIBuilder(url)
.getQueryParams();
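If neither library is on the classpath, the same parse-then-look-up idea can be approximated with the JDK alone. This is only a rough sketch, not a replacement for either library: the class QueryParamSketch and its parse method are invented for this example, it keeps only the first value of a repeated parameter, and it assumes Java 10 or later for URLDecoder.decode(String, Charset).

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryParamSketch {
    // Hypothetical helper: splits the query string on '&' and '=' and
    // returns the decoded parameters in their original order.
    static Map<String, String> parse(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        String query = URI.create(url).getRawQuery();
        if (query == null) {
            return params;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            String[] kv = pair.split("=", 2);
            String name = URLDecoder.decode(kv[0], StandardCharsets.UTF_8);
            String value = kv.length > 1
                    ? URLDecoder.decode(kv[1], StandardCharsets.UTF_8)
                    : "";
            // Keep the first occurrence if a name is repeated.
            params.putIfAbsent(name, value);
        }
        return params;
    }

    public static void main(String[] args) {
        String url = "https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit"
                + "&token=EC-1J942953KU425764F&paymentid=PAY-12345K4776H687987R";
        System.out.println(parse(url).get("token")); // prints EC-1J942953KU425764F
    }
}
```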
Upvotes: 0
Reputation: 27770
You don't need the lookbehind if you define a capture group instead, which can be a little easier to read IMO.
Also note that the semicolon character used to be an allowed URL param separator according to the spec, so you may want to include that when you match param values in case you need to support an older or inconsistent platform:
token=([^&;\n]+)
Capture group 1 (the second element of the match result in most regex APIs) then holds the token itself.
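In Java, which the question appears to target, this capture-group pattern could be used roughly as follows. The class name TokenCaptureGroup and the semicolon-separated URL are invented here purely to show the effect of including ';' in the character class.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenCaptureGroup {
    // Same pattern as above; the backslash is doubled for the Java string literal.
    private static final Pattern TOKEN = Pattern.compile("token=([^&;\\n]+)");

    // Hypothetical helper: returns capture group 1, or null when nothing matches.
    static String extractToken(String url) {
        Matcher m = TOKEN.matcher(url);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up URL that uses ';' as the parameter separator.
        String url = "https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out;token=EC-1J942953KU425764F;paymentid=PAY-1";
        System.out.println(extractToken(url)); // prints EC-1J942953KU425764F
    }
}
```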
Upvotes: 0
Reputation: 163632
The .* matches any character zero or more times and is greedy, so in your regex it matches until the end of the string.
You could keep your positive lookbehind and follow it with a negated character class, [^&\n]+, which matches one or more characters that are not an ampersand or a newline. You do not have to escape the equals sign.
(?<=token=)[^&\n]+
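A minimal Java sketch of the lookbehind version, run against both URL forms from the question, might look like this. The class name TokenLookbehind is made up for the example. Because the lookbehind is not part of the match, group() (the whole match) is already the token and no capture group is needed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenLookbehind {
    // The backslash is doubled for the Java string literal.
    private static final Pattern TOKEN = Pattern.compile("(?<=token=)[^&\\n]+");

    // Hypothetical helper: returns the whole match, or null when nothing matches.
    static String extractToken(String url) {
        Matcher m = TOKEN.matcher(url);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String first = "https://www.test.com/vgi-bin/tmpscr?cmd=_temp-out&useraction=commit&token=EC-1J942953KU425764F";
        String second = first + "&paymentid=PAY-12345K4776H687987R";
        System.out.println(extractToken(first));  // prints EC-1J942953KU425764F
        System.out.println(extractToken(second)); // prints EC-1J942953KU425764F
    }
}
```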
Upvotes: 1