Java regex: match until a specific character

I have this Java code

String cookies = TextUtils.join(";",  LoginActivity.msCookieManager.getCookieStore().getCookies());
Log.d("TheCookies", cookies);
Pattern csrf_pattern = Pattern.compile("csrf_cookie=(.+)(?=;)");
Matcher csrf_matcher = csrf_pattern.matcher(cookies);
while (csrf_matcher.find()) {
    json.put("csrf_key", csrf_matcher.group(1));
    Log.d("CSRF KEY", csrf_matcher.group(1));
}

The String contains something like this:

SessionID=sessiontest;csrf_cookie=e18d027da2fb95e888ebede711f1bc39;ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e

I'm trying to get the csrf_cookie value using this regular expression:

csrf_cookie=(.+)(?=;)

I expect a result like this in the code:

csrf_matcher.group(1);

e18d027da2fb95e888ebede711f1bc39

Instead I get:

3492f8670f4b09a6b3c3cbdfcc59e512;ci_session=8d823b309a361587fac5d67ad4706359b40d7bd0

What is a possible workaround for this problem?

Upvotes: 1

Views: 2503

Answers (2)

Julio

Reputation: 5308

You are getting more data than expected because you are using a greedy '+' (it matches as much as it can).

For example, on aaa the pattern a+ could match a, aa, or aaa. If the pattern is greedy, the last of these is 'preferred'.

So you are matching

csrf_cookie=e18d027da2fb95e888ebede711f1bc39;ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e;

as long as it ends with a ';'. The first ';' is consumed by .+ and the last ';' is found by the positive lookahead.
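
To see the greedy behaviour in isolation, here is a minimal sketch. The cookie string is made up for illustration: the values are shortened and the trailing other=1 cookie is an assumption, added so that there is a second ';' after csrf_cookie. The class and the firstGroup helper are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Returns group 1 of the first match of regex in input, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Assumed sample input: "other=1" is a made-up extra cookie.
        String cookies = "SessionID=sessiontest;csrf_cookie=abc123;ci_session=def456;other=1";

        // Greedy .+ runs to the end of the string, then backtracks only
        // as far as the LAST ';', where the lookahead succeeds.
        System.out.println(firstGroup("csrf_cookie=(.+)(?=;)", cookies));
        // prints: abc123;ci_session=def456
    }
}
```

The capture stops before the last ';' in the string, not the first one after csrf_cookie=.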

To make a pattern ungreedy/lazy, use +? instead of + (so a+? would match a three times on the string aaa).

So try with:

csrf_cookie=(.+?);

or just match anything that is not a ';'

csrf_cookie=([^;]*);

that way you don't need to make it lazy.
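
As a rough check, the two patterns can be run against the string from the question. This is only a sketch; the class and the firstGroup helper are hypothetical names. One thing to be aware of: both patterns require a ';' after the value, so they would not match if csrf_cookie were the last cookie in the string. Dropping the trailing ';' from the negated-class version (an adjustment of mine, not part of the patterns above) should cover that case.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CsrfExtract {
    // Returns group 1 of the first match of regex in input, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String cookies = "SessionID=sessiontest;"
                + "csrf_cookie=e18d027da2fb95e888ebede711f1bc39;"
                + "ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e";

        // Lazy quantifier: stops at the first ';' after the value.
        System.out.println(firstGroup("csrf_cookie=(.+?);", cookies));
        // Negated character class: cannot run past a ';' at all.
        System.out.println(firstGroup("csrf_cookie=([^;]*);", cookies));
        // Both print: e18d027da2fb95e888ebede711f1bc39

        // Assumed edge case: csrf_cookie is the last cookie, so no ';' follows it.
        String last = "SessionID=sessiontest;csrf_cookie=abc123";
        System.out.println(firstGroup("csrf_cookie=([^;]*);", last)); // prints: null
        System.out.println(firstGroup("csrf_cookie=([^;]*)", last));  // prints: abc123
    }
}
```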

Upvotes: 2

Tim Biegeleisen

Reputation: 520958

Here is a one-liner using String#replaceAll:

String input = "SessionID=sessiontest;csrf_cookie=e18d027da2fb95e888ebede711f1bc39;ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e";
String cookie = input.replaceAll(".*csrf_cookie=([^;]*).*", "$1");
System.out.println(cookie);

e18d027da2fb95e888ebede711f1bc39


Note: We could have used a formal regex pattern matcher, and in fact you may want to do so if you need to run this search often in your code.
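
For reference, a matcher-based version might look like the sketch below. The class name, the constant, and the extract method are hypothetical. The pattern is compiled once and reused. Another difference is the no-match case: replaceAll returns the input unchanged when the pattern does not match, whereas here the method returns null.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CsrfCookieParser {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern CSRF = Pattern.compile("csrf_cookie=([^;]*)");

    // Returns the csrf_cookie value, or null if the cookie is not present.
    static String extract(String cookies) {
        Matcher m = CSRF.matcher(cookies);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "SessionID=sessiontest;"
                + "csrf_cookie=e18d027da2fb95e888ebede711f1bc39;"
                + "ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e";
        System.out.println(extract(input));
        // prints: e18d027da2fb95e888ebede711f1bc39
        System.out.println(extract("SessionID=sessiontest"));
        // prints: null
    }
}
```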

Upvotes: 3
