vishesh

Reputation: 2045

Java regex to match these strings

I have the following two URLs:

https://docs.google.com/a/abc.com/spreadsheet/ccc?key=0Aj9Oa8x5fqsL678FNhOUF0ZEN5b25iVVZNdjdUQm9mM1E&usp=drive_web#gid=0

https://docs.google.com/a/abc.com/file/d/0Aj9Oa8x5fqsL678FNhOUF0ZEN5b25iVVZNdjdUQm9mM1E/edit

I am using the following regex:

Pattern.compile(".*key=|/d/(.[^&/])")

As a result, I want matcher.group() to return each URL up to the fileId part (0Aj9Oa8x5fqsL678FNhOUF0ZEN5b25iVVZNdjdUQm9mM1E), and matcher.group(1) to return the fileId itself.

But I am not getting these results.

Upvotes: 0

Views: 58

Answers (3)

fge

Reputation: 121710

If you don't need to use a regex, then use URI:

private static final Pattern PARAM_SEPARATOR = Pattern.compile("&");
private static final Pattern PATH_MATCHER = Pattern.compile("/file/d/([^/]+)");

// In query parameter...
public static String getKeyQueryParamFromURI(final String input)
{
    final URI uri = URI.create(input);
    final String params = uri.getQuery();
    if (params == null)
        return null;
    // Split the extracted query string, not the whole input URL
    for (final String param: PARAM_SEPARATOR.split(params))
        if (param.startsWith("key="))
            return param.substring(4);
    return null;
}

// In path...
public static String getPathMatcherFromURI(final String input)
{
    final URI uri = URI.create(input);
    final String path = uri.getPath();
    if (path == null)
        return null;
    // Match against the extracted path, not the whole input URL
    final Matcher m = PATH_MATCHER.matcher(path);
    return m.find() ? m.group(1) : null;
}

Note that, unlike with a regex, you will receive the result unescaped, because URI.getQuery() and URI.getPath() decode percent-escapes. If, for instance, the URI reads key=a%20b, this will return "a b"!
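To make the unescaping concrete, here is a minimal sketch comparing URI.getQuery() with URI.getRawQuery(). The class name QueryDecodingDemo and the example.com URL are made up for illustration and are not part of the question.

```java
import java.net.URI;

final class QueryDecodingDemo {
    /** Query string with percent-escapes decoded. */
    static String decodedQuery(String url) {
        return URI.create(url).getQuery();
    }

    /** Query string exactly as written in the URL. */
    static String rawQuery(String url) {
        return URI.create(url).getRawQuery();
    }

    public static void main(String[] args) {
        // Hypothetical URL, used only to show the difference
        String url = "https://example.com/doc?key=a%20b";
        System.out.println(decodedQuery(url));
        System.out.println(rawQuery(url));
    }
}
```

If you need the value exactly as it appears in the URL, getRawQuery() (and getRawPath() for the path) is probably the better starting point.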

If you insist on using a regex (why?), then do this instead for the query parameter:

private static final Pattern PATTERN = Pattern.compile("(?<=[?&])key=([^&]+)");

public static String getKeyQueryParamFromURI(final String input)
{
    final Matcher m = PATTERN.matcher(input);
    return m.find() ? m.group(1) : null;
}

But you'll have to unescape the parameter value yourself...
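One possible way to do that unescaping is java.net.URLDecoder, sketched below. The class and method names are hypothetical. Keep in mind that URLDecoder follows form-encoding rules, so it also turns "+" into a space, which may or may not be what you want.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class KeyParamDecoder {
    // Same pattern as above: "key=" preceded by '?' or '&'
    private static final Pattern PATTERN = Pattern.compile("(?<=[?&])key=([^&]+)");

    /** Returns the decoded value of the "key" parameter, or null if absent. */
    static String decodedKey(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.find())
            return null;
        try {
            // Decodes %XX escapes; also maps '+' to a space (form encoding)
            return URLDecoder.decode(m.group(1), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported on every JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical URL for illustration
        System.out.println(decodedKey("https://example.com/doc?key=a%20b&x=1"));
    }
}
```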

Upvotes: 1

Orel Eraki

Reputation: 12196

For two different URL shapes it is preferable to split the regex into two separate patterns rather than use | (OR). With separate patterns, the first capture group of each one holds the result you want.

Pattern1:

.*key=([^&#]+).*

Pattern2:

.*/file/d/([^/]+).*
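A minimal sketch of the two-pattern approach in Java follows. It assumes the key value ends at '&' or '#' and the path segment ends at '/', '?' or '#'. These character classes, along with the class and method names, are my own choices for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TwoPatternExtractor {
    // Assumption: the key value runs until the next '&' or '#'
    private static final Pattern KEY_PARAM = Pattern.compile("[?&]key=([^&#]+)");
    // Assumption: the file ID is the path segment right after "/file/d/"
    private static final Pattern FILE_PATH = Pattern.compile("/file/d/([^/?#]+)");

    /** Tries each pattern in turn; group 1 of the first hit is the file ID. */
    static String fileId(String url) {
        for (Pattern p : new Pattern[] {KEY_PARAM, FILE_PATH}) {
            Matcher m = p.matcher(url);
            if (m.find())
                return m.group(1);
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(fileId(
            "https://docs.google.com/a/abc.com/file/d/0Aj9Oa8x5fqsL678FNhOUF0ZEN5b25iVVZNdjdUQm9mM1E/edit"));
    }
}
```

Using find() instead of matches() avoids the need for leading and trailing .* in each pattern.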

Upvotes: 0

collapsar

Reputation: 17238

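You fell victim to the precedence rules of regular expressions: alternation (|) binds more loosely than everything else, so your pattern means "either .*key= or /d/(.[^&/])", and the capture group only exists in the second branch. You also forgot the repetition quantifier on your character class, so .[^&/] matches exactly two characters. Try: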

Pattern.compile("(key=|/d/)([^&/]+)")

Your result will be in group 2 ($2), i.e. matcher.group(2) after a successful matcher.find().
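As a rough sketch of how this pattern could be used against the two URLs from the question (the class and method names are made up for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FileIdExtractor {
    // Alternation is grouped explicitly; the character class is repeated with '+'
    private static final Pattern FILE_ID = Pattern.compile("(key=|/d/)([^&/]+)");

    /** Returns the file ID (group 2), or null if neither marker is found. */
    static String fileId(String url) {
        Matcher m = FILE_ID.matcher(url);
        return m.find() ? m.group(2) : null;
    }

    /** Returns the URL from its start through the end of the file ID, or null. */
    static String prefixThroughId(String url) {
        Matcher m = FILE_ID.matcher(url);
        return m.find() ? url.substring(0, m.end(2)) : null;
    }

    public static void main(String[] args) {
        String url = "https://docs.google.com/a/abc.com/spreadsheet/ccc"
            + "?key=0Aj9Oa8x5fqsL678FNhOUF0ZEN5b25iVVZNdjdUQm9mM1E&usp=drive_web#gid=0";
        System.out.println(fileId(url));
        System.out.println(prefixThroughId(url));
    }
}
```

Because find() starts matching at the first key= or /d/, matcher.group() alone would not include the beginning of the URL. The prefixThroughId helper is one way to get the "URL up to the file ID" the question asks for.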

Upvotes: 1
