user791437

Reputation: 99

Why does this regex work in Java but not Groovy?

I'm trying to extract the list of JDK download URLs from the Oracle website. I originally wrote a working version in Go and tried to port it to Groovy so it would run in Jenkins, but the regex does not match in Groovy. To troubleshoot, I wrote a version in Java.

The following Java code succeeds:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) throws Exception {
        URL oracle = new URL("http://www.oracle.com/technetwork/java/javase/downloads/java-archive-javase8-2177648.html");
        BufferedReader in = new BufferedReader(
                 new InputStreamReader(oracle.openStream()));

        StringBuffer contents = new StringBuffer();
        String inputLine;
        while ((inputLine = in.readLine()) != null)
            contents.append(inputLine+"\n");
        in.close();

        Pattern pattern = Pattern.compile("(http://download\\.oracle\\.com/otn/java/jdk/\\du\\d{2}-b\\d{1,2}/(jdk-\\du\\d{2}-(linux|windows)-x64\\.(exe|tar\\.gz)))");
        Matcher matcher = pattern.matcher(contents);
        assert matcher.matches();
    }
}

Yet the (hopefully equivalent) Groovy fails:

URL url = new URL('http://www.oracle.com/technetwork/java/javase/downloads/java-archive-javase8-2177648.html')
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(url.openStream()));
StringBuffer contents = new StringBuffer();
String inputLine
while ((inputLine = bufferedReader.readLine()) != null)
    contents.append(inputLine + "\n");
bufferedReader.close();

def pattern = ~/(http\/\/download\.oracle\.com\/otn\/java\/jdk\/\du\d{2}-b\d{1,2}\/(jdk-\du\d{2}-(linux\|windows)-x64\.(exe\|tar\.gz)))/
def matcher = pattern.matcher(contents)
assert matcher.matches()

I've tried to minimize the differences between the Groovy and Java versions for troubleshooting purposes (yes, I know there are more idiomatic ways of slurping URLs in Groovy).

Anyone know why the Groovy version fails?

Upvotes: 4

Views: 328

Answers (1)

cwallenpoole

Reputation: 81988

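Your biggest issue is that you are missing a : in the second regular expression (you have http\/\/ instead of http:\/\/). The Groovy pattern also escapes the pipes (\| instead of |), which makes them literal | characters rather than alternation, so fixing the colon alone may not be enough. With both changes, the two patterns should be equivalent.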
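
A quick way to check which differences matter is to run both patterns against the same input in plain Java. The sketch below is only an illustration: the class name RegexCheck, the helper methods, and the SAMPLE text (including its URL) are made up for the test and are not taken from the Oracle page. GROOVY_PATTERN is what I would expect the regex engine to receive from the slashy string in the question. Two further things worth keeping in mind: Java's assert is skipped unless the JVM is started with -ea, whereas Groovy's assert is always evaluated, so the Java program finishing quietly may not prove that anything matched; and matches() only returns true when the whole input matches the pattern, while find() looks for the pattern anywhere in the input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexCheck {
    // The Java pattern from the question, unchanged.
    static final String JAVA_PATTERN =
        "(http://download\\.oracle\\.com/otn/java/jdk/\\du\\d{2}-b\\d{1,2}/"
        + "(jdk-\\du\\d{2}-(linux|windows)-x64\\.(exe|tar\\.gz)))";

    // Assumed equivalent of the Groovy slashy string from the question:
    // "\/" becomes "/", the colon after "http" is missing, and "\|" stays
    // an escaped pipe, which the regex engine treats as a literal '|'.
    static final String GROOVY_PATTERN =
        "(http//download\\.oracle\\.com/otn/java/jdk/\\du\\d{2}-b\\d{1,2}/"
        + "(jdk-\\du\\d{2}-(linux\\|windows)-x64\\.(exe\\|tar\\.gz)))";

    // Hypothetical page fragment; the URL is invented to fit the pattern.
    static final String SAMPLE =
        "<a href=\"http://download.oracle.com/otn/java/jdk/8u45-b14/"
        + "jdk-8u45-linux-x64.tar.gz\">download</a>\n";

    // Collects every URL the pattern finds anywhere in the text.
    static List<String> findAll(String regex, CharSequence text) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            hits.add(m.group(1));
        }
        return hits;
    }

    // True only if the entire text, start to end, matches the pattern.
    static boolean matchesWhole(String regex, CharSequence text) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    public static void main(String[] args) {
        // false: the sample contains more than just the URL
        System.out.println(matchesWhole(JAVA_PATTERN, SAMPLE));
        // one hit: the URL embedded in the sample
        System.out.println(findAll(JAVA_PATTERN, SAMPLE));
        // no hits: missing colon and literal pipes
        System.out.println(findAll(GROOVY_PATTERN, SAMPLE));
    }
}
```

If this reflects what is happening, the Groovy version would need the colon restored, the pipes unescaped, and a find() loop (or Groovy's =~ operator) instead of matches() to pull the URLs out of the page.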

Upvotes: 1
