Reputation: 99
I'm trying to extract the list of JDK download URLs from the Oracle website. I originally wrote a working version in Go and tried to port it to Groovy so it would run in Jenkins, but the regex does not match in Groovy. To troubleshoot, I wrote a version in Java.
The following Java code succeeds:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) throws Exception {
        URL oracle = new URL("http://www.oracle.com/technetwork/java/javase/downloads/java-archive-javase8-2177648.html");
        BufferedReader in = new BufferedReader(
                new InputStreamReader(oracle.openStream()));
        StringBuffer contents = new StringBuffer();
        String inputLine;
        while ((inputLine = in.readLine()) != null)
            contents.append(inputLine+"\n");
        in.close();
        Pattern pattern = Pattern.compile("(http://download\\.oracle\\.com/otn/java/jdk/\\du\\d{2}-b\\d{1,2}/(jdk-\\du\\d{2}-(linux|windows)-x64\\.(exe|tar\\.gz)))");
        Matcher matcher = pattern.matcher(contents);
        assert matcher.matches();
    }
}
Yet the (hopefully equivalent) Groovy fails:
URL url = new URL('http://www.oracle.com/technetwork/java/javase/downloads/java-archive-javase8-2177648.html')
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(url.openStream()));
StringBuffer contents = new StringBuffer();
String inputLine
while ((inputLine = bufferedReader.readLine()) != null)
    contents.append(inputLine + "\n");
bufferedReader.close();
def pattern = ~/(http\/\/download\.oracle\.com\/otn\/java\/jdk\/\du\d{2}-b\d{1,2}\/(jdk-\du\d{2}-(linux\|windows)-x64\.(exe\|tar\.gz)))/
def matcher = pattern.matcher(contents)
assert matcher.matches()
I've tried to minimize the differences between the Groovy and Java versions for troubleshooting purposes (yes, I know there are more idiomatic ways of slurping URLs in Groovy).
Anyone know why the Groovy version fails?
Upvotes: 4
Views: 328
Reputation: 81988
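Your biggest issue is that you are missing a colon (:) in the second regular expression (you have http\/ instead of http:\/). Also note that \| in the Groovy pattern matches a literal pipe character, whereas the Java pattern uses an unescaped | for alternation. With those fixed, the two patterns should be equivalent.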
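Two other things may be worth checking when comparing the two versions. Java skips assert statements unless the JVM is started with -ea, while Groovy always evaluates its assert, so the Java run can appear to succeed without testing anything. In addition, matches() only returns true when the entire input matches the pattern; find() searches for occurrences inside a larger text. The sketch below illustrates both points in Java. It is only an illustration under stated assumptions: the class name JdkUrlRegexDemo, the helper findUrls, and the two-line sample page are made up for the example and are not taken from the Oracle site.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JdkUrlRegexDemo {
    // Same regex as the Java version in the question.
    static final Pattern FIXED = Pattern.compile(
        "(http://download\\.oracle\\.com/otn/java/jdk/\\du\\d{2}-b\\d{1,2}/"
        + "(jdk-\\du\\d{2}-(linux|windows)-x64\\.(exe|tar\\.gz)))");

    // Java spelling of the Groovy regex as posted:
    // no colon after "http", and "\|" (a literal pipe) instead of "|".
    static final Pattern AS_POSTED = Pattern.compile(
        "(http//download\\.oracle\\.com/otn/java/jdk/\\du\\d{2}-b\\d{1,2}/"
        + "(jdk-\\du\\d{2}-(linux\\|windows)-x64\\.(exe\\|tar\\.gz)))");

    // Hypothetical helper: collect every occurrence of group 1 using find().
    static List<String> findUrls(Pattern pattern, CharSequence text) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            urls.add(matcher.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Made-up sample standing in for the downloaded page.
        String page =
            "<a href=\"http://download.oracle.com/otn/java/jdk/8u45-b14/jdk-8u45-linux-x64.tar.gz\">Linux</a>\n"
            + "<a href=\"http://download.oracle.com/otn/java/jdk/8u45-b14/jdk-8u45-windows-x64.exe\">Windows</a>\n";

        // matches() requires the whole page to be one URL, so this is false.
        System.out.println(FIXED.matcher(page).matches()); // false

        // find() locates each URL embedded in the page.
        System.out.println(findUrls(FIXED, page));

        // The pattern as posted finds nothing in the same text.
        System.out.println(findUrls(AS_POSTED, page)); // []
    }
}
```

In Groovy, the corresponding change would be to add the colon, drop the backslashes in front of the pipes, and loop over matcher.find() (or use the =~ operator) instead of asserting matcher.matches().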
Upvotes: 1