user3775041

Reputation: 332

Java PathMatcher not working properly on Windows

I am trying to write a JUnit test for my SimpleFileVisitor, but the PathMatcher it uses does not work properly on Windows. The problem seems to be that a PathMatcher with a regex pattern behaves differently on Linux and Windows:

import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class TestApp{

     public static void main(String []args){
        final PathMatcher glob = FileSystems.getDefault().getPathMatcher("glob:{/,/test}");
        final PathMatcher regex = FileSystems.getDefault().getPathMatcher("regex:/|/test");

        System.err.println(glob.matches(Paths.get("/")));       // --> Linux=true  Windows=true
        System.err.println(glob.matches(Paths.get("/test")));   // --> Linux=true  Windows=true
        System.err.println(glob.matches(Paths.get("/test2")));  // --> Linux=false Windows=false

        System.err.println(regex.matches(Paths.get("/")));      // --> Linux=true  Windows=false
        System.err.println(regex.matches(Paths.get("/test")));  // --> Linux=true  Windows=false
        System.err.println(regex.matches(Paths.get("/test2"))); // --> Linux=false Windows=false
     }  
}

However, my real regex contains a longer list of files that is not easy to migrate to glob syntax. I would either need nested groups, which glob does not allow, or an even longer list with every pattern written out without groups.

What is the best way to do this in a cross-platform manner?

Upvotes: 4

Views: 1195

Answers (3)

HungNM2

Reputation: 3425

This code works on both Windows and Linux:

    String pattern = "regex:\\./src/main/java/.*\\.java|\\./src/main/java/.*\\.txt";
    String newPattern;
    
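    if (File.separator.equals("\\")) {
        // Windows: replace each '/' with a regex-escaped backslash (two backslash characters)
        newPattern = pattern.replace("/", "\\\\");
    } else {
        // Linux and other '/'-separated systems: use the pattern unchanged
        newPattern = pattern;
    }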
    
    PathMatcher pathMatcher = FileSystems.getDefault().getPathMatcher(newPattern);

Upvotes: 1

DuncG

Reputation: 15186

If you want a version that does not put the Windows file separator character into the regex when the code runs on Linux, you can also use:

String sep = Pattern.quote(File.separator);
PathMatcher regex = FileSystems.getDefault().getPathMatcher("regex:"+sep+"|"+sep+"test");

This prints the same output on Linux and Windows.
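For a longer pattern, the same idea can be wrapped in a small helper. This is only a sketch: `portableRegex` is a hypothetical name, and it assumes the regex is written with `/` used purely as the path separator (not inside a character class or as any other part of the expression).

```java
import java.io.File;
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.regex.Pattern;

public class SeparatorRegexDemo {

    // Hypothetical helper: takes a regex written with '/' as the separator
    // and swaps in the quoted platform separator before building the matcher.
    // Assumption: '/' appears in the regex only as a path separator.
    static PathMatcher portableRegex(String slashRegex) {
        String sep = Pattern.quote(File.separator);
        // String.replace does a literal (non-regex) replacement,
        // so the backslashes in 'sep' are inserted as-is.
        String platformRegex = slashRegex.replace("/", sep);
        return FileSystems.getDefault().getPathMatcher("regex:" + platformRegex);
    }

    public static void main(String[] args) {
        PathMatcher regex = portableRegex("/|/test");
        System.out.println(regex.matches(Paths.get("/")));
        System.out.println(regex.matches(Paths.get("/test")));
        System.out.println(regex.matches(Paths.get("/test2")));
    }
}
```

The pattern itself stays readable because it is still written with `/`; only the string handed to `getPathMatcher` changes per platform.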

Upvotes: 1

Fullslack

Reputation: 300

First, I want to say that this is undocumented behavior in the glob handling of the PathMatcher. It appears to convert backslashes (the usual separator on Windows filesystems) to forward slashes (or vice versa), so a glob pattern always works on both Linux and Windows. The regex syntax does not appear to get the same treatment.

The following line shows how the path string differs between the two platforms:

System.out.println(Paths.get("/test")); // Will output '\test' on Windows, '/test' on Linux
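To see why this matters for the regex, here is a small sketch that compares the `regex:` matcher with the same expression applied to the path's string form. It assumes the matcher works on `path.toString()`, which fits the behavior observed in the question but is my assumption rather than something I can point to in the documentation. The class and method names are made up for the demo.

```java
import java.io.File;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.regex.Pattern;

public class PathStringDemo {

    // Hypothetical helper: result of the "regex:" PathMatcher for a path.
    static boolean viaPathMatcher(String regex, Path path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("regex:" + regex);
        return matcher.matches(path);
    }

    // Hypothetical helper: the same regex applied to the path's string form.
    static boolean viaString(String regex, Path path) {
        return Pattern.matches(regex, path.toString());
    }

    public static void main(String[] args) {
        Path path = Paths.get("/test");
        System.out.println("separator:   " + File.separator);
        System.out.println("path string: " + path);
        // Under the stated assumption, both lines print the same value.
        System.out.println(viaPathMatcher("/test", path));
        System.out.println(viaString("/test", path));
    }
}
```

If both results agree on your machine, the regex has to be written against whatever `Paths.get(...)` prints there, which is why the separator in the pattern matters.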

To solve the original question we need to get some RegexFu going.

FileSystems.getDefault().getPathMatcher("regex:/|/test");

Needs to become

FileSystems.getDefault().getPathMatcher("regex:(/|\\\\)|((/|\\\\)test)");
  • The first group matches either / or \ (the regex needs \\ to escape the \, and because it sits inside a Java string literal it has to be typed as \\\\).
  • The second group has two parts: the first part again matches either / or \, and the second part is the text from the question.

Thanks to @user3775041 for a slightly cleaner regex:

FileSystems.getDefault().getPathMatcher("regex:[/\\\\]|[/\\\\]test");

This has been tested on Windows 10 and Ubuntu 20.04 with both having the following output:

true
true
false
true
true
false
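Since the question mentions a longer list of names, the character-class trick could be combined with a helper that builds the alternation. This is just a sketch under my own assumptions: `rootOrAnyOf` and `SEP` are hypothetical names, and the helper only covers the shape from the question (the root itself, or one named entry directly under it).

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class EitherSeparatorDemo {

    // Regex character class that accepts '/' as well as '\'.
    static final String SEP = "[/\\\\]";

    // Hypothetical helper: matches the root, or one of the given names
    // directly below the root, with either separator.
    static PathMatcher rootOrAnyOf(String... names) {
        String alternatives = Arrays.stream(names)
                .map(Pattern::quote)              // treat each name literally
                .collect(Collectors.joining("|"));
        String regex = SEP + "(?:" + alternatives + ")?";
        return FileSystems.getDefault().getPathMatcher("regex:" + regex);
    }

    public static void main(String[] args) {
        PathMatcher regex = rootOrAnyOf("test", "docs");
        System.out.println(regex.matches(Paths.get("/")));
        System.out.println(regex.matches(Paths.get("/test")));
        System.out.println(regex.matches(Paths.get("/test2")));
    }
}
```

Keeping the separator in one constant means the list of names can grow without repeating `[/\\\\]` in every alternative.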

Edit: a good site to test regex patterns in Java is https://www.regexplanet.com/advanced/java/index.html

Upvotes: 2
