zain

Reputation: 314

Regex for matching any character within separators in a string

I have strings of the format

/person/PATH_VARIABLE/address/PATH_VARIABLE
/person/PATH_VARIABLE/PATH_VARIABLE
/person/PATH_VARIABLE/address
.....
etc

I need to replace each PATH_VARIABLE with a regular expression that matches anything between the / separators (or nothing at the end), so that matching the resulting regex string against my input String gives a complete match. For example:

/person/abc/address/xy123 matches with the first
/person/abc/1233 matches with the second

I have tried a few things

public static void main(String[] args) {

        // Sample Strings to be substituted
        String y = "/person/PATH_VARIABLE/address/PATH_VARIABLE";
        String y1 = "/person/PATH_VARIABLE/PATH_VARIABLE";

        // Tried this 
        //y = y.replaceAll("PATH_VARIABLE", "\\((.*?)\\)");
        //y1 = y1.replaceAll("PATH_VARIABLE", "\\((.*?)\\)");

        // Tried this one
        y = y.replaceAll("PATH_VARIABLE", "(?<=/)(.*?)(?=/?)");
        y1 = y1.replaceAll("PATH_VARIABLE", "(?<=/)(.*?)(?=/?)");

        // Sample input strings to match 
        String x = "/person/user.zian/address/123";
        String x1 = "/person/nhbb/bhbhb/ghyu";
        String x2 = "/person/nhbb/bhbhb";


        System.out.println(x.matches(y)); // returns true
        System.out.println(x1.matches(y)); // returns false
        System.out.println(x1.matches(y1)); // returns true but should return false
        System.out.println(x2.matches(y1)); // returns true


    }

Upvotes: 1

Views: 597

Answers (1)

Sweeper

Reputation: 271040

You are overcomplicating the regex to replace "PATH_VARIABLE" with. It can be as simple as [^/]*, which matches any run of zero or more characters that are not a /.

y = y.replaceAll("PATH_VARIABLE", "[^/]*");
y1 = y1.replaceAll("PATH_VARIABLE", "[^/]*");
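As a quick check, here is a minimal sketch that runs the sample inputs from the question against the substituted patterns. The class name PathMatchDemo and the helper toRegex are made up for illustration; only replaceAll and matches come from the code above.

```java
class PathMatchDemo {
    // Hypothetical helper: turn a template into a regex by replacing
    // every PATH_VARIABLE with "any run of non-slash characters".
    static String toRegex(String template) {
        return template.replaceAll("PATH_VARIABLE", "[^/]*");
    }

    public static void main(String[] args) {
        String y = toRegex("/person/PATH_VARIABLE/address/PATH_VARIABLE");
        String y1 = toRegex("/person/PATH_VARIABLE/PATH_VARIABLE");

        String x = "/person/user.zian/address/123";
        String x1 = "/person/nhbb/bhbhb/ghyu";
        String x2 = "/person/nhbb/bhbhb";

        System.out.println(x.matches(y));   // true
        System.out.println(x1.matches(y));  // false
        System.out.println(x1.matches(y1)); // false: [^/]* cannot cross the extra /
        System.out.println(x2.matches(y1)); // true
    }
}
```

Because [^/]* also matches the empty string, an input such as /person//address/ would match the first template as well. If empty segments should be rejected, [^/]+ is one possible alternative, though that goes beyond what the question asked for.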

However, this will only work if the rest of your path does not contain characters that have special meanings in regex. In the specific case you showed, it doesn't have such characters.

If your path does contain characters like that, then you need to wrap everything other than PATH_VARIABLE in \Q and \E, so that they are treated literally.

For example, /person+hello/PATH_VARIABLE/address/PATH_VARIABLE would need to become this first:

\Q/person+hello/\EPATH_VARIABLE\Q/address/\EPATH_VARIABLE

and then you can replace the PATH_VARIABLEs.

You can add the \Q and \E by finding the start and end indices of all the PATH_VARIABLEs and inserting them in.
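One way to sketch that index-based approach is shown below. The class name PathTemplate and the method toRegex are hypothetical names for illustration. The sketch uses Pattern.quote to wrap each literal piece, which produces the \Q...\E form and also copes with a literal \E inside the piece; this is an assumption about how you would want to do the wrapping, not the only way.

```java
import java.util.regex.Pattern;

class PathTemplate {
    private static final String PLACEHOLDER = "PATH_VARIABLE";
    private static final String SEGMENT = "[^/]*";

    // Hypothetical helper: quote everything around each PATH_VARIABLE
    // and substitute the placeholder itself with [^/]*.
    static String toRegex(String template) {
        StringBuilder regex = new StringBuilder();
        int start = 0;
        int idx;
        while ((idx = template.indexOf(PLACEHOLDER, start)) >= 0) {
            if (idx > start) {
                // Literal text before this placeholder, wrapped in \Q...\E
                regex.append(Pattern.quote(template.substring(start, idx)));
            }
            regex.append(SEGMENT);
            start = idx + PLACEHOLDER.length();
        }
        if (start < template.length()) {
            // Literal text after the last placeholder, if any
            regex.append(Pattern.quote(template.substring(start)));
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        String regex = toRegex("/person+hello/PATH_VARIABLE/address/PATH_VARIABLE");
        System.out.println(regex); // \Q/person+hello/\E[^/]*\Q/address/\E[^/]*
        System.out.println("/person+hello/abc/address/xy123".matches(regex)); // true
        System.out.println("/personnhello/abc/address/xy123".matches(regex)); // false: + is literal now
    }
}
```

Without the quoting, the + in person+hello would be read as a quantifier, so /personnhello/abc/address/xy123 would match and the literal /person+hello/... path would not.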

Upvotes: 1
