brain storm

Reputation: 31252

How does this regex work in Java?

I have the following piece of code that splits the string and returns an array of strings.

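import java.util.Arrays;

public class SplitEmpty {
    public static void main(String[] args) {
        String name = "what is going on";
        // Split on the empty pattern, which matches between every pair of characters.
        String[] ary = name.split("");
        System.out.println(Arrays.toString(ary));
    }
}
// output: [, w, h, a, t,  , i, s,  , g, o, i, n, g,  , o, n]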

To get rid of the leading empty string, the following regex was used in the split, but I would like to know how it works.

import java.util.Arrays;
public class SplitLookahead {
    public static void main(String[] args) {
        String name = "what is going on";
        String[] ary = name.split("(?!^)");
        System.out.println(Arrays.toString(ary));
    }
}
// output: [w, h, a, t,  , i, s,  , g, o, i, n, g,  , o, n]

If someone can explain what the regex looks for and how it is used by split, it would be very helpful for Java beginners. Thanks a lot.

Upvotes: 0

Views: 137

Answers (2)

Eric Jablow

Reputation: 7889

In your first example, the empty pattern matches before every character in the string. So it matches before the first character, before the second, and so on. The String.split(String) Javadoc says that trailing empty strings are not included, but the returned array does include whatever comes before the first match. So, the array is {"", "w", "h", ..., "n"}.
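The trailing-versus-leading rule is easier to see with a non-empty delimiter. The following is only a small sketch; the class and helper names (TrailingEmptyDemo, splitDefault, splitKeepAll) are made up for illustration. It relies on the documented behaviour that split(regex) drops trailing empty strings while split(regex, -1) keeps them.

```java
import java.util.Arrays;

public class TrailingEmptyDemo {
    // Hypothetical helper: default split, which discards trailing empty strings.
    static String[] splitDefault(String s, String regex) {
        return s.split(regex);
    }

    // Hypothetical helper: a negative limit keeps trailing empty strings.
    static String[] splitKeepAll(String s, String regex) {
        return s.split(regex, -1);
    }

    public static void main(String[] args) {
        // Trailing empties are dropped by default.
        System.out.println(Arrays.toString(splitDefault("a,b,,", ",")));  // [a, b]
        // With limit -1 they are kept.
        System.out.println(Arrays.toString(splitKeepAll("a,b,,", ",")));  // [a, b, , ]
        // A leading empty string before the first (non-empty) match is kept.
        System.out.println(Arrays.toString(splitDefault(",a,b", ",")));   // [, a, b]
    }
}
```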

The second example uses a regexp that matches at every position except the beginning of the string. The (?! and ) delimit a negative lookahead: (? opens the group, the ! makes it negative, and the ^ means the beginning of the string. A lookahead is zero-width, so the regexp consumes no characters. It therefore matches after the first character, after the second, and so on. (It also matches at the very end of the string, but the empty string that produces is a trailing one and is dropped.) Every character ends up in the output, so you have:

 w h a t   i s   g o i n g   o n
  ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^

Each caret marks a split point; it sits under the gap between two adjacent characters of the string.
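To check this, here is a minimal sketch that splits the string from the question on (?!^) and confirms that every character comes back as its own element. The class and method names (LookaheadSplitDemo, splitChars) are made up for illustration. Note that, as far as I know, on Java 8 and later a zero-width match at the start of the input no longer produces a leading empty string, so split("") gives the same result there; the lookahead version behaves the same on older and newer versions.

```java
import java.util.Arrays;

public class LookaheadSplitDemo {
    // Hypothetical helper: split at every position except the start of the string.
    static String[] splitChars(String s) {
        return s.split("(?!^)");
    }

    public static void main(String[] args) {
        String name = "what is going on";
        String[] parts = splitChars(name);

        // One element per character of the input, no leading empty string.
        System.out.println(parts.length == name.length()); // true
        System.out.println(Arrays.toString(parts));
    }
}
```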

Upvotes: 4

Csoki

Reputation: 139

split breaks the string into substrings wherever the regex matches, but the matched text itself is not put into the output. So:

with String s1 = "divided by spaces"; s1.split("\\s")[0] will be "divided" and s1.split("\\s")[1] will be "by", NOT " ". (In a Java string literal the backslash has to be doubled, so the regex \s is written "\\s".)
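A short runnable version of this example, as a sketch; the class and method names (WhitespaceSplitDemo, splitOnWhitespace) are made up for illustration:

```java
import java.util.Arrays;

public class WhitespaceSplitDemo {
    // Hypothetical helper: split on a single whitespace character.
    // The regex \s is written "\\s" inside a Java string literal.
    static String[] splitOnWhitespace(String s) {
        return s.split("\\s");
    }

    public static void main(String[] args) {
        String s1 = "divided by spaces";
        String[] words = splitOnWhitespace(s1);

        // The spaces matched by the regex are consumed and do not appear in the result.
        System.out.println(Arrays.toString(words)); // [divided, by, spaces]
    }
}
```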

Upvotes: 1
