user7793464

Reputation: 31

Split string at every Nth occurrence of character

I need to split a string like the following after every fourth pipe -

Input -

userid456|userid457|userid458|userid459|userid460|userid461|userid462|userid463|userid464|userid465|userid466|userid467|userid468|userid469|userid470|userid471|userid472|userid473|userid474|userid475|userid476|userid477|userid478|userid479|userid480|userid481|userid482|userid483

Output -

userid456|userid457|userid458|userid459

userid460|userid461|userid462|userid463

userid464|userid465|userid466|userid467

userid468|userid469|userid470|userid471

etc.

I was trying to figure out whether there is an easy way to do this using a regexp and String.split().

Upvotes: 3

Views: 2891

Answers (3)

Hatem

Reputation: 31

The closest I could come up with is this:

String[] parts = input.split("(?<=\\G(\\w{1,11}\\|){4}+)");

This has two issues:

  1. You must specify a maximum number of characters between the | chars (11 here, from \w{1,11}), because the lookbehind needs a bounded length
  2. It includes the | at the end of each part except the last

The output looks like:

userid456|userid457|userid458|userid459|
userid460|userid461|userid462|userid463|
userid464|userid465|userid466|userid467|
userid468|userid469|userid470|userid471|
userid472|userid473|userid474|userid475|
userid476|userid477|userid478|userid479|
userid480|userid481|userid482|userid483
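
The second issue can be worked around after the split. A minimal sketch, assuming the split pattern above behaves as shown in the output; the class and method names are only illustrative:

```java
import java.util.Arrays;

class TrimTrailingPipe {
    // Splits with the lookbehind pattern from this answer, then drops the
    // trailing '|' that the split leaves on every part except the last.
    static String[] splitEveryFourth(String input) {
        String[] parts = input.split("(?<=\\G(\\w{1,11}\\|){4}+)");
        for (int i = 0; i < parts.length; i++) {
            if (parts[i].endsWith("|")) {
                parts[i] = parts[i].substring(0, parts[i].length() - 1);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        String input = "a|b|c|d|e|f|g|h|i";
        System.out.println(Arrays.toString(splitEveryFourth(input)));
    }
}
```

The first issue remains: fields longer than 11 characters, or containing anything other than word characters, would not be split correctly by this pattern.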

Upvotes: 0

anubhava

Reputation: 784988

Instead of splitting, you can match each group of four fields with this regex:

(?:[^|]+\|){3}[^|]+


Java Code:

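import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Use + rather than *: with *, the second match would start on the pipe
// left over after the first group and the groups would come out shifted.
final String regex = "(?:[^|]+\\|){3}[^|]+";

final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string); // string holds the pipe-delimited input

while (matcher.find()) {
    System.out.println("Match: " + matcher.group(0));
}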
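
Note that a pattern requiring exactly three pipes only matches complete groups of four, so a shorter group at the end of the input would be skipped. Here is a minimal sketch of a variation that keeps the tail, assuming fields are never empty. The pattern `[^|]+(?:\|[^|]+){0,3}` and the class and method names are illustrative and not part of the answer above:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PipeChunks {
    // One non-empty field followed by up to three more "|field" pairs,
    // so a final group with fewer than four fields still matches.
    private static final Pattern CHUNK =
            Pattern.compile("[^|]+(?:\\|[^|]+){0,3}");

    static List<String> chunk(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = CHUNK.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(chunk("a|b|c|d|e|f")); // prints [a|b|c|d, e|f]
    }
}
```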

Upvotes: 1

user557597

Reputation:

You would be better off matching instead of splitting.
It's the same effect, but with more control.

((?:[^|]*?\|){3}[^|]*?)\||([\S\s]+)

Explained

     (                             # (1 start)
          (?:                           # Group - 3 pipes
               [^|]*?                        # Optional non-pipe chars
               \|                            # Pipe
          ){3}                          # End group 
          [^|]*?                        # Optional non-pipe chars
     )                             # (1 end)
     \|                            # 4th pipe, consume but do not capture

  |  
     ( [\S\s]+ )                   # (2), Final text less than 4 pipes
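
In Java this could be used roughly as follows. This is a sketch rather than a definitive implementation; the class and method names are made up for illustration. Each match sets either group 1 (a full group of four, with the fourth pipe consumed but not captured) or group 2 (whatever is left at the end):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupOrRemainder {
    // The pattern from this answer, escaped for a Java string literal.
    private static final Pattern CHUNK =
            Pattern.compile("((?:[^|]*?\\|){3}[^|]*?)\\||([\\S\\s]+)");

    static List<String> chunks(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = CHUNK.matcher(input);
        while (m.find()) {
            // Group 1 is non-null for a full group of four fields;
            // otherwise group 2 holds the final, shorter remainder.
            result.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(chunks("a|b|c|d|e|f|g|h|i|j")); // prints [a|b|c|d, e|f|g|h, i|j]
    }
}
```

Because the first alternative requires a fourth pipe, the last group in the input always arrives through group 2, even when it has exactly four fields.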

Upvotes: 0
