Frank Lee

Reputation: 53

Java arrays - split() with user-specified delimiters

I am trying to write a simple program that takes two user inputs: a String to be split, and a String that specifies one or more delimiters. The program should print an array of strings containing both the substrings and the delimiters that separated them. I must implement the method public static String[] split(String s, String regex).

If the String to be split is

cd#34#abef#1256

My current code correctly outputs

[cd, 34, abef, 1256]

What I need it to output is

[cd, #, 34, #, abef, #, 1256]

And what if the String to be split contains two user-specified delimiters?

cd?34?abef#1256

How can I split that so it looks like

[cd, ?, 34, ?, abef, #, 1256]

None of the previous questions I've looked into used user-specified Strings and delimiters.

Here's my current code:

import java.util.Arrays;
import java.util.Scanner;

public class StringSplit
{
    public static void main(String[] args)
    { 
        Scanner scan = new Scanner(System.in);
        System.out.print("Enter a string: ");
        String str = scan.next();
        System.out.print("Specify delimiter(s): ");
        String del = scan.next();
        String[] result = split(str, del);
        System.out.print(Arrays.toString(result));
    }

    public static String[] split(String s, String regex)
    {
        String[] myString = s.split(regex);
        return myString;
    }
}

Upvotes: 0

Views: 1800

Answers (7)

LoganMzz

Reputation: 1623

You can build a regex from your delimiters and use an appendReplacement/appendTail hack to capture the non-matched characters. Here is the code with explanation:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitWithDelimiter {

  // Compile the pattern once at construction; Pattern is immutable, so the instance is thread-safe
  private final Pattern pattern;
  public SplitWithDelimiter(String regex) {
    pattern = Pattern.compile(regex);
  }

  public List<String> split(String string) {
    List<String> substrings = new ArrayList<>(); // Value to return

    Matcher m = pattern.matcher(string);         // Matcher to find delimiters
    StringBuffer buffer = new StringBuffer();    // Buffer to reuse (see hack below)

    while (m.find()) {                           // Find next delimiter

      m.appendReplacement(buffer, "");           // Hack: append the non-matched characters to the empty buffer
      substrings.add(buffer.toString());         // Add the buffer content
      buffer.delete(0, buffer.length());         // Reset the buffer (but keep the allocated char array)

      substrings.add(m.group());                 // Add the matched delimiter
    }

    m.appendTail(buffer);                        // Hack: append the remaining characters to the empty buffer
    substrings.add(buffer.toString());           // Add the buffer content

    return substrings;
  }

  public static void main(String[] args) {

    String input = "?cd?34?abef#1256";  // User input
    String chars = "#?";

    String regex = "[" + Pattern.quote(chars) + "]";  // Builds a regular expression from char list
    List<String> splits = new SplitWithDelimiter(regex).split(input); // Do the split
    System.out.println(splits);
  }
}

Note: I assume that each delimiter character is matched independently, so a run of delimiters yields one entry per character. If that is not what you want, just adapt the simplistic regex generation from the user input. I also assume that you want to capture empty sequences of non-matched characters. If that is not required, it is easy to filter them out when the buffer is empty.
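As a rough sketch of that last remark, the variant below skips the buffer whenever it is empty, so a leading delimiter or two adjacent delimiters no longer produce empty strings. The class name SplitSkippingEmpty and the two-String signature are hypothetical, chosen only for illustration; it also assumes the delimiter string is non-empty.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant of the class above: identical hack, but empty segments are dropped.
class SplitSkippingEmpty {

    // Assumes delimiterChars is non-empty; each character is treated as its own delimiter.
    static List<String> split(String input, String delimiterChars) {
        Pattern pattern = Pattern.compile("[" + Pattern.quote(delimiterChars) + "]");
        List<String> parts = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        StringBuffer buffer = new StringBuffer();

        while (m.find()) {
            m.appendReplacement(buffer, "");   // Text between the previous match and this one
            if (buffer.length() > 0) {         // Filter: ignore empty segments
                parts.add(buffer.toString());
            }
            buffer.setLength(0);               // Reset the buffer for the next round
            parts.add(m.group());              // The delimiter itself
        }

        m.appendTail(buffer);                  // Text after the last delimiter
        if (buffer.length() > 0) {
            parts.add(buffer.toString());
        }
        return parts;
    }

    public static void main(String[] args) {
        // Prints [?, cd, ?, 34, ?, abef, #, 1256]
        System.out.println(split("?cd?34?abef#1256", "#?"));
    }
}
```

With the unfiltered version, the same input would start with an empty string because of the leading ?.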

Upvotes: 0

anubhava

Reputation: 785128

You can use this lookahead and lookbehind based regex for splitting:

(?<=#)|(?=#)

This splits at every position where the next character is # or the previous character is #.

For multiple delimiters:

(?<=[?#])|(?=[?#])


Your Java method can be this:

public static String[] split(String s, String d) {
    String del = Pattern.quote(d);
    String[] myString = s.split("(?<=[" + del + "])|(?=[" + del + "])");
    return myString;
}

And call it as:

System.out.println(
   Arrays.toString(split("aa{bb}(cc)[dd]ee#ff...gg?hh*+ii", "#.?*+-[](){}"))
);

Output:

[aa, {, bb, }, (, cc, ), [, dd, ], ee, #, ff, ., ., ., gg, ?, hh, *, +, ii]
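As a quick check, here is a minimal self-contained sketch that applies the same lookaround split to the two inputs from the question. The class name LookaroundSplitDemo and the local variable names are only illustrative, and it assumes the delimiter string is non-empty.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Illustrative wrapper around the lookaround-based split shown above.
class LookaroundSplitDemo {

    // Splits at every position that directly precedes or follows a delimiter character.
    static String[] split(String s, String delimiters) {
        // Pattern.quote keeps characters such as ? or [ literal inside the character class
        String charClass = "[" + Pattern.quote(delimiters) + "]";
        return s.split("(?<=" + charClass + ")|(?=" + charClass + ")");
    }

    public static void main(String[] args) {
        // Prints [cd, #, 34, #, abef, #, 1256]
        System.out.println(Arrays.toString(split("cd#34#abef#1256", "#")));
        // Prints [cd, ?, 34, ?, abef, #, 1256]
        System.out.println(Arrays.toString(split("cd?34?abef#1256", "#?")));
    }
}
```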

Upvotes: 4

Andreas

Reputation: 159086

split() by definition excludes the delimiters, so you can't use it unless you use zero-width look-ahead/-behind groups, and even then you may have trouble with special characters.

Do it yourself:

public static List<String> split(String text, String delimiters) {
    List<String> result = new ArrayList<>();
    int start = 0;
    for (int i = 0; i < text.length(); i++)
        if (delimiters.indexOf(text.charAt(i)) != -1) {
            if (start < i)
                result.add(text.substring(start, i));
            result.add(text.substring(i, i + 1));
            start = i + 1;
        }
    if (start < text.length())
        result.add(text.substring(start));
    return result;
}

If you need the return value to be a String[], change the return type accordingly and replace the return statement with:

    return result.toArray(new String[result.size()]);

Test

System.out.println(split("cd#34#abef#1256", "#"));
System.out.println(split("cd?34?abef#1256", "#?"));
System.out.println(split("aa{bb}(cc)[dd]ee#ff...gg?hh*+ii", "#.?*+[](){}"));

Output

[cd, #, 34, #, abef, #, 1256]
[cd, ?, 34, ?, abef, #, 1256]
[aa, {, bb, }, (, cc, ), [, dd, ], ee, #, ff, ., ., ., gg, ?, hh, *, +, ii]

Note: The third test case will likely fail on any regex-based implementation that does not escape the delimiter characters.

Upvotes: 3

JAVAC

Reputation: 1270

In your case the regexp must look like this: [?#]

This is what your split method would look like:

public static String[] split(String s, String regex)
    {
        String[] myString = s.split("["+regex+"]");
        return myString;
    }

Upvotes: -1

chenchuk

Reputation: 5742

A simple solution using a char[] and comparing each char:

public static void main(String[] args)
{ 
    // example string
    String str = "vv*aabb?eegg?fff";
    char[] chars=str.toCharArray();

    // list of delimiters
    List<Character> delimiters = new ArrayList<Character>();
    delimiters.add('*');
    delimiters.add('?');
    StringBuilder sb=new StringBuilder();

    for(int i=0 ; i<chars.length;i++){
        if (delimiters.contains(chars[i])){
            // if it's a delimiter - surround it with commas
            sb.append(", " + chars[i] + ", ");
        } else {
            // if not - add the char only
            sb.append(chars[i]);
        }
    }
    System.out.println(sb.toString());
}

Upvotes: 0

Eveis

Reputation: 202

This works for one delimiter; you can extend it to handle a second delimiter.

import java.util.Arrays;
import java.util.Scanner;

public class StringSplit
{
    public static void main(String[] args)
    { 
        Scanner scan = new Scanner(System.in);
        System.out.print("Enter a string: ");
        String str = scan.next();
        System.out.print("Specify delimiter(s): ");
        String del = scan.next();
        String[] result = split(str, del);
        System.out.print(Arrays.toString(result));
    }

    public static String[] split(String s, String regex)
    {
        String[] myString = s.split(regex);
        int templength = myString.length;
        String[] temp = new String[(2*templength)];
        int y=0;
        for (int i=0;i<templength ;i++) {

            temp[y] = myString[i];

            temp[++y] = regex;
            y++;

        }
        String[] temp2 = Arrays.copyOf(temp, temp.length - 1);
        return temp2;
    }
}

Upvotes: -1

sh0rug0ru

Reputation: 1626

You can use a regex directly and a loop, like this:

List<String> parts = new ArrayList<>();
Pattern p = Pattern.compile("(#|\\?|[^#\\?]+)");
Matcher m = p.matcher(s);
while(m.find()) {
  parts.add(m.group(1));
}

Note that the regexp is just a string. If you want to use a custom delimiter, you can dynamically create the pattern.
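For example, a minimal sketch of building that pattern from a user-supplied delimiter string might look like the following. The class name DynamicDelimiterPattern and the helper's signature are hypothetical; it assumes the delimiter string is non-empty and that each of its characters is a separate delimiter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating a dynamically created pattern.
class DynamicDelimiterPattern {

    // Assumes delimiters is non-empty; an empty string would yield an invalid character class.
    static List<String> split(String s, String delimiters) {
        // Quote the user input so characters like ? or [ are taken literally
        String quoted = Pattern.quote(delimiters);
        // Either one delimiter character, or a run of non-delimiter characters
        Pattern p = Pattern.compile("([" + quoted + "]|[^" + quoted + "]+)");

        List<String> parts = new ArrayList<>();
        Matcher m = p.matcher(s);
        while (m.find()) {
            parts.add(m.group(1));
        }
        return parts;
    }

    public static void main(String[] args) {
        // Prints [cd, ?, 34, ?, abef, #, 1256]
        System.out.println(split("cd?34?abef#1256", "#?"));
    }
}
```

If you need a String[] instead, you could convert the list with parts.toArray(new String[0]).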

Upvotes: 0
