Reputation: 53
I am trying to write a simple program that takes two user inputs: a String to be split, and a String that specifies one or more delimiters. The program should print an array of strings consisting of the split substrings AND the delimiters. I must implement the method public static String[] split(String s, String regex).
If the String to be split is
cd#34#abef#1256
My current code correctly outputs
[cd, 34, abef, 1256]
What I need outputted is
[cd, #, 34, #, abef, #, 1256]
And what if the String to be split has two user-specified delimiters
cd?34?abef#1256
How can I split that so it looks like
[cd, ?, 34, ?, abef, #, 1256]
None of the previous questions I've looked into used user-specified Strings and delimiters.
Here's my current code:
import java.util.Arrays;
import java.util.Scanner;
public class StringSplit
{
public static void main(String[] args)
{
Scanner scan = new Scanner(System.in);
System.out.print("Enter a string: ");
String str = scan.next();
System.out.print("Specify delimiter(s): ");
String del = scan.next();
String[] result = split(str, del);
System.out.print(Arrays.toString(result));
}
public static String[] split(String s, String regex)
{
String[] myString = s.split(regex);
return myString;
}
}
Upvotes: 0
Views: 1800
Reputation: 1623
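You can build a regex from your delimiters and use appendReplacement/appendTail as a hack to capture the non-matched characters. Here is the code with explanations:
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class SplitWithDelimiter {
// Compile the pattern once at construction; the instance is then thread-safe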
private final Pattern pattern;
public SplitWithDelimiter(String regex) {
pattern = Pattern.compile(regex);
}
public List<String> split(String string) {
List<String> substrings = new ArrayList<>(); // Value to return
Matcher m = pattern.matcher(string); // Matcher to find delimiters
StringBuffer buffer = new StringBuffer(); // Buffer to reuse (see hack below)
while (m.find()) { // Find next delimiter
m.appendReplacement(buffer, ""); // Hack: append the non-matched characters to the empty buffer
substrings.add(buffer.toString()); // Add buffer content
buffer.delete(0, buffer.length()); // Reset buffer (but keep the allocated char array)
substrings.add(m.group()); // Add matched delimiter
}
m.appendTail(buffer); // Hack: append the remaining characters to the empty buffer
substrings.add(buffer.toString()); // Add buffer content
return substrings;
}
public static void main(String[] args) {
String input = "?cd?34?abef#1256"; // User input
String chars = "#?";
String regex = "[" + Pattern.quote(chars) + "]"; // Builds a regular expression from char list
List<String> splits = new SplitWithDelimiter(regex).split(input); // Do the split
System.out.println(splits);
}
}
Note: I assume that consecutive delimiter characters are treated independently. If not, just adapt the simplistic regex generation from user input. I also assume that you want to capture empty sequences of non-matched characters. If that is not required, it's easy to filter them out when the buffer is empty.
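If the empty sequences are not wanted, the filtering mentioned in the note could look roughly like this. It is only a sketch: it tracks match positions with m.start()/m.end() instead of the appendReplacement hack, and the class name NonEmptySplit is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the answer above.
class NonEmptySplit {
    // Keeps every matched delimiter and skips empty runs of non-delimiter characters.
    static List<String> split(Pattern pattern, String input) {
        List<String> parts = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        int last = 0; // index just past the previous delimiter
        while (m.find()) {
            if (m.start() > last) {
                parts.add(input.substring(last, m.start())); // non-empty text before the delimiter
            }
            parts.add(m.group()); // the delimiter itself
            last = m.end();
        }
        if (last < input.length()) {
            parts.add(input.substring(last)); // non-empty tail
        }
        return parts;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("[" + Pattern.quote("#?") + "]");
        System.out.println(split(p, "?cd?34?abef#1256"));
        // prints [?, cd, ?, 34, ?, abef, #, 1256]
    }
}
```

Compared with the original main, the leading empty string before the first ? is no longer part of the result.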
Upvotes: 0
Reputation: 785128
You can use this lookahead and lookbehind based regex for splitting:
(?<=#)|(?=#)
This means: split at every position where the next char is # or the previous char is #.
For multiple delimiters:
(?<=[?#])|(?=[?#])
Your Java method can be this:
public static String[] split(String s, String d) {
String del = Pattern.quote(d);
String[] myString = s.split("(?<=[" + del + "])|(?=[" + del + "])");
return myString;
}
And call it as:
System.out.println(
Arrays.toString(split("aa{bb}(cc)[dd]ee#ff...gg?hh*+ii", "#.?*+-[](){}"))
);
Output:
[aa, {, bb, }, (, cc, ), [, dd, ], ee, #, ff, ., ., ., gg, ?, hh, *, +, ii]
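For reference, here is a small self-contained sketch that runs the same method against the two inputs from the question. The class name LookaroundSplitDemo is just a placeholder, and the expected output assumes Java 8 or later, where split() does not produce a leading empty string for a zero-width match at the start.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical wrapper class for trying the lookaround split.
class LookaroundSplitDemo {
    static String[] split(String s, String d) {
        String del = Pattern.quote(d); // treat every delimiter character literally
        // Split before and after each delimiter character, consuming nothing.
        return s.split("(?<=[" + del + "])|(?=[" + del + "])");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("cd#34#abef#1256", "#")));
        // prints [cd, #, 34, #, abef, #, 1256]
        System.out.println(Arrays.toString(split("cd?34?abef#1256", "#?")));
        // prints [cd, ?, 34, ?, abef, #, 1256]
    }
}
```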
Upvotes: 4
Reputation: 159086
split() by definition excludes the delimiters, so you can't use it unless you use zero-width look-ahead/-behind groups, and even then you may have trouble with special characters.
Do it yourself:
public static List<String> split(String text, String delimiters) {
List<String> result = new ArrayList<>();
int start = 0;
for (int i = 0; i < text.length(); i++)
if (delimiters.indexOf(text.charAt(i)) != -1) {
if (start < i)
result.add(text.substring(start, i));
result.add(text.substring(i, i + 1));
start = i + 1;
}
if (start < text.length())
result.add(text.substring(start));
return result;
}
If you need the return value to be String[], change the return statement:
return result.toArray(new String[result.size()]);
Test
System.out.println(split("cd#34#abef#1256", "#"));
System.out.println(split("cd?34?abef#1256", "#?"));
System.out.println(split("aa{bb}(cc)[dd]ee#ff...gg?hh*+ii", "#.?*+[](){}"));
Output
[cd, #, 34, #, abef, #, 1256]
[cd, ?, 34, ?, abef, #, 1256]
[aa, {, bb, }, (, cc, ), [, dd, ], ee, #, ff, ., ., ., gg, ?, hh, *, +, ii]
Note: The third test case will likely fail on any regex-based implementation that does not escape the delimiters (for example with Pattern.quote).
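To illustrate the trouble with special characters, here is a short sketch of what happens when a delimiter is handed to split() unescaped, as in the question's original code. The class name UnescapedDelimiterDemo is made up for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class showing why delimiters need escaping.
class UnescapedDelimiterDemo {
    public static void main(String[] args) {
        // "." is a regex wildcard, so every character is treated as a delimiter
        // and split() discards all the resulting empty strings.
        System.out.println(Arrays.toString("ff...gg".split(".")));
        // prints []

        // "+" on its own is not a valid regex at all.
        try {
            "hh*+ii".split("+");
        } catch (PatternSyntaxException e) {
            System.out.println("invalid regex: " + e.getDescription());
        }

        // Quoting the delimiter makes split() treat it literally.
        System.out.println(Arrays.toString("ff...gg".split(Pattern.quote("."))));
        // prints [ff, , , gg]
    }
}
```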
Upvotes: 3
Reputation: 1270
In your case the regexp must look like this: [?#]
This is how your split method would look:
public static String[] split(String s, String regex)
{
String[] myString = s.split("["+regex+"]");
return myString;
}
Upvotes: -1
Reputation: 5742
Simple solution using char[] and comparing each char:
public static void main(String[] args)
{
// example string
String str = "vv*aabb?eegg?fff";
char[] chars=str.toCharArray();
// list of delimiters
List<Character> delimiters = new ArrayList<Character>();
delimiters.add('*');
delimiters.add('?');
StringBuilder sb=new StringBuilder();
for(int i=0 ; i<chars.length;i++){
if (delimiters.contains(chars[i])){
// if it's a delimiter - add commas
sb.append(", " + chars[i] + ", ");
} else {
// if not - add the char only
sb.append(chars[i]);
}
}
System.out.println(sb.toString());
}
Upvotes: 0
Reputation: 202
This is for one delimiter; you can extend it for a second delimiter.
import java.util.Arrays;
import java.util.Scanner;
public class StringSplit
{
public static void main(String[] args)
{
Scanner scan = new Scanner(System.in);
System.out.print("Enter a string: ");
String str = scan.next();
System.out.print("Specify delimiter(s): ");
String del = scan.next();
String[] result = split(str, del);
System.out.print(Arrays.toString(result));
}
public static String[] split(String s, String regex)
{
String[] myString = s.split(regex);
int templength = myString.length;
String[] temp = new String[(2*templength)];
int y=0;
for (int i=0;i<templength ;i++) {
temp[y] = myString[i];
temp[++y] = regex;
y++;
}
String temp2[]= Arrays.copyOf(temp, temp.length-1);
return temp2;
}
}
Upvotes: -1
Reputation: 1626
You can use a regex directly and a loop, like this:
List<String> parts = new ArrayList<>();
Pattern p = Pattern.compile("(#|\\?|[^#\\?]+)");
Matcher m = p.matcher(s);
while(m.find()) {
parts.add(m.group(1));
}
Note that the regexp is just a string. If you want to use a custom delimiter, you can dynamically create the pattern.
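Creating the pattern dynamically might look like the sketch below. It assumes the user supplies a non-empty list of single-character delimiters; the names DynamicTokenPattern, build and tokens are invented for illustration, and Pattern.quote is used so that special characters are taken literally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that builds the token pattern from user input.
class DynamicTokenPattern {
    // Matches either one delimiter character or a run of non-delimiter characters.
    // Assumes delimiters is non-empty.
    static Pattern build(String delimiters) {
        String q = Pattern.quote(delimiters);
        return Pattern.compile("[" + q + "]|[^" + q + "]+");
    }

    static List<String> tokens(String s, String delimiters) {
        List<String> parts = new ArrayList<>();
        Matcher m = build(delimiters).matcher(s);
        while (m.find()) {
            parts.add(m.group()); // each match is one token: a delimiter or a text run
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(tokens("cd?34?abef#1256", "#?"));
        // prints [cd, ?, 34, ?, abef, #, 1256]
    }
}
```

If a String[] is required, the list can be converted with parts.toArray(new String[0]).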
Upvotes: 0