Reputation: 2461
I have a list or an array of strings
String [] elements = {"cat", "dog", "fish"};
and a string
String str = "This is a caterpillar and that is a dogger.";
I want to remove every item of the array/list from the string wherever it occurs, so that the function returns the string
str = "This is a erpillar and that is a ger." (cat and dog removed from the string)
I can do something like this
private String removeElementsFromString (String str, String [] elements) {
if(Arrays.stream(elements).anyMatch(str::contains)){
for(String item : elements){
str = str.replace(item, "");
}
}
return str;
}
but is there a more elegant way to replace the for loop with something else?
Upvotes: 3
Views: 469
Reputation: 9566
Arrays.stream(elements).reduce(str, (r, w) -> r.replace(w, ""))
with the expected output.
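As a minimal sketch, the reduce expression above can be wrapped in a method. The class name ReduceRemoval and the method name removeAll are hypothetical and only serve the illustration.

```java
import java.util.Arrays;

class ReduceRemoval {
    // Folds the input through one replace() call per element, left to right.
    // The input string is the identity value; each step removes one word.
    static String removeAll(String str, String[] elements) {
        return Arrays.stream(elements).reduce(str, (r, w) -> r.replace(w, ""));
    }

    public static void main(String[] args) {
        String[] elements = {"cat", "dog", "fish"};
        // prints: This is a erpillar and that is a ger.
        System.out.println(removeAll("This is a caterpillar and that is a dogger.", elements));
    }
}
```

The accumulator here is not associative, so this form should only be used on a sequential stream.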
If you want to keep reducing the input string until nothing more can be removed, iterate until a pass makes no changes
String n = str, o = null;
do {
n = stream(elements).reduce(o = n, (r, w) -> r.replace(w, ""));
} while(!n.equals(o));
System.out.println(n);
then, with input string
This is a caterpillar and that is a docatg.
you'll get
This is a erpillar and that is a .
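The same loop, sketched as a reusable method with more descriptive variable names. FixpointRemoval and removeUntilStable are hypothetical names. The second call below uses an input where one pass is not enough, since "cat" only appears after "dog" has been removed.

```java
import java.util.Arrays;

class FixpointRemoval {
    // Repeats the removal pass until a pass leaves the string unchanged.
    // Each pass can only shorten the string, so the loop terminates.
    static String removeUntilStable(String str, String[] elements) {
        String current = str;
        String previous;
        do {
            previous = current;
            current = Arrays.stream(elements).reduce(previous, (r, w) -> r.replace(w, ""));
        } while (!current.equals(previous));
        return current;
    }

    public static void main(String[] args) {
        String[] elements = {"cat", "dog", "fish"};
        // prints: This is a erpillar and that is a .
        System.out.println(removeUntilStable("This is a caterpillar and that is a docatg.", elements));
        // First pass turns "cadogt" into "cat"; the second pass removes it.
        System.out.println("[" + removeUntilStable("cadogt", elements) + "]"); // prints: []
    }
}
```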
If you really want a fast algorithm, use Aho-Corasick, which runs in O(n)
StringBuilder sb = new StringBuilder();
int begining = -1;
for (Emit e : Trie.builder().addKeywords(elements).build().parseText(str)) {
sb.append(str, begining + 1, e.getStart());
begining = e.getEnd();
}
sb.append(str, begining + 1, str.length());
System.out.println(sb.toString());
As an aside, here is a performance comparison (against Oussama ZAGHDOUD's solution):
Equals = true // check all output are equals
Time1 = 18,548822 // Oussama ZAGHDOUD's solution O(n^2)
Time2 = 0,134459 // Aho-Corasick O(n) without precompute Trie
Time3 = 0,065056 // Aho-Corasick O(n) precomputed Trie
Full benchmark code:
static String alg1(String[] elements, String str) {
StringBuilder bf = new StringBuilder(str);
str =null;
Stream.of(elements).forEach(e -> {
int index = bf.indexOf(e);
while (index != -1) {
index = bf.indexOf(e);
if (index != -1) {
bf.delete(index, index + e.length());
}
}
});
return bf.toString();
}
static String alg2(String[] elements, String str) {
StringBuilder sb = new StringBuilder();
int begining = -1;
for (Emit e : Trie.builder().addKeywords(elements).build().parseText(str)) {
sb.append(str, begining + 1, e.getStart());
begining = e.getEnd();
}
sb.append(str, begining + 1, str.length());
return sb.toString();
}
static String alg3(Trie trie, String str) {
StringBuilder sb = new StringBuilder();
int begining = -1;
for (Emit e : trie.parseText(str)) {
sb.append(str, begining + 1, e.getStart());
begining = e.getEnd();
}
sb.append(str, begining + 1, str.length());
return sb.toString();
}
public static void main(String... args) {
final ThreadLocalRandom rnd = ThreadLocalRandom.current();
// test, use random numbers as words
String[] elements = range(0, 1_000).mapToObj(i -> "w" + rnd.nextInt()).toArray(String[]::new);
// intercalate random elements word with other random word
String str = range(0, 100_000)
.mapToObj(i -> "z" + rnd.nextInt() + " " + elements[rnd.nextInt(elements.length)])
.collect(joining(", "));
Trie trie = Trie.builder().addKeywords(elements).build();
long t0 = System.nanoTime();
String s1 = alg1(elements, str);
long t1 = System.nanoTime();
String s2 = alg2(elements, str);
long t2 = System.nanoTime();
String s3 = alg3(trie, str);
long t3 = System.nanoTime();
System.out.printf("Equals = %s%nTime1 = %f%nTime2 = %f%nTime3 = %f%n",
s1.equals(s2) && s2.equals(s3), (t1 - t0) * 1e-9, (t2 - t1) * 1e-9, (t3 - t2) * 1e-9);
}
Upvotes: 2
Reputation: 79015
The following one-liner does the job:
str = str.replaceAll(Arrays.stream(elements).map(s -> "(?:" + s + ")").collect(Collectors.joining("|")), "");
Demo:
import java.util.Arrays;
import java.util.stream.Collectors;
public class Main {
public static void main(String[] args) {
String[] elements = { "cat", "dog", "fish" };
String str = "This is a caterpillar and that is a dogger.";
str = str.replaceAll(Arrays.stream(elements).map(s -> "(?:" + s + ")").collect(Collectors.joining("|")), "");
System.out.println(str);
}
}
Output:
This is a erpillar and that is a ger.
Explanation:
Arrays.stream(elements).map(s -> "(?:" + s + ")").collect(Collectors.joining("|"))
produces the regex (?:cat)|(?:dog)|(?:fish), which means cat or dog or fish.
The next step is to replace every match of this regex with "".
Upvotes: 6
Reputation: 2159
I think that using StringBuilder instead of String is more appropriate here:
import java.io.IOException;
import java.util.stream.Stream;
public class Bounder {
public static void main(String[] args) throws IOException {
String[] elements = { "cat", "dog", "fish" };
String str = "This is a catcatcatcatcatcatcaterpillar ancatcatcatcatd thcatcatcatat is a dogdogdogdogdogdogger.";
// Use StringBuilder here instead of String
StringBuilder bf = new StringBuilder(str);
str =null;
System.out.println("Original String = " + bf.toString());
Stream.of(elements).forEach(e -> {
// Delete every occurrence of e, searching again after each deletion
int index;
while ((index = bf.indexOf(e)) != -1) {
bf.delete(index, index + e.length());
}
});
System.out.println("Result = " + bf.toString());
}
}
Output:
Original String = This is a catcatcatcatcatcatcaterpillar ancatcatcatcatd thcatcatcatat is a dogdogdogdogdogdogger.
Result = This is a erpillar and that is a ger.
Upvotes: 3
Reputation: 40034
You can do it like this. Just use a simple loop.
for (String word : elements) {
str = str.replace(word,"");
}
Upvotes: 0
Reputation: 59950
I would simply use:
private String removeElementsFromString(String str, String[] elements) {
for (String item : elements) {
str = str.replace(item, "");
}
return str;
}
I don't see any advantage of the first condition:
if(Arrays.stream(elements).anyMatch(str::contains)) {
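A small sketch of why the guard is redundant: String.replace already returns the original text when the target does not occur, so both versions give the same result. The class and method names (GuardCheck, withGuard, withoutGuard) are hypothetical.

```java
import java.util.Arrays;

class GuardCheck {
    // Version from the question: checks for any match before looping.
    static String withGuard(String str, String[] elements) {
        String result = str;
        if (Arrays.stream(elements).anyMatch(str::contains)) {
            for (String item : elements) {
                result = result.replace(item, "");
            }
        }
        return result;
    }

    // Version without the guard: replace() is a no-op when nothing matches.
    static String withoutGuard(String str, String[] elements) {
        String result = str;
        for (String item : elements) {
            result = result.replace(item, "");
        }
        return result;
    }

    public static void main(String[] args) {
        String[] elements = {"cat", "dog", "fish"};
        String matching = "This is a caterpillar and that is a dogger.";
        String nonMatching = "Nothing to remove here.";
        System.out.println(withGuard(matching, elements).equals(withoutGuard(matching, elements)));       // prints: true
        System.out.println(withGuard(nonMatching, elements).equals(withoutGuard(nonMatching, elements))); // prints: true
    }
}
```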
Upvotes: 1
Reputation: 140309
The most concise way would be to use replaceAll, which accepts a regular expression as the first parameter:
String newStr = str.replaceAll(String.join("|", elements), "");
This only works because the strings in elements contain no special regex characters. If any of them did (or there was a chance they did), you'd have to quote them:
String pattern = Arrays.stream(elements).map(Pattern::quote).collect(Collectors.joining("|"));
Note, however, that this would operate in a single pass. So if you had a string like:
docatg
this approach would result in dog, whereas an approach which does input.replace("cat", "").replace("dog", "") would remove the dog as well.
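To make the difference concrete, here is a minimal sketch comparing the quoted single-pass regex with sequential replace calls. The names SinglePassVsSequential, singlePass and sequential are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class SinglePassVsSequential {
    // One regex pass over the input; Pattern.quote makes each element a literal.
    static String singlePass(String str, String[] elements) {
        String pattern = Arrays.stream(elements)
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return str.replaceAll(pattern, "");
    }

    // One replace() per element; later elements see the output of earlier ones.
    static String sequential(String str, String[] elements) {
        String result = str;
        for (String item : elements) {
            result = result.replace(item, "");
        }
        return result;
    }

    public static void main(String[] args) {
        String[] elements = {"cat", "dog", "fish"};
        System.out.println(singlePass("docatg", elements));             // prints: dog
        System.out.println("[" + sequential("docatg", elements) + "]"); // prints: []
        // Quoting matters: an unquoted "a.c" would also match "abc".
        System.out.println("[" + singlePass("abc a.c", new String[]{"a.c"}) + "]"); // prints: [abc ]
    }
}
```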
Upvotes: 0