user3430235

Reputation: 439

remove elements in array which are substring of other elements in Java

I have an ArrayList of strings. I want to remove the strings that are substrings of another string in that list. I have a Python implementation, but doing the same in Java is tricky. Python:

def filterSublist(lst):
    uniq = lst
    for elem in lst:
        uniq = [x for x in uniq if (x == elem) or (x not in elem)]
    return uniq

For Java, I need to check whether each element is contained in another element: if it is, do nothing; if not, add it to another list.

for(String element : list){
    for(int j = 0; j < list.size(); j++)
        if (! element.contains(list.get(j))){
            listUniq.add(date);}
}

The Java solution doesn't work as it should. One reason is that it also compares each element to itself. Any help is appreciated.

Upvotes: 1

Views: 2191

Answers (3)

condorcraft110 II

Reputation: 261

You could try brute-force comparing every string to every other (except itself):

List<String> toRemove = new ArrayList<>();
for(int i = 0; i < list.size(); i++)
{
    String element0 = list.get(i);
    for(int j = 0; j < list.size(); j++)
    {
        String element1 = list.get(j);
        // element1 is a proper substring of element0: mark it for removal (once)
        if(!element0.equals(element1) && element0.contains(element1) && !toRemove.contains(element1))
        {
            toRemove.add(element1);
        }
    }
}

// remove the marked strings only after both loops have finished
list.removeAll(toRemove);
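
For reference, the same brute-force idea can be wrapped in a self-contained method. This is only a sketch: the class name BruteForceFilter and the method name filter are made up for illustration, and it works on a copy so that the caller's list (which may be immutable) is left untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper class; the names are illustrative only.
class BruteForceFilter {

    // Returns a copy of the input without the strings that are contained in,
    // but not equal to, some other element of the list.
    static List<String> filter(List<String> list) {
        List<String> toRemove = new ArrayList<>();
        for (String outer : list) {
            for (String inner : list) {
                if (!outer.equals(inner) && outer.contains(inner) && !toRemove.contains(inner)) {
                    toRemove.add(inner);
                }
            }
        }
        List<String> result = new ArrayList<>(list);
        result.removeAll(toRemove);
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("hello", "there", "the", "low", "hell", "lower", "here");
        System.out.println(filter(input)); // [hello, there, lower]
    }
}
```

Because equal strings are skipped by the equals() check, exact duplicates of a surviving string are kept.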

Upvotes: 2

fps

Reputation: 34460

With Java 8, you could use lambdas together with the new forEach() and removeIf() collection methods in a straightforward manner:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Sample {

    public List<String> filterSublist(List<String> lst) {
        List<String> uniq = new ArrayList<String>(lst);
        lst.forEach(elem -> uniq.removeIf(x -> !x.equals(elem) && elem.contains(x)));
        return uniq;
    }

    public static void main(String[] args) {
        Sample sample = new Sample();

        List<String> filtered = sample.filterSublist(
            Arrays.asList("hello", "there", "the", 
                          "low", "hell", "lower", "here"));

        System.out.println(filtered); // [hello, there, lower]
    }
}

I've just negated the predicate in the removeIf() method, since I'm removing elements instead of adding them.
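
If you would rather not mutate a copy, the same predicate can also be expressed with the Stream API proper. The following is a minimal sketch, not part of the code above: the class name StreamFilter is hypothetical, and it keeps an element only when no other element contains it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
class StreamFilter {

    // Keeps x only if no different element of lst contains x as a substring.
    static List<String> filterSublist(List<String> lst) {
        return lst.stream()
                .filter(x -> lst.stream().noneMatch(elem -> !x.equals(elem) && elem.contains(x)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> filtered = filterSublist(
                Arrays.asList("hello", "there", "the", "low", "hell", "lower", "here"));
        System.out.println(filtered); // [hello, there, lower]
    }
}
```

Like the removeIf() version, this compares every element against every other one, so it is still quadratic in the list size.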

Upvotes: 4

Vladislav Lezhnin

Reputation: 837

Here is my suggested solution.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

public Set<String> getUnique(List<String> source) {
    Set<String> result = new HashSet<String>();

    for (String s : source) {
        // The flag must be reset for every candidate string.
        boolean contains = false;
        for (String unique : result) {
            if (unique.contains(s)) {
                contains = true;
                break;
            }
        }
        if (!contains) {
            // s replaces every kept string that it contains, not just the first one.
            result.removeIf(unique -> s.contains(unique));
            result.add(s);
        }
    }

    return result;
}

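In this solution we don't iterate over the whole collection each time; we only check whether the element is contained in a string that is already in the result set. If there are many matches, this can save a lot of iterations. Note that the result is a HashSet, so the original order is not preserved.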
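
One possible variation on the result-set idea, shown here only as a sketch, is to sort a copy of the input by length first (longest strings first). A string can then only be contained in a string that has already been processed, so a single containment check against the kept strings is enough. The sorting step and the class name LengthSortedFilter are my own assumptions, not part of the method above, and the result comes back ordered by length rather than in input order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class name; sorting by length is an added assumption.
class LengthSortedFilter {

    static List<String> getUnique(List<String> source) {
        // Work on a copy, longest strings first (List.sort is stable).
        List<String> sorted = new ArrayList<>(source);
        sorted.sort((a, b) -> Integer.compare(b.length(), a.length()));

        List<String> result = new ArrayList<>();
        for (String s : sorted) {
            boolean contained = false;
            for (String kept : result) {
                // Any string containing s is at least as long, so it was seen earlier.
                if (kept.contains(s)) {
                    contained = true;
                    break;
                }
            }
            if (!contained) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("hello", "there", "the", "low", "hell", "lower", "here");
        System.out.println(getUnique(input)); // [hello, there, lower]
    }
}
```

Exact duplicates collapse to a single entry here, because an equal string counts as contained.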

Upvotes: 2
