Reputation: 736

Best way to filter out collections based on certain rules

I am trying to find a best way to filter out some items from a list based on certain rules. For example we have

public class Person{
    String name;
    String sex;
    String dob;
    String contactNo;
    Person(String name, String sex, String dob, String contactNo) {
        this.name = name;
        this.sex = sex;
        this.dob = dob;
        this.contactNo = contactNo;
    }
}

List<Person> persons = Arrays.asList(new Person("Bob", "male", "19800101", "12345"),                
        new Person("John", "male", "19810101", "12345"),
        new Person("Tom", "male", "19820101", "12345"),
        new Person("Helen", "female", "19800101", "12345"),
        new Person("Jack", "male", "19830101", "12345"),
        new Person("Suan", "female", "19850101", "12345"));

I want to remove the pair of male and female which have the same dob and contactNo (Remove Bob and Helen in above example). I implemented this as below using a nested loop which worked but looks ugly. Is there a better to achieve this please? Can I implement predicate to do this?

public void filterPersons() {       
    List<Person> filtered = new ArrayList<Person>();

    for (Person p: persons) {
        boolean pairFound = false;
        for (Person t: persons) {
            if ((p.sex.equals("male") && t.sex.equals("female")) || (p.sex.equals("female") && t.sex.equals("male"))) {
                if (p.dob.equals(t.dob) && p.contactNo.equals(t.contactNo)) {                       
                    pairFound = true;
                    break;
                }
            }
        }
        if (!pairFound) {filtered.add(p);}          
    }

    System.out.println("filtered size is: " + filtered.size());
    for (Person p: filtered) {
        System.out.println(p.name);
    }
}

Many thanks.

I've rewritten the above method something like below which looks better imho:

public void testFilter() {      
    Predicate<Person> isPairFound = new Predicate<Person>() {
        @Override public boolean apply(Person p) {              
            boolean pairFound = false;
            for (Person t: persons) {
                if ((p.sex.equals("male") && t.sex.equals("female")) || 
                        (p.sex.equals("female") && t.sex.equals("male"))) {
                    if (p.dob.equals(t.dob) && p.contactNo.equals(t.contactNo)) {                       
                        pairFound = true;
                        break;
                    }
                }
            }
            return pairFound;
        }
    };

    Iterable<Person> filtered = Iterables.filter(persons, isPairFound);     
    for (Person p: filtered) {
        System.out.println(p.name);
    }
}

Upvotes: 0

Answers (3)

pillingworth

Reputation: 3238

I don't think the nested for loops are particularly ugly. You are looking for matches between items in your list based on, effectively, arbitrary criteria so you need to compare every entry with every other entry.

One improvement you could consider is to separate the iterating code from the comparison logic. This is where you were heading with the Predicate. To do this you would need a Predicate that takes two objects instead of one.

public interface PredicateComparator<T> {
    boolean compare(T o1, T o2);
}

Your code would now look something like this

public void filterPersons() {

    PredicateComparator<Person> predicate = new PredicateComparator<Person>() {
        public boolean compare(Person o1, Person o2) {
            // comparison logic in here
        }

    };

    List<Person> filtered = new ArrayList<Person>();
    for (Person p : persons) {
        for (Person t : persons) {
            if (predicate.compare(p, t)) {
                filtered.add(p);
            }
        }
    }

    System.out.println("filtered size is: " + filtered.size());
    for (Person p: filtered) {
        System.out.println(p.name);
    }
}

Upvotes: 1

Vitaliy

Reputation: 8206

Is there only one way to determine identity between two persons? If so, it would be best to encapsulate this by overriding 'equals' and 'hashcode'.

After you do this you can take one of the following approaches:

If you are creating a collection of Person instances and want to make sure that when adding the same person more than once will preserver only a single instance in the collection- use the Set interface as the underlying collection (and probably the HashSet implementation). With equals and hashcode properly in place, the set will not allow duplicates.
If you are given a collection (meaning you have no control of it's creation and therefore can't use the above approach to validate that it is constructed without duplicates) and want to filter out duplicate instances you can simply feed it to the constructor of a HashSet, like this:
```
Collection<Integer> containsRepeatingNumbers = Arrays.asList(1,2,3,4,3,3,3,3);
Set<Integer> alldistincts = new HashSet<>(containsRepeatingNumbers);
System.out.println(alldistincts);   //[1, 2, 3, 4]
```

BTW, if you anticipate multiple criteria for identity in the future, you can use the strategy propsed here

Upvotes: 0

Ankit Rustagi

Reputation: 5637

You can use hashmap to remove duplicates. Each entry in the map would denote

(DOB+ContactNo -> persons index in the original list)

The function

public void filterPersons() {       
    List<Person> filtered = new ArrayList<Person>(persons); // COPY the whole list
    HashMap<String,Integer> map = new HashMap<String,Integer>();
    int count=-1;

    for (Person p: persons) {
        count++;
        String g = p.sex; 
        String g_opp = g.equals("male")? "female":"male";

        if(!map.contains(p.dob+p.contactNo+g_opp))
        {
            // if not exists, add to map
            map.put(p.dob+p.contactNo,count+g);
        }
        else
        {
            // if duplicate found in map, remove both people from list
            filtered.remove(count);
            filtered.remove(map.get(p.dob+p.contactNo+g));

            // now filtered has 2 less elements, update count
            count -= 2;
        }
   }
}

Upvotes: 0

Best way to filter out collections based on certain rules

Answers (3)

Related Questions