ak2020

Reputation: 95

Group by multiple fields and filter by common value of a field

@Data
public class Employee{

    private int empid;
    private  String empPFcode;
    private String collegeName;
}

Employee emp1=new Employee (1334090,"220","AB");
Employee emp2=new Employee (1334091,"220","AB");
Employee emp3=new Employee (1334092,"220","AC");
Employee emp4=new Employee (1434091,"221","DP");
Employee emp5=new Employee (1434091,"221","DP");
Employee emp6=new Employee (1434092,"221","DP");

I want to filter these Employee objects by empPFcode. If collegeName has the same value across all 3 records that share an empPFcode, those records should be collected; otherwise they should be skipped.

So my result would be like below.

Employee emp4=new Employee (1434091,"221","DP");
Employee emp5=new Employee (1434091,"221","DP");
Employee emp6=new Employee (1434092,"221","DP");

The records with empPFcode 220 should be skipped because their collegeName values differ.

I tried the logic below, but it does not filter properly.

List<CombinedDTO> distinctElements = list.stream().filter(distinctByKeys(Employee ::empPFcode,Employee ::collegeName))
                .collect(Collectors.toList());


public static <T> Predicate <T> distinctByKeys(Function<? super T, Object>... keyExtractors) {
     Map<Object, Boolean> uniqueMap = new ConcurrentHashMap<>();

     return t ->
     {
         final List<?> keys = Arrays.stream(keyExtractors)
                 .map(ke -> ke.apply(t))
                 .collect(Collectors.toList());

         return uniqueMap.putIfAbsent(keys, Boolean.TRUE) == null;
     };
}

Upvotes: 1

Views: 906

Answers (2)

Hülya

Reputation: 3433

I. Solution:

A cleaner, more readable solution would be to build a set of the valid empPFcode values ([221]) and then filter the employee list by this set.

First you can use Collectors.groupingBy() to group by empPFcode, then you can use Collectors.mapping(Employee::getCollegeName, Collectors.toSet()) to get a set of collegeName values.

Map<String, Set<String>> pairMap = list.stream().collect(Collectors.groupingBy(Employee::getEmpPFcode,
        Collectors.mapping(Employee::getCollegeName, Collectors.toSet()))); 

will result in: {220=[AB, AC], 221=[DP]}

Then you can remove the entries that contain more than one collegeName:

pairMap.values().removeIf(v -> v.size() > 1); 

will result in: {221=[DP]}

The last step is to filter the employee list by the key set. You can use the java.util.Set.contains() method inside the filter:

List<Employee> distinctElements = list.stream().filter(emp -> pairMap.keySet().contains(emp.getEmpPFcode()))
        .collect(Collectors.toList());
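
Putting the three steps of Solution I together, a minimal self-contained sketch might look like the following. The class name SolutionOneSketch, the method keepSingleCollegeCodes, and the hand-written Employee stand-in (used here in place of the Lombok @Data class) are assumptions for illustration only. The sketch also passes HashMap::new to groupingBy, because the Javadoc makes no guarantee about the mutability of the map returned by the two-argument overload, and removeIf needs a mutable map.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class SolutionOneSketch {

    // Hypothetical stand-in for the question's Lombok @Data class.
    public static class Employee {
        private final int empid;
        private final String empPFcode;
        private final String collegeName;

        public Employee(int empid, String empPFcode, String collegeName) {
            this.empid = empid;
            this.empPFcode = empPFcode;
            this.collegeName = collegeName;
        }

        public int getEmpid() { return empid; }
        public String getEmpPFcode() { return empPFcode; }
        public String getCollegeName() { return collegeName; }
    }

    // Keeps only the employees whose empPFcode maps to exactly one collegeName.
    public static List<Employee> keepSingleCollegeCodes(List<Employee> list) {
        // Step 1: empPFcode -> set of distinct collegeName values.
        Map<String, Set<String>> pairMap = list.stream()
                .collect(Collectors.groupingBy(Employee::getEmpPFcode, HashMap::new,
                        Collectors.mapping(Employee::getCollegeName, Collectors.toSet())));

        // Step 2: values() is a view of the map, so removing a value removes its entry.
        pairMap.values().removeIf(colleges -> colleges.size() > 1);

        // Step 3: filter the original list, which keeps the input order.
        Set<String> validCodes = pairMap.keySet();
        return list.stream()
                .filter(emp -> validCodes.contains(emp.getEmpPFcode()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Employee> list = List.of(
                new Employee(1334090, "220", "AB"),
                new Employee(1334091, "220", "AB"),
                new Employee(1334092, "220", "AC"),
                new Employee(1434091, "221", "DP"),
                new Employee(1434091, "221", "DP"),
                new Employee(1434092, "221", "DP"));

        keepSingleCollegeCodes(list).forEach(e -> System.out.println(
                e.getEmpid() + " " + e.getEmpPFcode() + " " + e.getCollegeName()));
    }
}
```

With the six employees from the question, only the three records with empPFcode 221 remain, in their original order.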

II. Solution:

If you nest Collectors.groupingBy() calls, you'll get a Map<String,Map<String,List<Employee>>>:

{
   220 = {AB=[...], AC=[...]}, 
   221 = {DP=[...]}
}

Then you can filter by the size of the inner map (Map<String,List<Employee>>) to eliminate the entries whose inner map has more than one key (AB=[...], AC=[...]).

You still have a Map<String,Map<String,List<Employee>>> and you only need List<Employee>. To extract the employee list from the nested map, you can use flatMap().

Try this:

List<Employee> distinctElements = list.stream()
                .collect(Collectors.groupingBy(Employee::getEmpPFcode, Collectors.groupingBy(Employee::getCollegeName)))
                .entrySet().stream().filter(e -> e.getValue().size() == 1).flatMap(m -> m.getValue().values().stream())
                .flatMap(List::stream).collect(Collectors.toList());
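
One thing to be aware of with the nested approach: the map produced by groupingBy is not guaranteed to keep the groups in the order they were first seen, so the result may come out in a different group order than the input. A hedged variant, assuming you want the groups in order of first appearance, passes LinkedHashMap::new as the map factory. The class name OrderedGroupingSketch, the method keepSingleCollegeCodesOrdered, and the Employee stand-in are hypothetical names used only for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class OrderedGroupingSketch {

    // Hypothetical stand-in for the question's Lombok @Data class.
    public static class Employee {
        private final int empid;
        private final String empPFcode;
        private final String collegeName;

        public Employee(int empid, String empPFcode, String collegeName) {
            this.empid = empid;
            this.empPFcode = empPFcode;
            this.collegeName = collegeName;
        }

        public int getEmpid() { return empid; }
        public String getEmpPFcode() { return empPFcode; }
        public String getCollegeName() { return collegeName; }
    }

    public static List<Employee> keepSingleCollegeCodesOrdered(List<Employee> list) {
        // Outer map is a LinkedHashMap, so empPFcode groups keep first-appearance order.
        Map<String, Map<String, List<Employee>>> grouped = list.stream()
                .collect(Collectors.groupingBy(Employee::getEmpPFcode, LinkedHashMap::new,
                        Collectors.groupingBy(Employee::getCollegeName)));

        return grouped.values().stream()
                // Keep only codes whose employees all share one collegeName.
                .filter(byCollege -> byCollege.size() == 1)
                // Each surviving inner map has a single value: the list of employees.
                .flatMap(byCollege -> byCollege.values().stream())
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }
}
```

Note that the output is ordered by group, not strictly by input position: if records for different empPFcode values are interleaved in the input, all records of the first code seen come before those of the next one.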

Upvotes: 1

WJS

Reputation: 40062

Here is one way to do it. I added some extra values to the list of Employee objects.

List<Employee> list =
        List.of(new Employee(1334090, "220", "AB"),
                new Employee(1334091, "220", "AB"),
                new Employee(1334092, "220", "AC"),
                new Employee(1434091, "221", "DP"),
                new Employee(1434091, "221", "DP"),
                new Employee(1434092, "221", "DP"),
                new Employee(1434091, "222", "AB"),
                new Employee(1434091, "222", "AB"));
  • create a map of maps, keyed first by empPFcode
  • and then by collegeName.
  • if the inner map has more than one collegeName key, exclude it, since that empPFcode has more than one collegeName.
  • then flatten each remaining inner map to its entrySet and pull out the values, which are the Employee objects.
List<Employee> results = list.stream()
        .collect(Collectors.groupingBy(Employee::getEmpPFcode,
                Collectors.groupingBy(
                        Employee::getCollegeName)))
        .values().stream().filter(map -> map.size() == 1)
        .flatMap(m -> m.entrySet().stream())
        .flatMap(e -> e.getValue().stream())
        .collect(Collectors.toList());

 results.forEach(System.out::println);
    

Prints

[1434091,  221,  DP]
[1434091,  221,  DP]
[1434092,  221,  DP]
[1434091,  222,  AB]
[1434091,  222,  AB]
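
The output above includes the two records with empPFcode 222. If the "3" in the question is meant as a required number of matching records per empPFcode (an assumption; the question does not state this explicitly), a size check can be added before the collegeName check. In this sketch the class name MinGroupSizeSketch, the method keepUniformGroups, the minSize parameter, and the Employee stand-in are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MinGroupSizeSketch {

    // Hypothetical stand-in for the Employee class shown in this answer.
    public static class Employee {
        private final int empid;
        private final String empPFcode;
        private final String collegeName;

        public Employee(int empid, String empPFcode, String collegeName) {
            this.empid = empid;
            this.empPFcode = empPFcode;
            this.collegeName = collegeName;
        }

        public int getEmpid() { return empid; }
        public String getEmpPFcode() { return empPFcode; }
        public String getCollegeName() { return collegeName; }
    }

    // Keeps groups with at least minSize records that all share one collegeName.
    public static List<Employee> keepUniformGroups(List<Employee> list, int minSize) {
        Map<String, List<Employee>> byCode = list.stream()
                .collect(Collectors.groupingBy(Employee::getEmpPFcode));

        return byCode.values().stream()
                // Assumed extra rule: the group must contain at least minSize records.
                .filter(group -> group.size() >= minSize)
                // Original rule: exactly one distinct collegeName within the group.
                .filter(group -> group.stream()
                        .map(Employee::getCollegeName)
                        .distinct()
                        .count() == 1)
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }
}
```

Calling keepUniformGroups(list, 3) on the eight-element list above would keep only the three 221 records, because the 222 group has just two records and the 220 group has two different collegeName values.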

Here is the modified class

class Employee {
    
    private int empid;
    private String empPFcode;
    private String collegeName;
    
    public Employee(int empid, String empPFcode,
            String collegeName) {
        this.empid = empid;
        this.empPFcode = empPFcode;
        this.collegeName = collegeName;
    }
    
    public int getEmpid() {
        return empid;
    }
    
    public String getEmpPFcode() {
        return empPFcode;
    }
    

    public String getCollegeName() {
        return collegeName;
    }
    
    @Override
    public String toString() {
        return String.format("[%s,  %s,  %s]", empid, empPFcode,
                collegeName);
    }
    
}

Upvotes: 0
