Hazim

Reputation: 381

Java 8 streams: sum values on a distinct key

I have a file whose rows have the following columns:

CITY_NAME   COUNTY_NAME  POPULATION
Atascocita  Harris       65844
Austin      Travis       931820
Baytown     Harris       76335
...

I am using streams to attempt to generate an output similar to:

COUNTY_NAME  CITIES_IN_COUNTY  POPULATION_OF_COUNTY
Harris       2                 142179
Travis       1                 931820
...

So far I have been able to use streams to get a list of distinct county names (the names repeat across rows), but I am having trouble getting the number of cities in each distinct county and, from there, the sum of the populations of those cities. I have read the file into an ArrayList of type texasCitiesClass, and my code so far looks like this:

public static void main(String[] args) throws FileNotFoundException, IOException {
    PrintStream output = new PrintStream(new File("output.txt"));
    ArrayList<texasCitiesClass> txcArray = new ArrayList<texasCitiesClass>();
    initTheArray(txcArray); // this method will read the input file and populate an arraylist
    System.setOut(output);

    List<String> counties = txcArray.stream()
            .filter(distinctByKey(txc -> txc.getCounty())) // keep the first city seen for each county
            .map(txc -> txc.getCounty()) // project each remaining city to its county name
            .sorted() // sort alphabetically
            .collect(Collectors.toList()); // terminal operation; a second distinct() would be redundant

}

public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    // Set.add returns true only the first time a given key is seen
    Set<Object> seen = ConcurrentHashMap.newKeySet();
    return t -> seen.add(keyExtractor.apply(t));
}

At this point, I have the names of the unique counties. Since sorted() returns a new stream, how can I obtain (and then sum) the population values for each county?

Upvotes: 3

Views: 2839

Answers (1)

user140547

Reputation: 8200

Given the following classes (constructors, getters, and setters omitted):

class Foo {
    String name;
    String countyName;
    int pop;
}

class Aggregate {
    String name;
    int count;
    int pop;
}

You could aggregate the values by mapping each element to an Aggregate object with Collectors.toMap and combining the entries that share a county through its mergeFunction. Supplying TreeMap::new as the map factory keeps the entries ordered by key, i.e. by county name.

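TreeMap<String, Aggregate> byCounty = foos.stream()
        .collect(Collectors.toMap(
                Foo::getCountyName,                              // key: county name
                foo -> new Aggregate(foo.countyName, 1, foo.pop), // value: one city and its population
                // merge two partial results for the same county; adding b.count
                // (instead of a fixed 1) stays correct for parallel streams too
                (a, b) -> new Aggregate(a.name, a.count + b.count, a.pop + b.pop),
                TreeMap::new)                                    // sorted by county name
        );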

With the input

List<Foo> foos = Arrays.asList(
        new Foo("A", "Harris", 44),
        new Foo("C", "Travis", 99),
        new Foo("B", "Harris", 66)
);

the resulting map is

{Harris=Aggregate{name='Harris', count=2, pop=110}, Travis=Aggregate{name='Travis', count=1, pop=99}}
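
As an alternative to the toMap approach above, the same count and sum can probably be obtained without a separate Aggregate class by combining Collectors.groupingBy with Collectors.summarizingInt. The sketch below is only an illustration under assumptions: City is a hypothetical stand-in for the question's texasCitiesClass (whose real fields and getters are not shown), and CountyTotals and summarize are names made up for this example.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class CountyTotals {

    // Hypothetical stand-in for the question's texasCitiesClass.
    static final class City {
        final String name;
        final String county;
        final int population;

        City(String name, String county, int population) {
            this.name = name;
            this.county = county;
            this.population = population;
        }

        String getCounty() { return county; }
        int getPopulation() { return population; }
    }

    // Groups the cities by county into a TreeMap (so keys are sorted) and
    // collects count and sum of the populations for each group.
    static Map<String, IntSummaryStatistics> summarize(List<City> cities) {
        return cities.stream()
                .collect(Collectors.groupingBy(
                        City::getCounty,
                        TreeMap::new,
                        Collectors.summarizingInt(City::getPopulation)));
    }

    public static void main(String[] args) {
        // Sample rows taken from the question.
        List<City> cities = Arrays.asList(
                new City("Atascocita", "Harris", 65844),
                new City("Austin", "Travis", 931820),
                new City("Baytown", "Harris", 76335));

        System.out.println("COUNTY_NAME  CITIES_IN_COUNTY  POPULATION_OF_COUNTY");
        // getCount() is the number of cities, getSum() the total population.
        summarize(cities).forEach((county, stats) ->
                System.out.println(county + "  " + stats.getCount() + "  " + stats.getSum()));
    }
}
```

For the three sample rows this prints one line for Harris with 2 cities and a population of 142179, and one for Travis with 1 city and 931820. IntSummaryStatistics also tracks min, max, and average, which may or may not be useful here; if only count and sum are needed, the toMap variant with a small Aggregate class keeps the result type more explicit.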

Upvotes: 4
