Reputation: 4023
I have a DataFrame and I want to count how often each unique combination of values in two of its columns occurs. For example:
a x
a x
a y
b y
b y
b y
should become:
a x 2
a y 1
b y 3
I know how to do this with a pandas DataFrame, but now I want to do it directly in Java (ideally Java 8).
Upvotes: 1
Views: 456
Reputation: 4023
I found the following solution myself. Posting it here in case anyone is interested:
DataFrame df2 = df.groupBy("Column_one", "Column_two").count();
df2.show();
Upvotes: 0
Reputation: 328568
I am not sure what input type you have, but assuming you have a List<DataFrame> list and DataFrame implements equals/hashCode as expected, you could use a combination of two collectors:
Map<DataFrame, Long> count = list.stream().collect(groupingBy(x -> x, counting()));
which requires the following static imports:
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
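For illustration, here is a minimal sketch of the same two-collector idea applied to the sample data from the question. It assumes each row is available as a plain List<String> holding the two column values, which is not something the question confirms; the class name PairCount and the method countPairs are hypothetical. A List<String> key works because lists compare by value, and the LinkedHashMap supplier is only there to keep the groups in first-seen order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PairCount {

    // Hypothetical helper: counts how often each (first, second) value pair occurs.
    // Assumes every row has at least two elements.
    static Map<List<String>, Long> countPairs(List<List<String>> rows) {
        return rows.stream()
                .collect(Collectors.groupingBy(
                        row -> Arrays.asList(row.get(0), row.get(1)), // key: the two column values
                        LinkedHashMap::new,                           // keep first-seen order of groups
                        Collectors.counting()));                      // value: number of rows per key
    }

    public static void main(String[] args) {
        // Sample rows taken from the question.
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("a", "x"),
                Arrays.asList("a", "x"),
                Arrays.asList("a", "y"),
                Arrays.asList("b", "y"),
                Arrays.asList("b", "y"),
                Arrays.asList("b", "y"));

        // Print each pair followed by its count.
        countPairs(rows).forEach((key, n) ->
                System.out.println(key.get(0) + " " + key.get(1) + " " + n));
    }
}
```

If the rows are custom objects instead, the same approach should work as long as the grouping key has value-based equals and hashCode.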
Upvotes: 3