Guforu

Reputation: 4023

DataFrame, count unique values, Java

I have a DataFrame and I want to count how often each unique combination of values in two of its columns occurs. For example:

a x
a x
a y
b y
b y
b y

should become:

a x 2
a y 1
b y 3

I know how to do this with a pandas DataFrame, but now I want to do it directly in Java (preferably Java 8).

Upvotes: 1

Views: 456

Answers (2)

Guforu

Reputation: 4023

I found the following solution myself. I am posting it here in case anyone is interested:

DataFrame df2 = df.groupBy("Column_one", "Column_two").count();
df2.show();

Upvotes: 0

assylias

Reputation: 328568

I am not sure what input type you have, but assuming you have a List<DataFrame> named list, and DataFrame implements equals/hashCode as expected, you could combine two collectors:

Map<DataFrame, Long> count = list.stream().collect(groupingBy(x -> x, counting()));

which requires the following static imports:

import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
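
To make the idea concrete, here is a minimal, self-contained sketch of the same groupingBy/counting combination applied to the sample data from the question. It assumes each row is available as a plain String[] with two columns, which is an assumption for illustration and not the asker's actual DataFrame type. The class name PairCount and the method countPairs are hypothetical. Each row is turned into a List<String> key, because List already implements equals/hashCode by value, and a LinkedHashMap keeps the keys in first-seen order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class, for illustration only.
public class PairCount {

    // Counts how often each (column 0, column 1) pair occurs.
    // Assumes every row has at least two elements.
    static Map<List<String>, Long> countPairs(List<String[]> rows) {
        return rows.stream()
                .collect(Collectors.groupingBy(
                        row -> Arrays.asList(row[0], row[1]), // value-based key
                        LinkedHashMap::new,                   // keep first-seen order
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        // Sample rows taken from the question.
        List<String[]> rows = Arrays.asList(
                new String[] {"a", "x"},
                new String[] {"a", "x"},
                new String[] {"a", "y"},
                new String[] {"b", "y"},
                new String[] {"b", "y"},
                new String[] {"b", "y"});

        // Prints one line per distinct pair: "a x 2", "a y 1", "b y 3".
        countPairs(rows).forEach((key, n) ->
                System.out.println(key.get(0) + " " + key.get(1) + " " + n));
    }
}
```

If the order of the result does not matter, the LinkedHashMap::new argument can be dropped, which gives the two-argument groupingBy form shown above.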

Upvotes: 3
