Reputation: 285
This is similar to my previous question: Create new column with distinct character values
But this time I also want some additional information.
df:
ID <- c(1,1,1,1,1,1,1,2,2,2,2,2)
color <- c("red","red","red","blue","green","green","blue",
"yellow","yellow","red","blue","green")
df <- data.frame(ID,color)
ID color
1 1 red
2 1 red
3 1 red
4 1 blue
5 1 green
6 1 green
7 1 blue
8 2 yellow
9 2 yellow
10 2 red
11 2 blue
12 2 green
Creating n_distinct_color (number of distinct colors each ID has):
library(dplyr)
df %>%
  group_by(ID) %>%
  distinct(color, .keep_all = TRUE) %>%
  mutate(n_distinct_color = n(), .after = ID) %>%
  ungroup()
# A tibble: 7 × 3
ID n_distinct_color color
<dbl> <int> <chr>
1 1 3 red
2 1 3 blue
3 1 3 green
4 2 4 yellow
5 2 4 red
6 2 4 blue
7 2 4 green
Now I want to create:
ID n_distinct_color color frequency_of_color most_frequent_color
<dbl> <int> <chr> <int> <chr>
1 1 3 red 3 red
2 1 3 blue 2 red
3 1 3 green 2 red
4 2 4 yellow 2 yellow
5 2 4 red 1 yellow
6 2 4 blue 1 yellow
7 2 4 green 1 yellow
Also, what if two colors have the same frequency (e.g., ID 2's most frequent colors are both yellow and red)? What would the data table look like then?
df_new:
ID <- c(1,1,1,1,1,1,1,2,2,2,2,2,2)
color <- c("red","red","red","blue","green","green","blue",
"yellow","yellow","red","blue","green","red")
df_new <- data.frame(ID,color)
ID color
1 1 red
2 1 red
3 1 red
4 1 blue
5 1 green
6 1 green
7 1 blue
8 2 yellow
9 2 yellow
10 2 red
11 2 blue
12 2 green
13 2 red
I would appreciate all the help there is! Thanks!!!
Upvotes: 0
Views: 42
Reputation: 52319
With a series of mutate and summarise calls you can achieve your goal. In case of ties, the first tied color is chosen: which.max already returns the index of the first maximum, so the trailing [1] is redundant but harmless.
library(dplyr) # 1.1.0 or above required for the .by argument
# df_new is the tie example from the question; the output below was produced from it
df_new %>%
  mutate(n_distinct = n_distinct(color), .by = ID) %>%
  summarise(frequency = n(), .by = c(ID, n_distinct, color)) %>%
  mutate(most_frequent = color[which.max(frequency)[1]], .by = ID)
output
ID n_distinct color frequency most_frequent
1 1 3 red 3 red
2 1 3 blue 2 red
3 1 3 green 2 red
4 2 4 yellow 2 yellow
5 2 4 red 2 yellow
6 2 4 blue 1 yellow
7 2 4 green 1 yellow
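If you would rather see every tied color instead of only the first one, one possible variation (a sketch, not part of the original answer) is to compare each count against the group maximum and paste the matching colors together. The object name res and the ", " separator are my own choices for illustration:

```r
library(dplyr)  # 1.1.0 or above required for the .by argument

# Tie example from the question: ID 2 has yellow and red twice each
ID <- c(1,1,1,1,1,1,1,2,2,2,2,2,2)
color <- c("red","red","red","blue","green","green","blue",
           "yellow","yellow","red","blue","green","red")
df_new <- data.frame(ID, color)

res <- df_new %>%
  mutate(n_distinct = n_distinct(color), .by = ID) %>%
  summarise(frequency = n(), .by = c(ID, n_distinct, color)) %>%
  # keep every color whose count equals the maximum within the ID,
  # collapsed into a single string (res and the separator are illustrative)
  mutate(most_frequent = paste(color[frequency == max(frequency)], collapse = ", "),
         .by = ID)

res
```

With df_new, most_frequent stays "red" for ID 1 and becomes "yellow, red" for every row of ID 2. If you need one row per tied color instead of a combined string, you would have to reshape afterwards; this sketch only covers the combined-string form.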
Upvotes: 2