Reputation: 532
I have a data frame like the following:
a b
1 23
2 34
1 34
3 45
1 56
3 567
2 67
2 90
1 91
3 98
I want to get the data frame with rows grouped according to the frequency of values in the first column. The output should be like the following:
a b freq
1 23 4
1 34 4
1 56 4
1 91 4
2 34 3
2 67 3
2 90 3
3 45 3
3 567 3
3 98 3
I have written the following code in R:
library(data.table)
setDT(df)[,freq := .N, by = "a"]
sorted = df[order(freq, decreasing = T),]
sorted
However, I get the following data frame as the output.
a b freq
1: 1 23 4
2: 1 34 4
3: 1 56 4
4: 1 91 4
5: 2 34 3
6: 3 45 3
7: 3 567 3
8: 2 67 3
9: 2 90 3
10: 3 98 3
How can I solve this problem?
Upvotes: 1
Views: 218
Reputation: 2716
> df <- read.table(text = 'a b
+ 1 23
+ 2 34
+ 1 34
+ 3 45
+ 1 56
+ 3 567
+ 2 67
+ 2 90
+ 1 91
+ 3 98', header = T, stringsAsFactors = F)
> library(dplyr)
>
> df %>% group_by(a) %>%
+ mutate(Freq = n()) %>%
+ ungroup() %>%
+ arrange(desc(Freq), a)
# A tibble: 10 × 3
a b Freq
<int> <int> <int>
1 1 23 4
2 1 34 4
3 1 56 4
4 1 91 4
5 2 34 3
6 2 67 3
7 2 90 3
8 3 45 3
9 3 567 3
10 3 98 3
Upvotes: 1
Reputation: 886938
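We can use n() inside mutate() to add each group's row count, then sort by that count in descending order and break ties by a:
library(dplyr)
df1 %>%
  group_by(a) %>%
  mutate(freq = n()) %>%
  arrange(desc(freq), a)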
# A tibble: 10 x 3
# Groups: a [3]
# a b freq
# <int> <int> <int>
# 1 1 23 4
# 2 1 34 4
# 3 1 56 4
# 4 1 91 4
# 5 2 34 3
# 6 2 67 3
# 7 2 90 3
# 8 3 45 3
# 9 3 567 3
#10 3 98 3
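As a shorter variant of the same idea, dplyr's add_count() combines the group_by/mutate/ungroup steps into one call. This is only a sketch using the data from the question; the data frame name df1 follows this answer, and the column name freq is passed through the name argument.

```r
library(dplyr)

# The data from the question
df1 <- data.frame(
  a = c(1, 2, 1, 3, 1, 3, 2, 2, 1, 3),
  b = c(23, 34, 34, 45, 56, 567, 67, 90, 91, 98)
)

# add_count() appends the number of rows per value of a as a new column,
# and returns an ungrouped result
result <- df1 %>%
  add_count(a, name = "freq") %>%
  arrange(desc(freq), a)   # most frequent first, ties broken by a

result
```

Because add_count() does not leave the data grouped, no ungroup() call is needed afterwards.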
Upvotes: 1
Reputation: 3650
It looks like you want to use setorder from the data.table package.
You have ordered your data by freq, but you also want to order by column a, so that rows with the same freq stay together.
A setorder example:
> library(data.table)
> set.seed(12)
> df <- data.table(freq = sample(5, 5), a = sample(5, 5))
> df
freq a
1: 1 1
2: 4 5
3: 3 2
4: 5 4
5: 2 3
> setorder(df, freq, a)
> df
freq a
1: 1 1
2: 2 3
3: 3 2
4: 4 5
5: 5 4
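Applied to the data from the question, the same approach might look like the sketch below. It assumes the desired order is freq descending, then a ascending, as in the expected output; the minus sign in front of freq asks setorder for descending order on that column.

```r
library(data.table)

# The data from the question
df <- data.table(
  a = c(1, 2, 1, 3, 1, 3, 2, 2, 1, 3),
  b = c(23, 34, 34, 45, 56, 567, 67, 90, 91, 98)
)

# Count rows per value of a; := adds the column by reference
df[, freq := .N, by = a]

# Sort in place: freq descending, then a ascending to break ties
setorder(df, -freq, a)
df
```

setorder reorders df by reference, so there is no need to assign the result. If a sorted copy is preferred, df[order(-freq, a)] returns the same ordering without modifying df.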
Upvotes: 1