Reputation: 303
I am very new to R. I want to create a new frequency column based on the values in a particular column.
City age
ABC 47
AAB 48
AAB 41
AAB 984
ZZZ 984
MNO 1
MNO 34
VVC 34
VVC 36
VVC 41
VVC 32
MNO 20
BB 29
VVC 4
VVC 984
VVC 59
ABC 30
ABC 984
ABC 36
BB 69
ABC 32
ZZZ 3
ABC 29
ABC 29
AAB 1
AAB 984
ABC 59
I want the data frame to look like this:
City age Frequency
ABC 47 0.296296296
AAB 48 0.185185185
AAB 41 0.185185185
AAB 984 0.185185185
ZZZ 984 0.074074074
MNO 1 0.111111111
MNO 34 0.111111111
VVC 34 0.259259259
VVC 36 0.259259259
VVC 41 0.259259259
VVC 32 0.259259259
MNO 20 0.111111111
BB 29 0.074074074
VVC 4 0.259259259
VVC 984 0.259259259
VVC 59 0.259259259
ABC 30 0.296296296
ABC 984 0.296296296
ABC 36 0.296296296
BB 69 0.074074074
ABC 32 0.296296296
ZZZ 3 0.074074074
ABC 29 0.296296296
ABC 29 0.296296296
AAB 1 0.185185185
AAB 984 0.185185185
ABC 59 0.296296296
For the Frequency column, I divided each city's row count by the total number of rows (27):
City Count Frequency
ABC 8 0.296296296 (8/27)
MNO 3 0.111111111 (3/27)
BB 2 0.074074074 (2/27)
VVC 7 0.259259259 (7/27)
ZZZ 2 0.074074074 (2/27)
AAB 5 0.185185185 (5/27)
You can ignore the 'age' column. How can I do this in R?
Thanks in advance.
Regards, John
Upvotes: 1
Views: 4063
Reputation: 887821
After grouping by 'City', create 'Frequency' by dividing the number of rows in each group (n()) by the number of rows in the whole dataset:
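library(dplyr)
df1 %>%
  # one group per distinct City
  group_by(City) %>%
  # n() is the current group's row count; nrow(.) is the row count of the full piped data
  mutate(Frequency = n() / nrow(.))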
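For readers who would rather not load dplyr, the same per-row frequency could be sketched in base R with ave(). This is only an illustration: the data frame below is retyped from the question and named df1 to match the answer, and group_size is a helper name chosen here, not something from the original post.

```r
# Data retyped from the question (27 rows); the name df1 follows the answer above
df1 <- data.frame(
  City = c("ABC", "AAB", "AAB", "AAB", "ZZZ", "MNO", "MNO", "VVC", "VVC",
           "VVC", "VVC", "MNO", "BB", "VVC", "VVC", "VVC", "ABC", "ABC",
           "ABC", "BB", "ABC", "ZZZ", "ABC", "ABC", "AAB", "AAB", "ABC"),
  age = c(47, 48, 41, 984, 984, 1, 34, 34, 36, 41, 32, 20, 29, 4, 984,
          59, 30, 984, 36, 69, 32, 3, 29, 29, 1, 984, 59),
  stringsAsFactors = FALSE
)

# ave() applies length() within each City group and returns one value per row,
# so every row receives the size of its own group (group_size is a hypothetical name)
group_size <- ave(seq_len(nrow(df1)), df1$City, FUN = length)

# Divide each group's size by the total number of rows
df1$Frequency <- group_size / nrow(df1)

head(df1)
```

Unlike the grouped tibble returned by the dplyr pipeline, this leaves df1 as an ordinary data frame with the new column added in place.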
Upvotes: 3