Reputation: 171
I have the following data:
df <- read.table(text =
" id country
1 IT
1 IT
1 USA
2 USA
2 FR
2 IT
3 USA
3 USA
3 IT
3 FR", header = T)
I need to find the frequency of each country within each id. The desired output is:
id IT USA FR
1 2 1 0
2 1 1 1
3 1 2 1
I know how to use count() to get the number of rows for each id, but I don't know how to break the counts down by country. Thanks for your help!
Upvotes: 0
Views: 357
Reputation: 757
This can be done with xtabs. A simple way:
xtabs(~ df$id + df$country)
or, equivalently, using the data argument:
xtabs(~ id + country, data = df)
Output of the first call:
df$country
df$id FR IT USA
1 0 2 1
2 1 1 1
3 1 1 2
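If a plain data frame with an id column is needed, as in the question's desired output, the contingency table can be converted. The following is a minimal sketch, assuming the same df as in the question; the names tab and wide are only illustrative.

```r
# Rebuild the example data from the question
df <- read.table(text = "id country
1 IT
1 IT
1 USA
2 USA
2 FR
2 IT
3 USA
3 USA
3 IT
3 FR", header = TRUE)

# Cross-tabulate id against country
tab <- xtabs(~ id + country, data = df)

# Convert the table to a data frame and put the columns in the requested order
wide <- as.data.frame.matrix(tab)[, c("IT", "USA", "FR")]

# Move the row names (the ids) into a regular column
wide <- cbind(id = as.integer(rownames(wide)), wide)
rownames(wide) <- NULL

print(wide)
```

Base R's table(df$id, df$country) produces the same counts, so it could be used in place of xtabs here.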
Upvotes: 0
Reputation: 8374
With dplyr and tidyr (spread() comes from tidyr):
library(dplyr)
library(tidyr)
df %>%
  group_by(id) %>%
  count(country) %>%   # count rows per country within each id
  spread(country, n)   # spread the counts into one column per country (wide format)
# A tibble: 3 x 4
# Groups: id [3]
id FR IT USA
<int> <int> <int> <int>
1 1 NA 2 1
2 2 1 1 1
3 3 1 1 2
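To replace NA with 0 (mutate_each() and funs() are deprecated in current dplyr, so this uses across() instead, which requires dplyr 1.0.0 or later):
df %>%
  group_by(id) %>%
  count(country) %>%
  spread(country, n) %>%
  mutate(across(everything(), ~ replace(.x, is.na(.x), 0))) # replace NA with 0 in every non-grouping column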
# A tibble: 3 x 4
# Groups: id [3]
id FR IT USA
<int> <dbl> <dbl> <dbl>
1 1 0 2 1
2 2 1 1 1
3 3 1 1 2
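In newer versions of tidyr, spread() is superseded by pivot_wider(), which can fill the missing combinations in the same step. The following is a sketch of that variant, assuming tidyr 1.0.0 or later and the same df as in the question; the name wide is only illustrative.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question
df <- read.table(text = "id country
1 IT
1 IT
1 USA
2 USA
2 FR
2 IT
3 USA
3 USA
3 IT
3 FR", header = TRUE)

wide <- df %>%
  count(id, country) %>%               # one row per id/country pair with its count n
  pivot_wider(
    names_from  = country,             # one column per country
    values_from = n,                   # filled with the counts
    values_fill = list(n = 0L)         # missing id/country pairs become 0 instead of NA
  )

print(wide)
```

Because count(id, country) is called without group_by(), the result is an ungrouped tibble, and the integer fill value keeps the count columns as integers.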
Upvotes: 3