Reputation: 1238

How to produce a frequency dataframe using 3 columns?

From a dataframe of 3 columns

df <- data.frame(month = c(1,1,1,2,2,2,3,3,4,4), country = c("US", "US", "US", "Brazil", "US", "Brazil", "UK", "UK", "Brazil", "US"), id = c(1,2,3,4,5,6,7,8,9,10))

I would like to count the id for every country for every month. How is it possible to take the result of table() with more than one columns?

Here an example output:

df2 <- data.frame(month=c(1,1,1,2,2,2,3,3,3,4,4,4), country = c("US", "Brazil", "UK","US", "Brazil", "UK","US", "Brazil", "UK","US", "Brazil", "UK"), frequency = c(3,0,0,1,2,0,0,0,2,1,1,0))
> df2
   month country frequency
1      1      US         3
2      1  Brazil         0
3      1      UK         0
4      2      US         1
5      2  Brazil         2
6      2      UK         0
7      3      US         0
8      3  Brazil         0
9      3      UK         2
10     4      US         1
11     4  Brazil         1
12     4      UK         0

Upvotes: 0

Answers (2)

r2evans

Reputation: 160852

In general, whenever I think about frequency tables with more than one column (table(...)), I think of xtabs(~each + and + every + column, data=myframe). That's not to say that table doesn't support multiple columns ... it does, but I find the formula interface a little more intuitive to me.

xtabs returns 3D tables as well(see xtabs(~cyl+vs+am,data=mtcars)), and as you can tell, it can easily be converted into a "long"-style frame (which it has in common with table). (In fact, the output from xtabs includes the class "table", so anything that works on the latter should work on xtabs. *shrug*)

as.data.frame(xtabs(~ month + country, data = df))
#    month country Freq
# 1      1  Brazil    0
# 2      2  Brazil    2
# 3      3  Brazil    0
# 4      4  Brazil    1
# 5      1      UK    0
# 6      2      UK    0
# 7      3      UK    2
# 8      4      UK    0
# 9      1      US    3
# 10     2      US    1
# 11     3      US    0
# 12     4      US    1

Upvotes: 3

Shree

Reputation: 11150

Here's a way with dplyr and tidyr::complete() -

df %>% 
  count(month, country) %>% 
  complete(month, country, fill = list(n = 0))

# A tibble: 12 x 3
   month country     n
   <dbl> <fct>   <dbl>
 1     1 Brazil      0
 2     1 UK          0
 3     1 US          3
 4     2 Brazil      2
 5     2 UK          0
 6     2 US          1
 7     3 Brazil      0
 8     3 UK          2
 9     3 US          0
10     4 Brazil      1
11     4 UK          0
12     4 US          1

Upvotes: 3

How to produce a frequency dataframe using 3 columns?

Answers (2)

Related Questions