Reputation: 323
Using following dataset:
set.seed(2)
origin <- rep(c("DEU", "GBR", "ITA", "NLD", "CAN", "MEX", "USA", "CHN", "JPN", "KOR","DEU", "GBR", "ITA", "NLD", "CAN", "MEX", "USA", "CHN", "JPN", "KOR"), 4)
year <- rep(c(rep(1998, 10), rep(2000, 10)), 2)
type <- sample(1:10, size=length(origin), replace=TRUE)
value <- sample(100:10000, size=length(origin), replace=TRUE)
test.df <- as.data.frame(cbind(origin, year, type, value))
rm(origin, year, type, value)
### add some (6) missing values
test.df$value[sample(1:length(test.df$value), 6, replace = FALSE)] <- NA
I want to count how many type
s by country (origin
) per year
I tryed:
count(trade.df, origin, year)
and
test.df %>% group_by(origin, year) %>% count()
but I am not sure of how I can interpret these results.
of course, if value == NA
, R should not count it...
Upvotes: 0
Views: 61
Reputation: 7592
To remove the rows where value
is NA
, use filter
:
test.df %>% group_by(origin,year) %>%
filter(!is.na(value)) %>% count()
# A tibble: 20 x 3
# Groups: origin, year [20]
origin year n
<fct> <fct> <int>
1 CAN 1998 4
2 CAN 2000 3
3 CHN 1998 3
4 CHN 2000 4
5 DEU 1998 4
6 DEU 2000 4
7 GBR 1998 4
8 GBR 2000 4
9 ITA 1998 3
10 ITA 2000 4
11 JPN 1998 3
12 JPN 2000 3
13 KOR 1998 4
14 KOR 2000 4
15 MEX 1998 4
16 MEX 2000 4
17 NLD 1998 3
18 NLD 2000 4
19 USA 1998 4
20 USA 2000 4
Note, however, that this doesn't count how many type
s there are in each group, but how many rows there are. If you want to count the number of unique type
s, you can do this:
test.df %>% group_by(origin,year) %>%
filter(!is.na(value)) %>%
summarize(n_distinct(type)) #Merci, @Frank!
# A tibble: 20 x 3
# Groups: origin [?]
origin year `length(unique(type))`
<fct> <fct> <int>
1 CAN 1998 3
2 CAN 2000 3
3 CHN 1998 2
4 CHN 2000 3
5 DEU 1998 4
6 DEU 2000 3
7 GBR 1998 4
8 GBR 2000 4
9 ITA 1998 3
10 ITA 2000 4
11 JPN 1998 3
12 JPN 2000 2
13 KOR 1998 4
14 KOR 2000 4
15 MEX 1998 3
16 MEX 2000 3
17 NLD 1998 2
18 NLD 2000 3
19 USA 1998 3
20 USA 2000 4
Upvotes: 2